Catalog version access
Hello.
I do some very regular testing on HEAD and my scripts need to know if
the catalog version has changed to determine if it needs to pg_restore
or if a basebackup is okay. In order to get it, I have to do this:
# Get the catalog version (there is no better way to do this)
tmp=$(mktemp --directory)
$bin/initdb --pgdata=$tmp
catversion=$($bin/pg_controldata $tmp | grep "Catalog version" \
| cut --delimiter=: --fields=2 | xargs)
rm --recursive --force $tmp
I find this less than attractive, especially since the catalog version
is a property of the binaries and not the data directory. Attached is a
patchset so that the above can become simply:
catversion=$($bin/pg_config --catversion)
and a second patch that adds a read-only guc to get at it on the SQL
level using SHOW catalog_version; or similar. I need that because I
also do a dump of pg_settings and I would like for it to appear there.
Please consider.
--
Vik Fearing
Attachments:
0001-Add-catalog-version-to-pg_config.patchtext/x-patch; charset=UTF-8; name=0001-Add-catalog-version-to-pg_config.patchDownload
From a06fd975ef14930bbef2dac3597272289d6b10eb Mon Sep 17 00:00:00 2001
From: Vik Fearing <vik@postgresfriends.org>
Date: Fri, 13 Nov 2020 11:55:58 +0100
Subject: [PATCH 1/2] Add catalog version to pg_config
---
doc/src/sgml/ref/pg_config-ref.sgml | 9 +++++++++
src/bin/pg_config/pg_config.c | 2 ++
src/common/config_info.c | 7 ++++++-
3 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/ref/pg_config-ref.sgml b/doc/src/sgml/ref/pg_config-ref.sgml
index e177769188..826cd5bf07 100644
--- a/doc/src/sgml/ref/pg_config-ref.sgml
+++ b/doc/src/sgml/ref/pg_config-ref.sgml
@@ -268,6 +268,15 @@ PostgreSQL documentation
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>--catversion</option></term>
+ <listitem>
+ <para>
+ Print the version number of <productname>PostgreSQL</productname>'s catalog.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>--version</option></term>
<listitem>
diff --git a/src/bin/pg_config/pg_config.c b/src/bin/pg_config/pg_config.c
index c40bb3282e..ca048f6eff 100644
--- a/src/bin/pg_config/pg_config.c
+++ b/src/bin/pg_config/pg_config.c
@@ -64,6 +64,7 @@ static const InfoItem info_items[] = {
{"--ldflags_sl", "LDFLAGS_SL"},
{"--libs", "LIBS"},
{"--version", "VERSION"},
+ {"--catversion", "CATALOG_VERSION"},
{NULL, NULL}
};
@@ -99,6 +100,7 @@ help(void)
printf(_(" --ldflags_ex show LDFLAGS_EX value used when PostgreSQL was built\n"));
printf(_(" --ldflags_sl show LDFLAGS_SL value used when PostgreSQL was built\n"));
printf(_(" --libs show LIBS value used when PostgreSQL was built\n"));
+ printf(_(" --catversion show the PostgreSQL catalog version\n"));
printf(_(" --version show the PostgreSQL version\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nWith no arguments, all known items are shown.\n\n"));
diff --git a/src/common/config_info.c b/src/common/config_info.c
index e72e7292a0..30c3168890 100644
--- a/src/common/config_info.c
+++ b/src/common/config_info.c
@@ -20,6 +20,7 @@
#include "postgres_fe.h"
#endif
+#include "catalog/catversion.h"
#include "common/config_info.h"
@@ -38,7 +39,7 @@ get_configdata(const char *my_exec_path, size_t *configdata_len)
int i = 0;
/* Adjust this to match the number of items filled below */
- *configdata_len = 23;
+ *configdata_len = 24;
configdata = (ConfigData *) palloc(*configdata_len * sizeof(ConfigData));
configdata[i].name = pstrdup("BINDIR");
@@ -191,6 +192,10 @@ get_configdata(const char *my_exec_path, size_t *configdata_len)
#endif
i++;
+ configdata[i].name = pstrdup("CATALOG_VERSION");
+ configdata[i].setting = pstrdup(psprintf("%d", CATALOG_VERSION_NO));
+ i++;
+
configdata[i].name = pstrdup("VERSION");
configdata[i].setting = pstrdup("PostgreSQL " PG_VERSION);
i++;
--
2.25.1
0002-Add-catalog_version-guc-accessible-from-SQL.patchtext/x-patch; charset=UTF-8; name=0002-Add-catalog_version-guc-accessible-from-SQL.patchDownload
From 05d0bca68096dba522075fa76ff1af490a050107 Mon Sep 17 00:00:00 2001
From: Vik Fearing <vik@postgresfriends.org>
Date: Fri, 13 Nov 2020 11:56:03 +0100
Subject: [PATCH 2/2] Add catalog_version guc, accessible from SQL
---
doc/src/sgml/config.sgml | 14 ++++++++++++++
src/backend/utils/misc/check_guc | 2 +-
src/backend/utils/misc/guc.c | 14 ++++++++++++++
3 files changed, 29 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e81141e45c..0b877b531c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -9725,6 +9725,20 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
</listitem>
</varlistentry>
+ <varlistentry id="guc-catalog-version" xreflabel="catalog_version">
+ <term><varname>catalog_version</varname> (<type>string</type>)
+ <indexterm>
+ <primary><varname>catalog_version</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Reports the catalog version of the server. It is determined by the
+ value of <literal>CATALOG_VERSION_NO</literal> when building the server.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-data-checksums" xreflabel="data_checksums">
<term><varname>data_checksums</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index b171ef0e4f..a45f7f1727 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -16,7 +16,7 @@
## if an option is valid but shows up in only one file (guc.c but not
## postgresql.conf.sample), it should be listed here so that it
## can be ignored
-INTENTIONALLY_NOT_INCLUDED="debug_deadlocks in_hot_standby \
+INTENTIONALLY_NOT_INCLUDED="catalog_version_num debug_deadlocks in_hot_standby \
is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
pre_auth_delay role seed server_encoding server_version server_version_num \
session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 00018abb7d..a619a17922 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -37,6 +37,7 @@
#include "access/twophase.h"
#include "access/xact.h"
#include "access/xlog_internal.h"
+#include "catalog/catversion.h"
#include "catalog/namespace.h"
#include "catalog/pg_authid.h"
#include "catalog/storage.h"
@@ -598,6 +599,7 @@ static char *locale_ctype;
static char *server_encoding_string;
static char *server_version_string;
static int server_version_num;
+static int catalog_version_no;
static char *timezone_string;
static char *log_timezone_string;
static char *timezone_abbreviations_string;
@@ -3367,6 +3369,18 @@ static struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ /* Can't be set in postgresql.conf */
+ {"catalog_version", PGC_INTERNAL, PRESET_OPTIONS,
+ gettext_noop("Shows the catalog version number."),
+ NULL,
+ GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &catalog_version_no,
+ CATALOG_VERSION_NO, CATALOG_VERSION_NO, CATALOG_VERSION_NO,
+ NULL, NULL, NULL
+ },
+
{
{"log_temp_files", PGC_SUSET, LOGGING_WHAT,
gettext_noop("Log the use of temporary files larger than this number of kilobytes."),
--
2.25.1
Hi,
On 2021-02-22 00:15:20 +0100, Vik Fearing wrote:
I do some very regular testing on HEAD and my scripts need to know if
the catalog version has changed to determine if it needs to pg_restore
or if a basebackup is okay. In order to get it, I have to do this:# Get the catalog version (there is no better way to do this)
tmp=$(mktemp --directory)
$bin/initdb --pgdata=$tmp
catversion=$($bin/pg_controldata $tmp | grep "Catalog version" \
| cut --delimiter=: --fields=2 | xargs)
rm --recursive --force $tmp
That's a pretty heavy way to do it. If you have access to pg_config, you
could just do
grep '^#define CATALOG_VER' $(pg_config --includedir)/server/catalog/catversion.h|awk '{print $3}'
I find this less than attractive, especially since the catalog version
is a property of the binaries and not the data directory. Attached is a
patchset so that the above can become simply:catversion=$($bin/pg_config --catversion)
Seems roughly reasonable. Although I wonder if we rather should make it
something more generic than just catversion? E.g. a wal page magic bump
will also require a dump/restore or pg_upgrade, but won't be detected by
just doing this. So perhaps we should instead add a pg_config option
showing all the different versions that influence on-disk compatibility?
Greetings,
Andres Freund
On 2/22/21 12:48 AM, Andres Freund wrote:
Hi,
On 2021-02-22 00:15:20 +0100, Vik Fearing wrote:
I do some very regular testing on HEAD and my scripts need to know if
the catalog version has changed to determine if it needs to pg_restore
or if a basebackup is okay. In order to get it, I have to do this:# Get the catalog version (there is no better way to do this)
tmp=$(mktemp --directory)
$bin/initdb --pgdata=$tmp
catversion=$($bin/pg_controldata $tmp | grep "Catalog version" \
| cut --delimiter=: --fields=2 | xargs)
rm --recursive --force $tmpThat's a pretty heavy way to do it.
That's what I thought, too!
If you have access to pg_config, you
could just do
grep '^#define CATALOG_VER' $(pg_config --includedir)/server/catalog/catversion.h|awk '{print $3}'
Oh thanks. That's much better.
I find this less than attractive, especially since the catalog version
is a property of the binaries and not the data directory. Attached is a
patchset so that the above can become simply:catversion=$($bin/pg_config --catversion)
Seems roughly reasonable. Although I wonder if we rather should make it
something more generic than just catversion? E.g. a wal page magic bump
will also require a dump/restore or pg_upgrade, but won't be detected by
just doing this. So perhaps we should instead add a pg_config option
showing all the different versions that influence on-disk compatibility?
Do you mean one single thing somehow lumped together, or one for each
version number?
--
Vik Fearing
On Sun, Feb 21, 2021, at 8:15 PM, Vik Fearing wrote:
and a second patch that adds a read-only guc to get at it on the SQL
level using SHOW catalog_version; or similar. I need that because I
also do a dump of pg_settings and I would like for it to appear there.
The catalog version number is already available in pg_control_system().
postgres=# select * from pg_control_system();
-[ RECORD 1 ]------------+-----------------------
pg_control_version | 1300
catalog_version_no | 202007201
system_identifier | 6931867587550812316
pg_control_last_modified | 2021-02-21 20:59:06-03
--
Euler Taveira
EDB https://www.enterprisedb.com/
Vik Fearing <vik@postgresfriends.org> writes:
On 2/22/21 12:48 AM, Andres Freund wrote:
Seems roughly reasonable. Although I wonder if we rather should make it
something more generic than just catversion? E.g. a wal page magic bump
will also require a dump/restore or pg_upgrade, but won't be detected by
just doing this. So perhaps we should instead add a pg_config option
showing all the different versions that influence on-disk compatibility?
Do you mean one single thing somehow lumped together, or one for each
version number?
FWIW, I think asking pg_config about this is a guaranteed way of having
version-skew-like bugs. If we're going to bother with providing a way
to get this info, we should make it possible to ask the running server.
(That would open up some security questions: do we want to let
unprivileged users know this info? I guess if version() is not
protected then this needn't be either.)
regards, tom lane
Hi,
On 2021-02-21 19:54:01 -0500, Tom Lane wrote:
Vik Fearing <vik@postgresfriends.org> writes:
On 2/22/21 12:48 AM, Andres Freund wrote:
Seems roughly reasonable. Although I wonder if we rather should make it
something more generic than just catversion? E.g. a wal page magic bump
will also require a dump/restore or pg_upgrade, but won't be detected by
just doing this. So perhaps we should instead add a pg_config option
showing all the different versions that influence on-disk compatibility?Do you mean one single thing somehow lumped together, or one for each
version number?FWIW, I think asking pg_config about this is a guaranteed way of having
version-skew-like bugs.
Could you elaborate a bit?
If we're going to bother with providing a way
to get this info, we should make it possible to ask the running server.
In Vik's case there is no running server to ask, IIUC. He has an
existing cluster running an "old" set of binaries, and a set of
binaries. He wants to know if he needs to pg_upgrade, or can start from
a basebackup. The old version he can get from pg_controldata, or the
catalog. But except for initdb'ing a throw-away cluster, there's no way
to get that for the new cluster that doesn't involve grepping headers...
(That would open up some security questions: do we want to let
unprivileged users know this info? I guess if version() is not
protected then this needn't be either.)
I don't see a reason it'd need to be protected. Furthermore, the ship
has sailed:
SELECT catalog_version_no FROM pg_control_system();
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2021-02-21 19:54:01 -0500, Tom Lane wrote:
FWIW, I think asking pg_config about this is a guaranteed way of having
version-skew-like bugs.
Could you elaborate a bit?
How do you know that the pg_config you found has anything to do with the
server you're connected to?
If we're going to bother with providing a way
to get this info, we should make it possible to ask the running server.
In Vik's case there is no running server to ask, IIUC.
Hm. If you're about to initdb or start the server then there's more
reason to think you can find a matching pg_config. Still, pg_config
is not going to tell you what is actually in the data directory, so
it's not clear to me how it helps with "do I need to initdb?".
He has an
existing cluster running an "old" set of binaries, and a set of
binaries. He wants to know if he needs to pg_upgrade, or can start from
a basebackup. The old version he can get from pg_controldata, or the
catalog. But except for initdb'ing a throw-away cluster, there's no way
to get that for the new cluster that doesn't involve grepping headers...
For production cases it'd be sufficient to compare pg_config --version
output. I suppose if you want to automate this for development versions
you'd wish for something more fine-grained ... but I think you can
already use configure's --with-extra-version to stick catversion or
whatever into the --version string.
regards, tom lane
Hi,
On 2021-02-21 20:53:52 -0500, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
If we're going to bother with providing a way
to get this info, we should make it possible to ask the running server.In Vik's case there is no running server to ask, IIUC.
Hm. If you're about to initdb or start the server then there's more
reason to think you can find a matching pg_config. Still, pg_config
is not going to tell you what is actually in the data directory, so
it's not clear to me how it helps with "do I need to initdb?".
Imagine trying to run regular tests of HEAD, where the tests require a
large database to be loaded. Re-loading the data for every [few] commits
is prohibitively time consuming, and even just running pg_upgrade is
painful. So you'd like to re-use a "template" data directory with the
data loaded if possible (i.e. no catversion / WAL / ... version bumps),
and a pg_upgrade otherwise.
In such a situation it's easy to access the catalog version for the
existing data directory (pg_controldata, or pg_control_system()) - but
there's no convenient way to figure out what the catversion of the
to-be-tested version will be. Vik's approach to figuring that out was
initdb'ing a throw-away data directory, using pg_controldata, and
discarding that data directory - not pretty. There's no version skew
issue here as far as I can tell, given he's initdbing freshly anyway.
The only argument I see against such an option is that arguably just
grepping for the information in the headers isn't too hard. But that's
then something everybody has to do, there's the issue of plain unix
commands not working on windows, etc.
Greetings,
Andres Freund
On 2/22/21 3:24 AM, Andres Freund wrote:
Imagine trying to run regular tests of HEAD, where the tests require a
large database to be loaded. Re-loading the data for every [few] commits
is prohibitively time consuming, and even just running pg_upgrade is
painful. So you'd like to re-use a "template" data directory with the
data loaded if possible (i.e. no catversion / WAL / ... version bumps),
and a pg_upgrade otherwise.
This is exactly what I am doing.
--
Vik Fearing
On 22.02.21 08:00, Vik Fearing wrote:
On 2/22/21 3:24 AM, Andres Freund wrote:
Imagine trying to run regular tests of HEAD, where the tests require a
large database to be loaded. Re-loading the data for every [few] commits
is prohibitively time consuming, and even just running pg_upgrade is
painful. So you'd like to re-use a "template" data directory with the
data loaded if possible (i.e. no catversion / WAL / ... version bumps),
and a pg_upgrade otherwise.This is exactly what I am doing.
If what you want to know is whether a given binary can run against a
given data directory then CATALOG_VERSION_NO isn't the only thing you
need to check. The full truth of this is in ReadControlFile(). The
best way to get that answer is to start a server and see if it
complains. You can even grep the log for "It looks like you need to
initdb.".
On 3/3/21 6:35 PM, Peter Eisentraut wrote:
On 22.02.21 08:00, Vik Fearing wrote:
On 2/22/21 3:24 AM, Andres Freund wrote:
Imagine trying to run regular tests of HEAD, where the tests require a
large database to be loaded. Re-loading the data for every [few] commits
is prohibitively time consuming, and even just running pg_upgrade is
painful. So you'd like to re-use a "template" data directory with the
data loaded if possible (i.e. no catversion / WAL / ... version bumps),
and a pg_upgrade otherwise.This is exactly what I am doing.
If what you want to know is whether a given binary can run against a
given data directory then CATALOG_VERSION_NO isn't the only thing you
need to check. The full truth of this is in ReadControlFile(). The
best way to get that answer is to start a server and see if it
complains. You can even grep the log for "It looks like you need to
initdb.".
In that case, what would everyone think about a `pg_ctl check` option?
--
Vik Fearing
Vik Fearing <vik@postgresfriends.org> writes:
On 3/3/21 6:35 PM, Peter Eisentraut wrote:
If what you want to know is whether a given binary can run against a
given data directory then CATALOG_VERSION_NO isn't the only thing you
need to check. The full truth of this is in ReadControlFile(). The
best way to get that answer is to start a server and see if it
complains. You can even grep the log for "It looks like you need to
initdb.".
In that case, what would everyone think about a `pg_ctl check` option?
The trouble with Peter's recipe is that it doesn't work if there is
already a server instance running there (or at least I think we'll
bitch about the existing postmaster first, maybe I'm wrong). Now,
that's not such a big problem for the use-case you were describing.
But I bet if we expose this method as an apparently-general-purpose
pg_ctl option, there'll be complaints.
regards, tom lane
On Wed, Mar 3, 2021 at 8:16 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Vik Fearing <vik@postgresfriends.org> writes:
On 3/3/21 6:35 PM, Peter Eisentraut wrote:
If what you want to know is whether a given binary can run against a
given data directory then CATALOG_VERSION_NO isn't the only thing you
need to check. The full truth of this is in ReadControlFile(). The
best way to get that answer is to start a server and see if it
complains. You can even grep the log for "It looks like you need to
initdb.".In that case, what would everyone think about a `pg_ctl check` option?
The trouble with Peter's recipe is that it doesn't work if there is
already a server instance running there (or at least I think we'll
bitch about the existing postmaster first, maybe I'm wrong). Now,
that's not such a big problem for the use-case you were describing.
But I bet if we expose this method as an apparently-general-purpose
pg_ctl option, there'll be complaints.
Another option could be to provide a switch to the postmaster binary.
Using pg_config as originally suggested is risky because you might
pick up the wrong postmaster, but if you put it on the actual
postmaster binary you certainly know which one you're on... As this is
something that's primarily of interest to developers, it's also a bit
lower weight than having a "heavy" solution like an entire mode for
pg_ctl.
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
On 4/8/21, 2:58 AM, "Magnus Hagander" <magnus@hagander.net> wrote:
Another option could be to provide a switch to the postmaster binary.
Using pg_config as originally suggested is risky because you might
pick up the wrong postmaster, but if you put it on the actual
postmaster binary you certainly know which one you're on... As this is
something that's primarily of interest to developers, it's also a bit
lower weight than having a "heavy" solution like an entire mode for
pg_ctl.
I was looking at the --check switch for the postgres binary recently
[0]: /messages/by-id/0545F7B3-70C0-4DE8-8C85-EAFE6113B7EE@amazon.com
In the attached patch, --check will also check the control file if one
exists.
Nathan
[0]: /messages/by-id/0545F7B3-70C0-4DE8-8C85-EAFE6113B7EE@amazon.com
Attachments:
v1-0001-Check-control-file-in-check-mode.patchapplication/octet-stream; name=v1-0001-Check-control-file-in-check-mode.patchDownload
From bb720f804c9c2c1d5457d6c56260f7f526822e0a Mon Sep 17 00:00:00 2001
From: Nathan Bossart <bossartn@amazon.com>
Date: Mon, 16 Aug 2021 18:07:20 +0000
Subject: [PATCH v1 1/1] Check control file in --check mode.
This enables developers to use --check mode to ensure that a given
binary can run against a given data directory. If no control file
exists, we skip this new check.
---
src/backend/bootstrap/bootstrap.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 48615c0ebc..9d27dc09a7 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -16,6 +16,7 @@
#include <unistd.h>
#include <signal.h>
+#include <sys/stat.h>
#include "access/genam.h"
#include "access/heapam.h"
@@ -319,6 +320,20 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
CreateDataDirLockFile(false);
+ /* if we see a control file in check_only mode, check it, too */
+ if (check_only)
+ {
+ struct stat buf;
+
+ if (stat(XLOG_CONTROL_FILE, &buf) == 0)
+ LocalProcessControlFile(false);
+ else if (errno != ENOENT)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not stat file \"%s\": %m",
+ XLOG_CONTROL_FILE)));
+ }
+
SetProcessingMode(BootstrapProcessing);
IgnoreSystemIndexes = true;
--
2.16.6
On Mon, Aug 16, 2021 at 06:12:54PM +0000, Bossart, Nathan wrote:
I was looking at the --check switch for the postgres binary recently
[0], and this sounds like something that might fit in nicely there.
In the attached patch, --check will also check the control file if one
exists.
This would not work on a running postmaster as CreateDataDirLockFile()
is called beforehand, but we want this capability, no?
Abusing a bootstrap option for this purpose does not look like a good
idea, to be honest, especially for something only used internally now
and undocumented, but we want to use something aimed at an external
use with some documentation. Using a separate switch would be more
adapted IMO. Also, I think that we should be careful with the read of
the control file to avoid false positives? We can rely on an atomic
read/write thanks to its maximum size of 512 bytes, but this looks
like a lot what has been done recently with postgres -C for runtime
GUCs, that *require* a read of the control file before grabbing the
values we are interested in.
--
Michael
Thanks for taking a look!
On 1/23/22, 7:31 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
On Mon, Aug 16, 2021 at 06:12:54PM +0000, Bossart, Nathan wrote:
I was looking at the --check switch for the postgres binary recently
[0], and this sounds like something that might fit in nicely there.
In the attached patch, --check will also check the control file if one
exists.This would not work on a running postmaster as CreateDataDirLockFile()
is called beforehand, but we want this capability, no?
I was not under the impression this was a requirement, based on the
use-case discussed upthread [0]/messages/by-id/20210222022407.ecaygvx2ise6uwyz@alap3.anarazel.de.
Abusing a bootstrap option for this purpose does not look like a good
idea, to be honest, especially for something only used internally now
and undocumented, but we want to use something aimed at an external
use with some documentation. Using a separate switch would be more
adapted IMO.
This is the opposite of what Magnus proposed earlier [1]/messages/by-id/CABUevEySovaEDci7c0DXOrV6c7JzWqYzfVwOiRUJxMao=9seEw@mail.gmail.com. Do we need
a new pg_ctl option for this? It seems like it is really only
intended for use by PostgreSQL developers, but perhaps there are other
use-cases I am not thinking of. In any case, the pg_ctl option would
probably end up using --check (or another similar mode) behind the
scenes.
Also, I think that we should be careful with the read of
the control file to avoid false positives? We can rely on an atomic
read/write thanks to its maximum size of 512 bytes, but this looks
like a lot what has been done recently with postgres -C for runtime
GUCs, that *require* a read of the control file before grabbing the
values we are interested in.
Sorry, I'm not following this one. In my proposed patch, the control
file (if one exists) is read after CreateDataDirLockFile(), just like
PostmasterMain().
Nathan
[0]: /messages/by-id/20210222022407.ecaygvx2ise6uwyz@alap3.anarazel.de
[1]: /messages/by-id/CABUevEySovaEDci7c0DXOrV6c7JzWqYzfVwOiRUJxMao=9seEw@mail.gmail.com
On Mon, Jan 24, 2022 at 08:40:08PM +0000, Bossart, Nathan wrote:
On 1/23/22, 7:31 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
On Mon, Aug 16, 2021 at 06:12:54PM +0000, Bossart, Nathan wrote:
I was looking at the --check switch for the postgres binary recently
[0], and this sounds like something that might fit in nicely there.
In the attached patch, --check will also check the control file if one
exists.This would not work on a running postmaster as CreateDataDirLockFile()
is called beforehand, but we want this capability, no?I was not under the impression this was a requirement, based on the
use-case discussed upthread [0].
Hmm. I got a different impression as of this one:
/messages/by-id/3496407.1613955241@sss.pgh.pa.us
But I can see downthread that this is not the case. Sorry for the
confusion.
Abusing a bootstrap option for this purpose does not look like a good
idea, to be honest, especially for something only used internally now
and undocumented, but we want to use something aimed at an external
use with some documentation. Using a separate switch would be more
adapted IMO.This is the opposite of what Magnus proposed earlier [1]. Do we need
a new pg_ctl option for this? It seems like it is really only
intended for use by PostgreSQL developers, but perhaps there are other
use-cases I am not thinking of. In any case, the pg_ctl option would
probably end up using --check (or another similar mode) behind the
scenes.
Based on the latest state of the thread, I am understanding that we
don't want a new option for pg_ctl for this feature, and using a
bootstrap's --check for this purpose is not a good idea IMO. What I
guess from Magnus' suggestion would be to add a completely different
switch.
Once you remove the requirement of a running server, we have basically
what has been recently implemented with postgres -C for
runtime-computed GUCs, because we already go through a read of the
control file to be able to print those GUCs with their correct
values. This also means that it is already possible to check if a
data folder is compatible with a set of binaries with this facility,
as any postgres -C command with a runtime GUC would trigger this
check. Using any of the existing runtime GUCs may be confusing, but
that would work. And I am not really convinced that we have any need
to add a specific GUC for this purpose, be it a sort of
is_controlfile_valid or controlfile_checksum (CRC32 of the control
file).
Also, I think that we should be careful with the read of
the control file to avoid false positives? We can rely on an atomic
read/write thanks to its maximum size of 512 bytes, but this looks
like a lot what has been done recently with postgres -C for runtime
GUCs, that *require* a read of the control file before grabbing the
values we are interested in.Sorry, I'm not following this one. In my proposed patch, the control
file (if one exists) is read after CreateDataDirLockFile(), just like
PostmasterMain().
While looking at the patch, I was thinking about the fact that we may
want to support the case even if a server is running. If we don't, my
worries about the concurrent control file activities are moot.
--
Michael
On Tue, Jan 25, 2022 at 01:12:32PM +0900, Michael Paquier wrote:
Once you remove the requirement of a running server, we have basically
what has been recently implemented with postgres -C for
runtime-computed GUCs, because we already go through a read of the
control file to be able to print those GUCs with their correct
values. This also means that it is already possible to check if a
data folder is compatible with a set of binaries with this facility,
as any postgres -C command with a runtime GUC would trigger this
check. Using any of the existing runtime GUCs may be confusing, but
that would work. And I am not really convinced that we have any need
to add a specific GUC for this purpose, be it a sort of
is_controlfile_valid or controlfile_checksum (CRC32 of the control
file).
Thinking more about this one, we can already do that, so I have
marked the patch as RwF. Perhaps we could just add a GUC, but that
feels a bit dummy.
--
Michael
On Mon, Jan 31, 2022 at 04:57:13PM +0900, Michael Paquier wrote:
On Tue, Jan 25, 2022 at 01:12:32PM +0900, Michael Paquier wrote:
Once you remove the requirement of a running server, we have basically
what has been recently implemented with postgres -C for
runtime-computed GUCs, because we already go through a read of the
control file to be able to print those GUCs with their correct
values. This also means that it is already possible to check if a
data folder is compatible with a set of binaries with this facility,
as any postgres -C command with a runtime GUC would trigger this
check. Using any of the existing runtime GUCs may be confusing, but
that would work. And I am not really convinced that we have any need
to add a specific GUC for this purpose, be it a sort of
is_controlfile_valid or controlfile_checksum (CRC32 of the control
file).Thinking more about this one, we can already do that, so I have
marked the patch as RwF. Perhaps we could just add a GUC, but that
feels a bit dummy.
Sorry, I missed this thread earlier. You're right, we can just do
something like the following to achieve basically the same result:
postgres -D . -C data_checksums
Unless Vik has any objections, this can probably be marked as Withdrawn.
Perhaps we can look into providing a new option for "postgres" at some
point in the future, but I don't sense a ton of demand at the moment.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com