Always bump PG_CONTROL_VERSION?
Hackers,
I would like to propose bumping PG_CONTROL_VERSION with each release
even if there are no changes to the ControlFileData struct. Historically
PG_CONTROL_VERSION has only been bumped when there were changes to
ControlFileData.
pgBackRest uses PG_CONTROL_VERSION to identify the version of PostgreSQL
when it is not running. If PG_CONTROL_VERSION does not change from a
prior version then we also use CATALOG_VERSION_NO to uniquely identify
the version.
This works fine, but is pretty fragile during the alpha/beta releases
when CATALOG_VERSION_NO is likely to change with each release. Of
course, PG_CONTROL_VERSION might change as well but this seems to be
extremely rare for an alpha/beta release.
There are a few commits like eeca4cd3 and 99dd8b05a that would seem to
argue that bumping PG_CONTROL_VERSION at least once for each release is
a good idea in general. It doesn't seem too useful to be able to run
pg_resetwal or pg_controldata against another version, in the few cases
that it would actually work, e.g. 9.6/9.5.
Thoughts?
--
-David
david@pgmasters.net
On 2021-May-12, David Steele wrote:
pgBackRest uses PG_CONTROL_VERSION to identify the version of PostgreSQL
when it is not running. If PG_CONTROL_VERSION does not change from a prior
version then we also use CATALOG_VERSION_NO to uniquely identify the
version.
Why don't you use the PG_VERSION file in the datadir?
--
�lvaro Herrera 39�49'30"S 73�17'W
"I dream about dreams about dreams", sang the nightingale
under the pale moon (Sandman)
Hi,
On 2021-05-12 14:58:11 -0400, David Steele wrote:
pgBackRest uses PG_CONTROL_VERSION to identify the version of PostgreSQL
when it is not running. If PG_CONTROL_VERSION does not change from a prior
version then we also use CATALOG_VERSION_NO to uniquely identify the
version.
Why aren't you using PG_VERSION?
Greetings,
Andres Freund
On 5/12/21 3:21 PM, Alvaro Herrera wrote:
On 2021-May-12, David Steele wrote:
pgBackRest uses PG_CONTROL_VERSION to identify the version of PostgreSQL
when it is not running. If PG_CONTROL_VERSION does not change from a prior
version then we also use CATALOG_VERSION_NO to uniquely identify the
version.Why don't you use the PG_VERSION file in the datadir?
Mostly because there is other data we need in pg_control and it is
simpler to read one file than two.
Regards,
--
-David
david@pgmasters.net
David Steele <david@pgmasters.net> writes:
On 5/12/21 3:21 PM, Alvaro Herrera wrote:
Why don't you use the PG_VERSION file in the datadir?
Mostly because there is other data we need in pg_control and it is
simpler to read one file than two.
I'm disinclined to change the longstanding rule in this area for
a reason as weak as that.
Even if we did change the rule going forward, you'd still need to
do it properly for existing releases, so I don't see that you're
going to save anything.
regards, tom lane
Hi,
On 2021-05-12 16:18:16 -0400, Tom Lane wrote:
Even if we did change the rule going forward, you'd still need to
do it properly for existing releases, so I don't see that you're
going to save anything.
It turns out that the last time a major version didn't have a unique
control file version was 9.5, I assume that's where David is coming
from.
That said, I don't think it's a good practice to use the control file
version as an identifier for the major version. Who knows, it might be
necessary to add an optional new format in a minor version at some point
or such crazyness. And then there's the beta stuff you'd mentioned, etc.
Greetings,
Andres Freund
On 5/12/21 4:18 PM, Tom Lane wrote:
David Steele <david@pgmasters.net> writes:
On 5/12/21 3:21 PM, Alvaro Herrera wrote:
Why don't you use the PG_VERSION file in the datadir?
Mostly because there is other data we need in pg_control and it is
simpler to read one file than two.I'm disinclined to change the longstanding rule in this area for
a reason as weak as that.
It's a bit more than that -- for instance we have pg_control in every
backup but in order to get PG_VERSION we may need to reference a prior
backup since it never changes. pg_control is also checked on every
archive_command/restore_command so reading an extra file adds up when
performance is paramount. There are also many unit tests that need to
write this data, etc.
In short, it would be very nice to have one place to get info about a
cluster.
Even if we did change the rule going forward, you'd still need to
do it properly for existing releases, so I don't see that you're
going to save anything.
It's not really a burden for existing releases. The issue is during the
alpha/beta phase when the CATALOG_VERSION_NO can change several times in
a few months.
Perhaps it was unwise to frame this in the requirements for an external
tool, but I still think eeca4cd3 and 99dd8b05a argue for it being a good
idea.
Or perhaps we could just add the version number to pg_control? At least
then pg_controldata could display the version.
Regards,
--
-David
david@pgmasters.net
On Wed, May 12, 2021 at 01:30:27PM -0700, Andres Freund wrote:
That said, I don't think it's a good practice to use the control file
version as an identifier for the major version. Who knows, it might be
necessary to add an optional new format in a minor version at some point
or such crazyness. And then there's the beta stuff you'd mentioned, etc.
Yes, PG_VERSION, as you wrote upthread already, is already fine for
the job, and FWIW, I have yet to see a case where being able to easily
detect the minor version in a data folder matters.
And, I am of the opinion to not change the control file version if
there is no need to do so.
--
Michael
On Thu, May 13, 2021 at 11:04:54AM +0900, Michael Paquier wrote:
On Wed, May 12, 2021 at 01:30:27PM -0700, Andres Freund wrote:
That said, I don't think it's a good practice to use the control file
version as an identifier for the major version. Who knows, it might be
necessary to add an optional new format in a minor version at some point
or such crazyness. And then there's the beta stuff you'd mentioned, etc.Yes, PG_VERSION, as you wrote upthread already, is already fine for
the job, and FWIW, I have yet to see a case where being able to easily
detect the minor version in a data folder matters.
Would it even make any sense? It could be "has version X.Y ever started the
cluster", or "was it the last version that started the cluster", or something
else?
And, I am of the opinion to not change the control file version if
there is no need to do so.
+1
On 5/12/21 10:04 PM, Michael Paquier wrote:
On Wed, May 12, 2021 at 01:30:27PM -0700, Andres Freund wrote:
That said, I don't think it's a good practice to use the control file
version as an identifier for the major version. Who knows, it might be
necessary to add an optional new format in a minor version at some point
or such crazyness. And then there's the beta stuff you'd mentioned, etc.Yes, PG_VERSION, as you wrote upthread already, is already fine for
the job, and FWIW, I have yet to see a case where being able to easily
detect the minor version in a data folder matters.
Regardless of PG_VERSION doing the job or not, shouldn't there be a bump
in PG_CONTROL_VERSION whenever there is a structural or semantic change
in the control file data? And wouldn't the easiest way to ensure that be
to bump the version with every release?
Also, can someone give me a good reason NOT to bump the version?
Thanks, Jan
--
Jan Wieck
Postgres User since 1994
Jan Wieck <jan@wi3ck.info> writes:
Regardless of PG_VERSION doing the job or not, shouldn't there be a bump
in PG_CONTROL_VERSION whenever there is a structural or semantic change
in the control file data? And wouldn't the easiest way to ensure that be
to bump the version with every release?
No, the way to do that is to change the version number in the commit
that changes the file's contents.
Also, can someone give me a good reason NOT to bump the version?
It creates unnecessary churn, not to mention a false sense of
complacency. Bumping the version in the commit that changes
things is not optional, because if you don't do that then you'll
probably burn some other developer also working on HEAD. So
I don't want people thinking they can skip this because it was
done at the beginning of the development cycle. We've learned
these things the hard way for CATVERSION. I think the only reason
that PG_CONTROL version or WAL version might seem different is
that we haven't changed them often enough for people to have fresh
memories of problems.
regards, tom lane
On 5/13/21 6:45 PM, Tom Lane wrote:
Bumping the version in the commit that changes
things is not optional, because if you don't do that then you'll
probably burn some other developer also working on HEAD. So
I don't want people thinking they can skip this because it was
done at the beginning of the development cycle.
And we make sure this is done how?
Regards, Jan
--
Jan Wieck
Postgres User since 1994
Jan Wieck <jan@wi3ck.info> writes:
On 5/13/21 6:45 PM, Tom Lane wrote:
Bumping the version in the commit that changes
things is not optional, because if you don't do that then you'll
probably burn some other developer also working on HEAD. So
I don't want people thinking they can skip this because it was
done at the beginning of the development cycle.
And we make sure this is done how?
Peer pressure? It's not that different from N other ways to
do a patch incorrectly, of course.
regards, tom lane
Hi,
On 2021-05-13 17:42:52 -0400, Jan Wieck wrote:
Also, can someone give me a good reason NOT to bump the version?
There's several types of tools (particularly around backup) that need to
parse control files. Unnecessarily increasing the numbers of versions
that need to be dealt with makes that a bit harder.
Greetings,
Andres Freund
On Mon, May 17, 2021 at 03:47:01PM -0700, Andres Freund wrote:
There's several types of tools (particularly around backup) that need to
parse control files. Unnecessarily increasing the numbers of versions
that need to be dealt with makes that a bit harder.
I am digressing here, sorry for that..
But it is worth noting that it is fun to debug issues where a user
changes the system locale and breaks some logic that parses the output
of pg_controldata because of the translations of the text fields.
I've really wondered over the years whether there should be more
switches to pick up only the field values, for the popular ones like
TLIs for example. pg_config does that.
--
Michael