pg_validatebackup -> pg_verifybackup?
Over at /messages/by-id/172c9d9b-1d0a-1b94-1456-376b1e017322@2ndquadrant.com
Peter Eisentraut suggests that pg_validatebackup should be called
pg_verifybackup, with corresponding terminology changes throughout the
code and documentation.
Here's a patch for that. I'd like to commit this quickly or abandon it
quickly, because large renaming patches like this are a pain to
maintain. I believe that there was a mild consensus in favor of this
on that thread, so I plan to go forward unless somebody shows up
pretty quickly to object.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
0001-Rename-pg_validatebackup-to-pg_verifybackup.patch (application/octet-stream, +141 -142)
Robert Haas <robertmhaas@gmail.com> writes:
Over at /messages/by-id/172c9d9b-1d0a-1b94-1456-376b1e017322@2ndquadrant.com
Peter Eisentraut suggests that pg_validatebackup should be called
pg_verifybackup, with corresponding terminology changes throughout the
code and documentation.
Here's a patch for that. I'd like to commit this quickly or abandon it
quickly, because large renaming patches like this are a pain to
maintain. I believe that there was a mild consensus in favor of this
on that thread, so I plan to go forward unless somebody shows up
pretty quickly to object.
+1, let's get it done.
regards, tom lane
On 4/10/20 11:37 AM, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Over at /messages/by-id/172c9d9b-1d0a-1b94-1456-376b1e017322@2ndquadrant.com
Peter Eisentraut suggests that pg_validatebackup should be called
pg_verifybackup, with corresponding terminology changes throughout the
code and documentation.
Here's a patch for that. I'd like to commit this quickly or abandon it
quickly, because large renaming patches like this are a pain to
maintain. I believe that there was a mild consensus in favor of this
on that thread, so I plan to go forward unless somebody shows up
pretty quickly to object.
+1, let's get it done.
I'm not sure that Peter suggested verify was the correct name, he just
pointed out that verify and validate are not necessarily the same thing
(and that we should be consistent in the docs one way or the other).
It'd be nice if Peter (now CC'd) commented since he's the one who
brought it up.
Having said that, I'm +1 on verify.
Regards,
--
-David
david@pgmasters.net
David Steele <david@pgmasters.net> writes:
Having said that, I'm +1 on verify.
Me too, if only because it's shorter.
regards, tom lane
On 4/10/20 3:27 PM, Tom Lane wrote:
David Steele <david@pgmasters.net> writes:
Having said that, I'm +1 on verify.
Me too, if only because it's shorter.
I also think it is (probably) more correct but failing that it is
*definitely* shorter!
--
-David
david@pgmasters.net
Hi,
On 2020-04-10 14:56:48 -0400, David Steele wrote:
On 4/10/20 11:37 AM, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Over at /messages/by-id/172c9d9b-1d0a-1b94-1456-376b1e017322@2ndquadrant.com
Peter Eisentraut suggests that pg_validatebackup should be called
pg_verifybackup, with corresponding terminology changes throughout the
code and documentation.
Here's a patch for that. I'd like to commit this quickly or abandon it
quickly, because large renaming patches like this are a pain to
maintain. I believe that there was a mild consensus in favor of this
on that thread, so I plan to go forward unless somebody shows up
pretty quickly to object.
+1, let's get it done.
I'm not sure that Peter suggested verify was the correct name, he just
pointed out that verify and validate are not necessarily the same thing (and
that we should be consistent in the docs one way or the other). It'd be nice
if Peter (now CC'd) commented since he's the one who brought it up.
Having said that, I'm +1 on verify.
FWIW, I still think it's a mistake to accumulate all these bespoke
tools. We should go towards having one tool that can verify checksums,
validate backup manifests etc. Partially because it's more discoverable,
but also because it allows verifying multiple such properties in a
single pass, rather than reading the huge base backup twice.
Greetings,
Andres Freund
Greetings,
* Andres Freund (andres@anarazel.de) wrote:
On 2020-04-10 14:56:48 -0400, David Steele wrote:
On 4/10/20 11:37 AM, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Over at /messages/by-id/172c9d9b-1d0a-1b94-1456-376b1e017322@2ndquadrant.com
Peter Eisentraut suggests that pg_validatebackup should be called
pg_verifybackup, with corresponding terminology changes throughout the
code and documentation.
Here's a patch for that. I'd like to commit this quickly or abandon it
quickly, because large renaming patches like this are a pain to
maintain. I believe that there was a mild consensus in favor of this
on that thread, so I plan to go forward unless somebody shows up
pretty quickly to object.
+1, let's get it done.
I'm not sure that Peter suggested verify was the correct name, he just
pointed out that verify and validate are not necessarily the same thing (and
that we should be consistent in the docs one way or the other). It'd be nice
if Peter (now CC'd) commented since he's the one who brought it up.
Having said that, I'm +1 on verify.
FWIW, I still think it's a mistake to accumulate all these bespoke
tools. We should go towards having one tool that can verify checksums,
validate backup manifests etc. Partially because it's more discoverable,
but also because it allows verifying multiple such properties in a
single pass, rather than reading the huge base backup twice.
Would be kinda neat to have a single tool for doing backups and restores
too, as well as validating backup manifests and checksums, that can back
up to s3 or to a remote system with ssh, has multiple compression
options and a pretty sound architecture that's all written in C and is
OSS.
I also agree with Tom/David that verify probably makes sense for this
command, in its current form at least.
Thanks,
Stephen
Andres Freund <andres@anarazel.de> writes:
FWIW, I still think it's a mistake to accumulate all these bespoke
tools. We should go towards having one tool that can verify checksums,
validate backup manifests etc. Partially because it's more discoverable,
but also because it allows verifying multiple such properties in a
single pass, rather than reading the huge base backup twice.
Well, we're not getting there for v13. Are you proposing that this
patch just be reverted because it doesn't do everything at once?
regards, tom lane
Hi,
On 2020-04-10 16:13:18 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
FWIW, I still think it's a mistake to accumulate all these bespoke
tools. We should go towards having one tool that can verify checksums,
validate backup manifests etc. Partially because it's more discoverable,
but also because it allows verifying multiple such properties in a
single pass, rather than reading the huge base backup twice.
Well, we're not getting there for v13. Are you proposing that this
patch just be reverted because it doesn't do everything at once?
No. I suggest choosing a name that's compatible with moving more
capabilities under the same umbrella at a later time (and I suggested
the same pre freeze too). Possibly adding a toplevel --verify-manifest
option as the only change besides naming.
Andres
Andres Freund <andres@anarazel.de> writes:
On 2020-04-10 16:13:18 -0400, Tom Lane wrote:
Well, we're not getting there for v13. Are you proposing that this
patch just be reverted because it doesn't do everything at once?
No. I suggest choosing a name that's compatible with moving more
capabilities under the same umbrella at a later time (and I suggested
the same pre freeze too). Possibly adding a toplevel --verify-manifest
option as the only change besides naming.
It doesn't really seem like either name is problematic from that
standpoint? "Verify backup" isn't prejudging what aspect of the
backup is going to be verified, AFAICS.
regards, tom lane
Hi,
On 2020-04-10 16:40:02 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2020-04-10 16:13:18 -0400, Tom Lane wrote:
Well, we're not getting there for v13. Are you proposing that this
patch just be reverted because it doesn't do everything at once?
No. I suggest choosing a name that's compatible with moving more
capabilities under the same umbrella at a later time (and I suggested
the same pre freeze too). Possibly adding a toplevel --verify-manifest
option as the only change besides naming.
It doesn't really seem like either name is problematic from that
standpoint? "Verify backup" isn't prejudging what aspect of the
backup is going to be verified, AFAICS.
My point is that I'd eventually like to see the same tool also be usable
to just verify the checksums of a normal, non-backup, data directory.
We shouldn't end up with pg_verifybackup, pg_checksums,
pg_dbdir_checknofilesmissing, pg_checkpageheaders, ...
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2020-04-10 16:40:02 -0400, Tom Lane wrote:
It doesn't really seem like either name is problematic from that
standpoint? "Verify backup" isn't prejudging what aspect of the
backup is going to be verified, AFAICS.
My point is that I'd eventually like to see the same tool also be usable
to just verify the checksums of a normal, non-backup, data directory.
Meh. I would argue that that's an actively BAD idea. The use-cases
are entirely different, the implementation is going to be quite a lot
different, the relevant options are going to be quite a lot different.
It will not be better for either implementors or users to force those
into the same executable.
regards, tom lane
Hi,
On 2020-04-10 17:23:58 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2020-04-10 16:40:02 -0400, Tom Lane wrote:
It doesn't really seem like either name is problematic from that
standpoint? "Verify backup" isn't prejudging what aspect of the
backup is going to be verified, AFAICS.
My point is that I'd eventually like to see the same tool also be usable
to just verify the checksums of a normal, non-backup, data directory.
Meh. I would argue that that's an actively BAD idea. The use-cases
are entirely different, the implementation is going to be quite a lot
different, the relevant options are going to be quite a lot different.
It will not be better for either implementors or users to force those
into the same executable.
I don't agree with any of that. Combining the manifest validation with
checksum validation halves the IO. It allows offloading some of the
expense of verifying page-level checksums from the primary.
And all of the operations require iterating through data directories,
classifying files that are part / not part of a normal data directory, etc.
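To make the single-pass argument concrete, here is a minimal sketch in Python. It is not how pg_verifybackup or pg_checksums are implemented. It assumes the manifest records a SHA-256 digest for the file, and it reads the file once, feeding every block to both the whole-file digest and a per-page check. The names verify_file_single_pass and page_ok are hypothetical, and page_ok is only a placeholder for a real page-checksum routine, which is not reproduced here.

```python
import hashlib
from typing import Callable, List, Tuple

BLOCK_SIZE = 8192  # PostgreSQL's default page size


def verify_file_single_pass(
    path: str,
    expected_sha256: str,
    page_ok: Callable[[bytes], bool],
) -> Tuple[bool, List[int]]:
    """Read the file once and run both checks on every block.

    Returns (digest_matches, bad_block_numbers). page_ok is a
    hypothetical stand-in for a real page-level checksum check.
    """
    digest = hashlib.sha256()
    bad_blocks: List[int] = []
    blkno = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            digest.update(block)      # whole-file check (manifest)
            if not page_ok(block):    # per-page check (checksums)
                bad_blocks.append(blkno)
            blkno += 1
    return digest.hexdigest() == expected_sha256, bad_blocks
```

With two separate tools, each file would be opened and read in full once per tool; here the second check costs CPU but no additional reads.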
Greetings,
Andres Freund
On Fri, Apr 10, 2020 at 5:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Meh. I would argue that that's an actively BAD idea. The use-cases
are entirely different, the implementation is going to be quite a lot
different, the relevant options are going to be quite a lot different.
It will not be better for either implementors or users to force those
into the same executable.
I think Andres has a point, but on balance I am more inclined to agree
with you. It may be that in the future it will make sense to organize
things differently, but I would rather arrange them according to what
makes sense now, and then change it later, even though that makes for
some user-visible churn. The thing is that we don't really know what's
going to happen in future releases, and our track record when we try
to guess is very poor. Creating stuff that we hope will get extended
to do this or that in a future release can end up looking really
half-baked if the code doesn't get extended in the anticipated
direction.
I *would* like to find a way to address the proliferation of binaries,
because I've got other things I'd like to do that would require
creating still more of them, and until we come up with a scalable
solution that makes everybody happy, there's going to be progressively
more complaining every time. One possible solution is to adopt the
'git' approach and decide we're going to have one 'pg' command (or
whatever we call it). I think the way that 'git' does it is that all
of the real binaries are stored in a directory that users are not
expected to have in their path, and the 'git' wrapper just looks for
one based on the name of the subcommand. So, if you say 'git thunk',
it goes and searches the private bin directory for an executable
called 'git-thunk'. We could easily do this for nearly everything 'pg'
related, so:
clusterdb -> pg clusterdb
pg_ctl -> pg ctl
pg_resetwal -> pg resetwal
etc.
I think we'd want psql to still be separate, but I'm not sure we'd
need any other exceptions. In a lot of cases it won't lead to any more
typing because the current command is pg_whatever and with this
approach you'd just type a space instead of an underscore. The
"legacy" cases that don't start with "pg" would get a bit longer, but
I wouldn't lose a lot of sleep over that myself.
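A rough sketch of the wrapper idea, in Python for brevity. This is an illustration under assumptions, not a proposal for the actual implementation: the 'pg-<name>' naming scheme mirrors the 'git-thunk' example above, and the function names and the notion of a caller-supplied private bin directory are hypothetical.

```python
import os
import sys
from typing import List, Optional


def find_subcommand(name: str, private_bindir: str) -> Optional[str]:
    """Return the path of an executable 'pg-<name>' in private_bindir, or None."""
    # Reject empty names and anything containing a path separator.
    if not name or os.path.basename(name) != name:
        return None
    candidate = os.path.join(private_bindir, "pg-" + name)
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return None


def main(argv: List[str], private_bindir: str) -> int:
    """Dispatch 'pg <subcommand> [args...]' to the matching private binary."""
    if len(argv) < 2:
        print("usage: pg <subcommand> [args...]", file=sys.stderr)
        return 2
    target = find_subcommand(argv[1], private_bindir)
    if target is None:
        print(f"pg: '{argv[1]}' is not a pg command", file=sys.stderr)
        return 1
    # Replace the wrapper process with the real binary; does not return.
    os.execv(target, [target] + argv[2:])
```

Under this scheme, 'pg resetwal -f DATADIR' would end up running the hypothetical private binary 'pg-resetwal' with the arguments '-f DATADIR'.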
There are other possibilities too. We could try to pick out individual
groups of commands to merge, rather than having a unified framework
for everything. For instance, we could turn
{cluster,create,drop,reindex,vacuum}db into one utility,
{create,drop}user into another, pg_dump{,all} into a third, and so on.
But this seems like it would require making a lot more awkward policy
decisions, so I don't think it's a great plan.
Still, we need to agree on something. It won't do to tell people that
we're not going to commit patches to add more functionality to
PostgreSQL because it would involve adding more binaries. Any
particular piece of functionality may draw substantive objections, and
that's fine, but we shouldn't stifle development categorically because
we can't agree on how to clean up the namespace pollution.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2020-Apr-11, Robert Haas wrote:
I *would* like to find a way to address the proliferation of binaries,
because I've got other things I'd like to do that would require
creating still more of them, and until we come up with a scalable
solution that makes everybody happy, there's going to be progressively
more complaining every time. One possible solution is to adopt the
'git' approach and decide we're going to have one 'pg' command (or
whatever we call it). I think the way that 'git' does it is that all
of the real binaries are stored in a directory that users are not
expected to have in their path, and the 'git' wrapper just looks for
one based on the name of the subcommand.
I like this idea so much that I already proposed it in the past[1], so +1.
[1]: /messages/by-id/20160826202911.GA320593@alvherre.pgsql
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Apr 10, 2020 at 02:48:05PM -0700, Andres Freund wrote:
I don't agree with any of that. Combining the manifest validation with
checksum validation halves the IO. It allows offloading some of the
expense of verifying page-level checksums from the primary.
And all of the operations require iterating through data directories,
classifying files that are part / not part of a normal data directory, etc.
The last time we had the idea to use _verify_ in a tool name, the
tool was renamed a year later because we found new use cases for
it, aka pg_checksums. Couldn't the same be said of pg_validatebackup?
It seems to me that it could be interesting for some users to build a
manifest after a backup is taken, using something like a --build
option with pg_validatebackup.
--
Michael
On Sat, Apr 11, 2020 at 05:50:56PM -0400, Alvaro Herrera wrote:
On 2020-Apr-11, Robert Haas wrote:
I *would* like to find a way to address the proliferation of binaries,
because I've got other things I'd like to do that would require
creating still more of them, and until we come up with a scalable
solution that makes everybody happy, there's going to be progressively
more complaining every time. One possible solution is to adopt the
'git' approach and decide we're going to have one 'pg' command (or
whatever we call it). I think the way that 'git' does it is that all
of the real binaries are stored in a directory that users are not
expected to have in their path, and the 'git' wrapper just looks for
one based on the name of the subcommand.
I like this idea so much that I already proposed it in the past[1], so +1.
Yeah, their stuff is nice. Another nice thing is that git can also
scan for custom scripts as long as they respect the naming
convention, so a custom script called "git-foo" can be called as
"git foo".
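For illustration, discovery of such custom subcommands could look roughly like the Python sketch below. It assumes a 'pg-' prefix by analogy with "git-foo"; the function name discover_subcommands is hypothetical, and this is not a description of how git itself implements the scan.

```python
import os
from typing import Dict


def discover_subcommands(path_env: str, prefix: str = "pg-") -> Dict[str, str]:
    """Map subcommand names to executables named '<prefix><name>'.

    Directories are scanned in the order they appear in path_env;
    the first directory that provides a given name wins.
    """
    found: Dict[str, str] = {}
    for directory in path_env.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue
        for entry in sorted(os.listdir(directory)):
            if not entry.startswith(prefix) or len(entry) == len(prefix):
                continue
            full = os.path.join(directory, entry)
            if os.path.isfile(full) and os.access(full, os.X_OK):
                # Keep the earlier match if the name was already seen.
                found.setdefault(entry[len(prefix):], full)
    return found
```

Calling discover_subcommands(os.environ.get("PATH", "")) would list whatever 'pg-*' executables the user has on their PATH.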
--
Michael
On Sat, 11 Apr 2020 at 19:36, Michael Paquier <michael@paquier.xyz> wrote:
Yeah, their stuff is nice. Another nice thing is that git can also
scan for custom scripts as long as they respect the naming
convention, so a custom script called "git-foo" can be called as
"git foo".
… which could be installed by an extension perhaps. Wait, that doesn't
quite work because it's usually one bin directory per version, not per
database. Still maybe something can be done with that idea.
On Sat, Apr 11, 2020 at 5:51 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I like this idea so much that I already proposed it in the past[1], so +1.
Hey, look at that. I think I had some vague recollection of a prior
proposal, but I couldn't remember exactly who or exactly what had been
proposed. I do think that pg_ctl is too long a prefix, though. People
can get used to typing 'pg createdb' instead of 'createdb' but 'pg_ctl
createdb' seems like too much. At least, it would very very quickly
cause me to install aliases.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Sat, Apr 11, 2020 at 5:51 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I like this idea so much that I already proposed it in the past[1], so +1.
Hey, look at that. I think I had some vague recollection of a prior
proposal, but I couldn't remember exactly who or exactly what had been
proposed. I do think that pg_ctl is too long a prefix, though. People
can get used to typing 'pg createdb' instead of 'createdb' but 'pg_ctl
createdb' seems like too much. At least, it would very very quickly
cause me to install aliases.
Yeah, I'd be happier with "pg" than "pg_ctl" as well. But it's so
short that I wonder if some other software has already adopted it.
regards, tom lane