Duplicate history file?
So, this is the new new thread.
This thread should have been started here:
/messages/by-id/20210531.165825.921389284096975508.horikyota.ntt@gmail.com
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hi Horiguchi-san,
This thread should have been started here:
/messages/by-id/20210531.165825.921389284096975508.horikyota.ntt@gmail.com
(To recap: in a replication setup using an archive, startup tries to
restore WAL files from the archive before checking the pg_wal directory
for the desired file. That behavior is intentionally designed and
reasonable. However, the restore code notifies of a restored file
regardless of whether it has already been archived or not. If
archive_command is written so as to return an error on overwriting, as we
suggest in the documentation, that behavior causes archive failure.)

After playing with this, I see the problem just by restarting a
standby even in a simple archive-replication configuration, with no
special prerequisites. So I think this is worth fixing.

With this patch, KeepFileRestoredFromArchive compares the contents of
the just-restored file and the existing file for the same segment only
when:

- archive_mode = always
- the file to restore already exists in pg_wal
- it has a .done and/or .ready status file

which doesn't usually happen. Then the function skips archive
notification if the contents are identical. The included TAP test is
working both on Linux and Windows.

Thank you for the analysis and the patch.
I'll try the patch tomorrow.

I just noticed that this thread is still tied to another thread
(it's not an independent thread). To fix that, it may be better to
create a new thread again.
I've tried your patch. Unfortunately, it didn't seem to have any good
effect on the script I sent to reproduce the problem.
I understand that, as Stephen says, the test and cp commands have
problems and should not be used for archive commands. Maybe this is not
a big problem for the community.
Nevertheless, even if we do not improve the feature, I think it is a
good idea to state explicitly in the documentation, for the benefit of
new users, that archiving may fail under certain conditions.
I'd like to hear the opinions of experts on the archive command.
P.S.
My customer's problem has already been solved, so it's ok. I've
emailed -hackers with the aim of preventing users from encountering
the same problem.
Regards,
Tatsuro Yamada
On 2021/06/08 18:19, Tatsuro Yamada wrote:
I've tried your patch. Unfortunately, it didn't seem to have any good
effect on the script I sent to reproduce the problem.
Oops! The patch forgot about history files.
I checked the attached with your repro script and it works fine.
I understand that, as Stephen says, the test and cp commands have
problems and should not be used for archive commands. Maybe this is not
a big problem for the community.
Nevertheless, even if we do not improve the feature, I think it is a
good idea to state explicitly in the documentation, for the benefit of
new users, that archiving may fail under certain conditions.

I'd like to hear the opinions of experts on the archive command.
P.S.
My customer's problem has already been solved, so it's ok. I've
emailed -hackers with the aim of preventing users from encountering
the same problem.
I understand that.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
avoid_duplicate_archiving_PoC3.patch.txt (text/plain)
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index 26b023e754..1da3c93f97 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -50,6 +50,7 @@
* when we are not yet sure how far back we need the WAL.
*/
bool
+
RestoreArchivedFile(char *path, const char *xlogfname,
const char *recovername, off_t expectedSize,
bool cleanupEnabled)
@@ -382,6 +383,7 @@ KeepFileRestoredFromArchive(const char *path, const char *xlogfname)
{
char xlogfpath[MAXPGPATH];
bool reload = false;
+ bool already_archived = false;
struct stat statbuf;
snprintf(xlogfpath, MAXPGPATH, XLOGDIR "/%s", xlogfname);
@@ -389,6 +391,7 @@ KeepFileRestoredFromArchive(const char *path, const char *xlogfname)
if (stat(xlogfpath, &statbuf) == 0)
{
char oldpath[MAXPGPATH];
+ size_t flen = statbuf.st_size;
#ifdef WIN32
static unsigned int deletedcounter = 1;
@@ -416,6 +419,76 @@ KeepFileRestoredFromArchive(const char *path, const char *xlogfname)
/* same-size buffers, so this never truncates */
strlcpy(oldpath, xlogfpath, MAXPGPATH);
#endif
+ /*
+ * On a standby with archive_mode=always, there's a case where the same
+ * file is archived more than once. If the archive_command rejects
+ * overwriting, WAL-archiving won't go further than the file forever.
+ * Avoid duplicate archiving attempts when the file with the same
+ * content is known to have been already archived or notified.
+ */
+ if (XLogArchiveMode == ARCHIVE_MODE_ALWAYS &&
+ XLogArchiveIsReadyOrDone(xlogfname) &&
+ stat(path, &statbuf) == 0 && statbuf.st_size == flen)
+ {
+ int fd1;
+ int fd2 = -1;
+
+ fd1 = BasicOpenFile(path, O_RDONLY | PG_BINARY);
+ if (fd1 >= 0)
+ fd2 = BasicOpenFile(oldpath, O_RDONLY | PG_BINARY);
+
+ if (fd1 < 0 || fd2 < 0)
+ {
+ ereport(WARNING,
+ (errcode_for_file_access(),
+ errmsg("could not open file \"%s\", skip duplicate check: %m",
+ fd1 < 0 ? path : oldpath)));
+ if (fd1 >= 0)
+ close(fd1);
+ }
+ else
+ {
+ unsigned char srcbuf[XLOG_BLCKSZ];
+ unsigned char dstbuf[XLOG_BLCKSZ];
+ size_t rlen;
+ uint32 i;
+ int r;
+
+ /*
+ * Compare the two files' contents. We don't bother
+ * completing if something's wrong meanwhile.
+ */
+ rlen = 0;
+ r = XLOG_BLCKSZ;
+ for (i = 0 ; r == XLOG_BLCKSZ ; i++)
+ {
+
+ if ((r = read(fd1, srcbuf, XLOG_BLCKSZ)) == 0)
+ break;
+
+ if (read(fd2, dstbuf, XLOG_BLCKSZ) != r)
+ break;
+
+ if (memcmp(srcbuf, dstbuf, r) != 0)
+ break;
+
+ rlen += r;
+ }
+
+ close(fd1);
+ close(fd2);
+
+ if (rlen == flen)
+ {
+ already_archived = true;
+
+ ereport(LOG,
+ (errmsg("log file \"%s\" have been already archived, skip archiving",
+ xlogfname)));
+ }
+ }
+ }
+
if (unlink(oldpath) != 0)
ereport(FATAL,
(errcode_for_file_access(),
@@ -432,7 +505,7 @@ KeepFileRestoredFromArchive(const char *path, const char *xlogfname)
*/
if (XLogArchiveMode != ARCHIVE_MODE_ALWAYS)
XLogArchiveForceDone(xlogfname);
- else
+ else if (!already_archived)
XLogArchiveNotify(xlogfname);
/*
Hi Horiguchi-san,
On 2021/06/09 11:47, Kyotaro Horiguchi wrote:
On 2021/06/08 18:19, Tatsuro Yamada wrote:
I've tried your patch. Unfortunately, it didn't seem to have any good
effect on the script I sent to reproduce the problem.

Oops! The patch forgot about history files.
I checked the attached with your repro script and it works fine.
Thank you for fixing the patch.
The new patch works well in my environment. :-D
Regards,
Tatsuro Yamada
Hi
Thank you for fixing the patch.
The new patch works well in my environment. :-D
This may not be important at this time since it is a
PoC patch, but I would like to inform you that there
was a line that contained multiple spaces instead of tabs.
$ git diff --check
src/backend/access/transam/xlogarchive.c:465: trailing whitespace.
+
Regards,
Tatsuro Yamada
On 2021/06/09 15:58, Tatsuro Yamada wrote:
Hi
Thank you for fixing the patch.
The new patch works well in my environment. :-D

This may not be important at this time since it is a
PoC patch, but I would like to inform you that there
was a line that contained multiple spaces instead of tabs.

$ git diff --check
src/backend/access/transam/xlogarchive.c:465: trailing whitespace.
+
Even with the patch, if "test ! -f ..." is used in archive_command,
you may still *easily* run into the problem that WAL archiving keeps failing?
Instead, we should consider and document a "better" command for
archive_command, or implement something like a pg_archivecopy command
in core (as far as I remember, there was a discussion about
this feature before...)?
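For reference, the example being discussed is the one from the
documentation, which at the time read roughly like this (the paths are
the documentation's placeholders):

    archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'  # Unix

When an already-archived segment is offered for archiving again, "test
! -f" fails, archive_command returns nonzero, and WAL archiving stalls
on that file.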
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Hi,
On 2021/06/09 16:23, Fujii Masao wrote:
On 2021/06/09 15:58, Tatsuro Yamada wrote:
This may not be important at this time since it is a
PoC patch, but I would like to inform you that there
was a line that contained multiple spaces instead of tabs.

$ git diff --check
src/backend/access/transam/xlogarchive.c:465: trailing whitespace.
+

Even with the patch, if "test ! -f ..." is used in archive_command,
you may still *easily* run into the problem that WAL archiving keeps failing?
Thanks for your comment.
Yes, it may solve the error when using the test command, but it is
dangerous to continue using the cp command, which is listed as an
example of an archive command.
Instead, we should consider and document a "better" command for
archive_command, or implement something like a pg_archivecopy command
in core (as far as I remember, there was a discussion about
this feature before...)?
I agree with that idea.
Since archiving is important for all users, I think there should be
either a better and safer command in the documentation, or an archive
command (pg_archivecopy?) that we provide as a community, as you said.
I am curious about the conclusions of past discussions. :)
Regards,
Tatsuro Yamada
At Wed, 09 Jun 2021 16:56:14 +0900, Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp> wrote in
Hi,
On 2021/06/09 16:23, Fujii Masao wrote:
On 2021/06/09 15:58, Tatsuro Yamada wrote:
This may not be important at this time since it is a
PoC patch, but I would like to inform you that there
was a line that contained multiple spaces instead of tabs.

$ git diff --check
src/backend/access/transam/xlogarchive.c:465: trailing whitespace.
+

Even with the patch, if "test ! -f ..." is used in archive_command,
you may still *easily* run into the problem that WAL archiving keeps
failing?
I'm not sure, but in regard to the cause that the patch addresses, that
could happen if an already-archived file is recycled or deleted and then
the same file is restored from the archive. But the WAL segment that
contains the latest checkpoint won't be deleted. The same can be said
of history files.
Thanks for your comment.
Yes, it may solve the error when using the test command, but it is
dangerous to continue using the cp command, which is listed as an
example of an archive command.
"test" command?
At first I thought that the archive command would need to compare the
whole file content *always*, but such a comparison would happen only
with the same frequency as the whole-file comparison the patch performs.
Instead, we should consider and document a "better" command for
archive_command, or implement something like a pg_archivecopy command
in core (as far as I remember, there was a discussion about
this feature before...)?

I agree with that idea.
Since archiving is important for all users, I think there should be
either a better and safer command in the documentation, or an archive
command (pg_archivecopy?) that we provide as a community, as you said.
I am curious about the conclusions of past discussions. :)
How perfect does the officially-provided script or command need to be? The
reason the script in the documentation is so simple is, I guess, that
we don't/can't offer steps sufficiently solid in all directions.
We hadn't noticed that the "test ! -f" does harm, so it has stayed
there, but finally we need to remove it. Instead, we need to write
down the known significant requirements in words. I'm afraid that a
concrete script would be a bit complex for the documentation..
So what we can do is:
- Remove the "test ! -f" from the sample command (for *nixen).
- Rewrite at least the following portion in the documentation. [1]
The archive command should generally be designed to refuse to
overwrite any pre-existing archive file. This is an important
safety feature to preserve the integrity of your archive in case
of administrator error (such as sending the output of two
different servers to the same archive directory).

It is advisable to test your proposed archive command to ensure
that it indeed does not overwrite an existing file, and that it
returns nonzero status in this case. The example command above
for Unix ensures this by including a separate test step. On some
Unix platforms, cp has switches such as -i that can be used to do
the same thing less verbosely, but you should not rely on these
without verifying that the right exit status is returned. (In
particular, GNU cp will return status zero when -i is used and
the target file already exists, which is not the desired
behavior.)
The replacement would be something like:
"There is a case where WAL file and timeline history files is archived
more than once. The archive command should generally be designed to
refuse to replace any pre-existing archive file with a file with
different content but to return zero if the file to be archived is
identical with the preexisting file."
But I'm not sure how that should look.. (even ignoring the rough
phrasing..)
1: https://www.postgresql.org/docs/11/continuous-archiving.html
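Just to illustrate that wording, a rough sketch (not a vetted
recommendation; the script name and archive path here are made up):

    #!/bin/sh
    # archive_wal.sh -- "refuse different content, accept identical"
    # Assumed usage: archive_command = 'archive_wal.sh "%p" "%f"'
    src="$1"                             # %p: path of the file to archive
    dest="/mnt/server/archivedir/$2"     # %f: its file name
    if [ -f "$dest" ]; then
        # Succeed only if the existing copy is byte-identical;
        # refuse to overwrite different content.
        cmp -s "$src" "$dest"
        exit $?
    fi
    cp "$src" "$dest"

(This ignores durability and atomicity concerns; it only illustrates
the overwrite rule.)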
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Greetings,
* Kyotaro Horiguchi (horikyota.ntt@gmail.com) wrote:
At Wed, 09 Jun 2021 16:56:14 +0900, Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp> wrote in
On 2021/06/09 16:23, Fujii Masao wrote:
Instead, we should consider and document a "better" command for
archive_command, or implement something like a pg_archivecopy command
in core (as far as I remember, there was a discussion about
this feature before...)?

I agree with that idea.
Since archiving is important for all users, I think there should be
either a better and safer command in the documentation, or an archive
command (pg_archivecopy?) that we provide as a community, as you said.
I am curious about the conclusions of past discussions. :)

How perfect does the officially-provided script or command need to be? The
reason the script in the documentation is so simple is, I guess, that
we don't/can't offer steps sufficiently solid in all directions.

We hadn't noticed that the "test ! -f" does harm, so it has stayed
there, but finally we need to remove it. Instead, we need to write
down the known significant requirements in words. I'm afraid that a
concrete script would be a bit complex for the documentation..
We don't have any 'officially-provided' tool for archive command.
So what we can do is:
- Remove the "test ! -f" from the sample command (for *nixen).
... or just remove the example entirely. It really doesn't do anything
good for us, in my view.
- Rewrite at least the following portion in the documentation. [1]
The archive command should generally be designed to refuse to
overwrite any pre-existing archive file. This is an important
safety feature to preserve the integrity of your archive in case
of administrator error (such as sending the output of two
different servers to the same archive directory).

It is advisable to test your proposed archive command to ensure
that it indeed does not overwrite an existing file, and that it
returns nonzero status in this case. The example command above
for Unix ensures this by including a separate test step. On some
Unix platforms, cp has switches such as -i that can be used to do
the same thing less verbosely, but you should not rely on these
without verifying that the right exit status is returned. (In
particular, GNU cp will return status zero when -i is used and
the target file already exists, which is not the desired
behavior.)

The replacement would be something like:

"There are cases where WAL files and timeline history files are archived
more than once. The archive command should generally be designed to
refuse to replace any pre-existing archive file with a file with
different content, but to return zero if the file to be archived is
identical to the preexisting file."

But I'm not sure how that should look.. (even ignoring the rough
phrasing..)
There is so much more that we should be including here, like "you should
make sure your archive command will reliably sync the WAL file to disk
before returning success to PG, since PG will feel free to immediately
remove the WAL file once archive command has returned successfully", and
"the archive command should check that there exists a .history file for
any timeline after timeline 1 in the repo for the WAL file that's being
archived" and "the archive command should allow the exist, binary
identical, WAL file to be archived multiple times without error, but
should error if a new WAL file is archived which would overwrite a
binary distinct WAL file in the repo", and "the archive command should
check the WAL header to make sure that the WAL file matches the cluster
in the corresponding backup repo", and "whatever is expiring the WAL
files after they've been archived should make sure to not expire out any
WAL that is needed for any of the backups that remain", and "oh, by the
way, depending on the exit code of the command, PG may consider the
failure to be something which can be retried, or not", and other things
that I can't think of off the top of my head right now.
I have to say that it gets to a point where it feels like we're trying
to document everything about writing a C extension to PG using the
hooks which we make available. We've generally agreed that folks should
be looking at the source code if they're writing a serious C extension
and it's certainly the case that, in writing a serious archive command
and backup tool, getting into the PG source code has been routinely
necessary.
Thanks,
Stephen
At Thu, 10 Jun 2021 10:00:21 -0400, Stephen Frost <sfrost@snowman.net> wrote in
Greetings,
* Kyotaro Horiguchi (horikyota.ntt@gmail.com) wrote:
At Wed, 09 Jun 2021 16:56:14 +0900, Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp> wrote in
On 2021/06/09 16:23, Fujii Masao wrote:
Instead, we should consider and document a "better" command for
archive_command, or implement something like a pg_archivecopy command
in core (as far as I remember, there was a discussion about
this feature before...)?

I agree with that idea.
Since archiving is important for all users, I think there should be
either a better and safer command in the documentation, or an archive
command (pg_archivecopy?) that we provide as a community, as you said.
I am curious about the conclusions of past discussions. :)

How perfect does the officially-provided script or command need to be? The
reason the script in the documentation is so simple is, I guess, that
we don't/can't offer steps sufficiently solid in all directions.

We hadn't noticed that the "test ! -f" does harm, so it has stayed
there, but finally we need to remove it. Instead, we need to write
down the known significant requirements in words. I'm afraid that a
concrete script would be a bit complex for the documentation..

We don't have any 'officially-provided' tool for archive command.
I meant the "test ! -f .." by the "officially-provided script". The
fact we show it in the documentation (without a caveart) means that
the script at least doesn't break the server behavior that is running
normally including promotion.
So what we can do is:
- Remove the "test ! -f" from the sample command (for *nixen).
... or just remove the example entirely. It really doesn't do anything
good for us, in my view.
Yeah, I feel the same. But that also means the usage instructions for the
replacements disappear from our documentation. The least problematic
example in the above regards is just "cp .." without the "test" as the
instruction.
The replacement would be something like:
"There is a case where WAL file and timeline history files is archived
more than once. The archive command should generally be designed to
refuse to replace any pre-existing archive file with a file with
different content but to return zero if the file to be archived is
identical with the preexisting file."But I'm not sure how it looks like.. (even ignoring the broken
phrasing..)There is so much more that we should be including here, like "you should
Mmm. Yeah, I think I can understand your sentiment completely.
make sure your archive command will reliably sync the WAL file to disk
before returning success to PG, since PG will feel free to immediately
remove the WAL file once archive command has returned successfully", and
"the archive command should check that there exists a .history file for
any timeline after timeline 1 in the repo for the WAL file that's being
archived" and "the archive command should allow the exist, binary
identical, WAL file to be archived multiple times without error, but
should error if a new WAL file is archived which would overwrite a
binary distinct WAL file in the repo", and "the archive command should
check the WAL header to make sure that the WAL file matches the cluster
in the corresponding backup repo", and "whatever is expiring the WAL
files after they've been archived should make sure to not expire out any
WAL that is needed for any of the backups that remain", and "oh, by the
way, depending on the exit code of the command, PG may consider the
failure to be something which can be retried, or not", and other things
that I can't think of off the top of my head right now.
I have to say that it gets to a point where it feels like we're trying
to document everything about writing a C extension to PG using the
hooks which we make available. We've generally agreed that folks should
be looking at the source code if they're writing a serious C extension
and it's certainly the case that, in writing a serious archive command
and backup tool, getting into the PG source code has been routinely
necessary.
Nevertheless, though I agree with it, don't we still need a minimum workable
setup as the first step? Something like below.
===
The following is an example of the minimal archive_command.
Example: cp %p /blah/%f
However, it is far from perfect. The following is the discussion about
what is needed for archive_command to be more reliable.
<the long list of the requirements>
===
Anyway it doesn't seem to be the time to do that, but now that we
know that there's a case where the current example doesn't prevent PG
from working correctly, we cannot use the "test ! -f" example and
cannot suggest "do not overwrite existing archived files" without a
caveat. At least don't we need to *fix* those parts for now?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Recently my brain is always twisted..
At Fri, 11 Jun 2021 11:25:51 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
Anyway it doesn't seem to be the time to do that, but now that we
- know that there's a case where the current example doesn't prevent PG
+ know that there's a case where the current example prevents PG
from working correctly, we cannot use the "test ! -f" example and
cannot suggest "do not overwrite existing archived files" without a
caveat. At least don't we need to *fix* those parts for now?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Fri, Jun 11, 2021 at 11:25:51AM +0900, Kyotaro Horiguchi wrote:
Nevertheless, though I agree with it, don't we still need a minimum workable
setup as the first step? Something like below.

===
The following is an example of the minimal archive_command.

Example: cp %p /blah/%f

However, it is far from perfect. The following is the discussion about
what is needed for archive_command to be more reliable.

<the long list of the requirements>
===
"far from perfect" is a strong understatement for "appears to work but will
randomly and silently breaks everything without a simple way to detect it".
We should document a minimum workable setup, but cp isn't an example of that,
and I don't think that there will be a simple example unless we provide a
dedicated utility.
It could however be something along these lines:
Example: archive_program %p /path/to/%d
archive_program being a script ensuring that all those requirements are met:
<the long list of the requirements>
At Fri, 11 Jun 2021 10:48:32 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
On Fri, Jun 11, 2021 at 11:25:51AM +0900, Kyotaro Horiguchi wrote:
Nevertheless, though I agree with it, don't we still need a minimum workable
setup as the first step? Something like below.

===
The following is an example of the minimal archive_command.

Example: cp %p /blah/%f

However, it is far from perfect. The following is the discussion about
what is needed for archive_command to be more reliable.

<the long list of the requirements>
===

"far from perfect" is a strong understatement for "appears to work but will
randomly and silently break everything without a simple way to detect it".
I think that's overstating it. It sounds like a story from a mission-critical
world. How perfect archive_command should be depends on the
requirements of each system. Simple cp is actually sufficient in
a certain (low?) range of usages, maybe.
We should document a minimum workable setup, but cp isn't an example of that,
and I don't think that there will be a simple example unless we provide a
dedicated utility.
It sounds somewhat strange, like "Well, you need a special track to
drive your car; however, we don't have one. It's your responsibility
to construct a track that protects it from accidents perfectly.".

"Yeah, I'm not going to push it so hard and don't care if it gets some
small scratches; couldn't I drive it on a public road?"

(Sorry for the bad analogy).
I think cp can be an example as long as we explain the limitations. (On
the other hand, "test ! -f" cannot, since it actually prevents the server
from working correctly.)
It could however be something along these lines:
Example: archive_program %p /path/to/%d
archive_program being a script ensuring that all those requirements are met:
<the long list of the requirements>
Isn't it almost saying that anything less than pgBackRest isn't
qualified as an archive_program?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Fri, Jun 11, 2021 at 03:32:28PM +0900, Kyotaro Horiguchi wrote:
At Fri, 11 Jun 2021 10:48:32 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
"far from perfect" is a strong understatement for "appears to work but will
randomly and silently breaks everything without a simple way to detect it".
Yeah. Users like unplugging their hosts, because that's *fast* and
easy to do.
I think that's overstating it. It sounds like a story from a mission-critical
world. How perfect archive_command should be depends on the
requirements of each system. Simple cp is actually sufficient in
a certain (low?) range of usages, maybe.

We should document a minimum workable setup, but cp isn't an example of that,
and I don't think that there will be a simple example unless we provide a
dedicated utility.

I think cp can be an example as long as we explain the limitations. (On
the other hand, "test ! -f" cannot, since it actually prevents the server
from working correctly.)
Disagreed. I think that we should not try to change this area until
we can document a reliable solution, and a simple "cp" is not that.
Hmm. A simple command that could be used as a reference is, for example,
"dd", which flushes the file by itself, or we could just revisit the
discussions about having a pg_copy command, or we could document a
small utility in perl that does the job.
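As a sketch only (conv=fsync is a GNU dd extension; the destination
path is a placeholder), that could look like:

    # Copy the segment and physically write it to disk before exiting;
    # dd exits nonzero on failure, as archive_command requires.
    archive_command = 'dd if=%p of=/mnt/server/archivedir/%f bs=1M conv=fsync'

(By itself this neither refuses overwrites nor makes the file appear
atomically.)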
--
Michael
On Fri, Jun 11, 2021 at 03:32:28PM +0900, Kyotaro Horiguchi wrote:
At Fri, 11 Jun 2021 10:48:32 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
"far from perfect" is a strong understatement for "appears to work but will
randomly and silently breaks everything without a simple way to detect it".I think it's overstating. It sounds like a story of a mission critical
world. How perfect archive_command should be depends on the
requirements of every system. Simple cp is actualy sufficient in
certain log? range of usages, maybe.
I disagree, cp is probably the worst command that can be used for this purpose.
On top of the previous problems already mentioned, you also have the fact that
the copy isn't atomic. It means that any concurrent restore_command (or
anything that would consume the archived files) will happily process a
half-copied WAL file, and in case of any error during the copy you end up with
a file for which you don't know whether it contains valid data or not. I don't
see any case where you would actually want to use that, unless maybe you want
to benchmark how long it takes before you lose some data.
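To illustrate the atomicity point, a sketch only (assuming rename() is
atomic within a single filesystem; the script name and paths are made up):

    #!/bin/sh
    # archive_atomic.sh -- copy under a temporary name, flush, then
    # rename into place, so a concurrent restore_command never sees a
    # partially written file.
    set -e
    src="$1"                          # %p
    dir="/mnt/server/archivedir"      # placeholder archive location
    tmp="$dir/$2.tmp"                 # temporary name on the same filesystem
    dd if="$src" of="$tmp" bs=1M conv=fsync   # GNU dd: flush before exiting
    mv "$tmp" "$dir/$2"               # atomic rename into place

(A leftover *.tmp from a failed attempt would need cleanup, and the
overwrite question is still open; this only addresses the
half-copied-file problem.)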
We should document a minimum workable setup, but cp isn't an example of that,
and I don't think that there will be a simple example unless we provide a
dedicated utility.

It sounds somewhat strange, like "Well, you need a special track to
drive your car; however, we don't have one. It's your responsibility
to construct a track that protects it from accidents perfectly.".

"Yeah, I'm not going to push it so hard and don't care if it gets some
small scratches; couldn't I drive it on a public road?"

(Sorry for the bad analogy).
I think that a better analogy would be "I don't need working brakes on my car
since I only drive on the highway and there aren't any red lights there".
Isn't it almost saying that anything less than pgBackRest isn't
qualified as an archive_program?
I don't know; I'm assuming that barman also provides one, as do wal-e and
wal-g (assuming that the remote providers do their part of the job
correctly). Maybe there are other tools too. But as long as we don't document
what exactly the requirements are, it's not really a surprise that most people
don't implement them.
At Fri, 11 Jun 2021 15:18:03 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
On Fri, Jun 11, 2021 at 03:32:28PM +0900, Kyotaro Horiguchi wrote:
I disagree, cp is probably the worst command that can be used for this purpose.
On top of the previous problems already mentioned, you also have the fact that
the copy isn't atomic. It means that any concurrent restore_command (or
anything that would consume the archived files) will happily process a
half-copied WAL file, and in case of any error during the copy you end up with
a file for which you don't know whether it contains valid data or not. I don't
see any case where you would actually want to use that, unless maybe you want
to benchmark how long it takes before you lose some data.
Actually, there's a lot of room for losing data with cp. Yes, we would
need additional storage redundancy, periodic integrity
inspection of the storage and archives, and maybe copies at
other sites on the other side of the Earth. But those are too much for
some kinds of users. They have the right and the responsibility to decide
how durable/reliable their archive needs to be. (Putting aside some
hardware/geological requirements :p) If we mandate some
characteristics of the archive_command, we should take them into core.
I remember seeing some discussions on archive_command along this line, but
I think they ended at a point something like "we cannot
design a one-size-fits-all interface conforming to the requirements" or
something (sorry, I don't remember the details..)
I don't know; I'm assuming that barman also provides one, as do wal-e and
wal-g (assuming that the remote providers do their part of the job correctly).
Well, barman used rsync/ssh in its documentation in the past, and now
it looks like it provides barman-wal-archive, so it seems that you're right
on that point. So, do we recommend them in our documentation? (I'm
not sure they actually conform to the requirements, though..)
Maybe there are other tools too. But as long as we don't document what
exactly the requirements are, it's not really a surprise that most people
don't implement them.
I strongly agree with describing the requirements.
My point is that if all of them are really mandatory, it is mandatory
for us to officially provide, or at least recommend, the minimal
implementation(s) that covers all of them. If we recommended some external
tools, that would mean that we ensure that the tools satisfy the
requirements.
If we write an example with a pseudo tool name, require some
characteristics of the tool, and then say nothing about minimal tools,
I think that is equivalent to inhibiting certain users
from using archive_command even if they really don't need such a level
of durability.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Fri, 11 Jun 2021 16:08:33 +0900, Michael Paquier <michael@paquier.xyz> wrote in
On Fri, Jun 11, 2021 at 03:32:28PM +0900, Kyotaro Horiguchi wrote:
I think cp can be an example as long as we explain the limitations. (On
the other hand, "test ! -f" cannot, since it actually prevents the server
from working correctly.)

Disagreed. I think that we should not try to change this area until
we can document a reliable solution, and a simple "cp" is not that.
Isn't removing cp from the documentation a change in this area? I
basically agree not to change anything, but the current example
"test ! -f <fn> && cp .." and the relevant description have been known to
be problematic in a certain situation.
- Do we leave it alone, ignoring the possible problem?
- Just add a description about "the problem"?
- Just remove "test ! -f" and the relevant description?
- Remove "test ! -f" and rewrite the relevant description?
(- or not remove "test ! -f" and rewrite the relevant description?)
- Write the full (known) requirements and use a pseudo tool-name in
the example?
- provide a minimal implementation of the command?
- recommend some external tools (that we can guarantee that they
conform to the requirements)?
- not recommend any tools?
Hmm. A simple command that could be used as a reference is, for example,
"dd", which flushes the file by itself, or we could just revisit the
discussions about having a pg_copy command, or we could document a
small utility in perl that does the job.
I think we should do that if pg_copy conforms to the mandatory
requirements, but maybe that's for the future. Showing a minimal
implementation in perl looks good.
However, my main point here is "what should we do for now?". Not
about an ideal solution.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Tue, Jun 15, 2021 at 10:20:37AM +0900, Kyotaro Horiguchi wrote:
Actually, there's a lot of room for losing data with cp. Yes, we would
need additional storage redundancy, periodic integrity
inspection of the storage and archives, and maybe copies at
other sites on the other side of the Earth. But those are too much for
some kinds of users. They have the right and the responsibility to decide
how durable/reliable their archive needs to be. (Putting aside some
hardware/geological requirements :p)
Note that most of those considerations are orthogonal to what a proper
archive_command should be responsible for.
Yes, users are responsible for deciding whether they want valid and durable
backups or not, but we should assume a sensible default behavior, which is a
valid and durable archive_command. We don't document a default of fsync = off
with a later recommendation explaining why you shouldn't do that, and I think
it should be the same for the archive_command. The problem with the current
documentation
is that many users will just blindly copy/paste whatever is in the
documentation without reading any further.
As an example, a few hours ago some French user on the French bulletin board
said that he fixed his "postmaster.pid already exists" error with a
pg_resetxlog -f, referring to some article explaining how to start postgres in
case of "PANIC: could not locate a valid checkpoint record". Arguably
that article didn't bother to document what the implications of executing
pg_resetxlog are, but given that the user's original problem had literally
nothing to do with what was documented, I really doubt that it would have
changed anything.
If we mandate some
characteristics of the archive_command, we should take them into core.
I agree.
I remember seeing some discussions on archive_command along this line, but
I think they ended at a point something like "we cannot
design a one-size-fits-all interface conforming to the requirements" or
something (sorry, I don't remember the details..)
I also agree, but this problem is solved by making archive_command
customisable. Providing something that can reliably work in some general and
limited cases would be a huge improvement.
Well, barman used rsync/ssh in its documentation in the past, and now
it looks like it provides barman-wal-archive, so it seems that you're right
on that point. So, do we recommend them in our documentation? (I'm
not sure they actually conform to the requirements, though..)
We could maybe bless some third party backup solutions, but this will probably
lead to a lot more discussions, so it's better to discuss that in a different
thread. Note that I don't have a deep knowledge of any of those tools so I
don't have an opinion.
If we write an example with a pseudo tool name, require some
characteristics of the tool, and then say nothing about minimal tools,
I think that is equivalent to inhibiting certain users
from using archive_command even if they really don't need such a level
of durability.
I have already seen customers complaining about losing backups because their
archive_command didn't ensure that the copy was durable. Some users may not
care about losing their backups in such cases, but I really think that the
majority of users expect a backup to be valid, durable and everything without
even thinking that it may not be the case. It should be the default behavior,
not optional.
Greetings,
* Kyotaro Horiguchi (horikyota.ntt@gmail.com) wrote:
At Fri, 11 Jun 2021 16:08:33 +0900, Michael Paquier <michael@paquier.xyz> wrote in
On Fri, Jun 11, 2021 at 03:32:28PM +0900, Kyotaro Horiguchi wrote:
I think cp can be an example as long as we explain the limitations. (On
the other hand, "test ! -f" cannot, since it actually prevents the server
from working correctly.)

Disagreed. I think that we should not try to change this area until
we can document a reliable solution, and a simple "cp" is not that.

Isn't removing cp from the documentation a change in this area? I
basically agree not to change anything, but the current example
"test ! -f <fn> && cp .." and the relevant description have been known to
be problematic in a certain situation.
[...]
- Write the full (known) requirements and use a pseudo tool-name in
the example?
I'm generally in favor of just using a pseudo tool-name and then perhaps
providing a link to a new place on .Org where people can ask to have
their PG backup solution listed, or something along those lines.
- provide a minimal implementation of the command?
Having been down this road for a rather long time, I can't accept this
as a serious suggestion. No, not even with Perl. Been there, done
that, not going back.
- recommend some external tools (that we can guarantee that they
conform to the requirements)?
The requirements are things which are learned over years and change
over time. Trying to document them and keep up with them would be a
pretty serious project all on its own. There are external projects who
spend serious time and energy doing their best to provide the tooling
needed here and we should be promoting those, not trying to pretend like
this is a simple thing which anyone could write a short perl script to
accomplish.
- not recommend any tools?
This is the approach that has been tried and it's, objectively, failed
miserably. Our users are ending up with invalid and unusable backups,
corrupted WAL segments, inability to use PITR, and various other issues
because we've been trying to pretend that this isn't a hard problem. We
really need to stop that and accept that it's hard and promote the tools
which have been explicitly written to address that hard problem.
Hmm. A simple command that could be used as a reference is, for example,
"dd", which flushes the file by itself, or we could just revisit the
discussions about having a pg_copy command, or we could document a
small utility in perl that does the job.

I think we should do that if pg_copy conforms to the mandatory
requirements, but maybe that's for the future. Showing a minimal
implementation in perl looks good.
Already tried doing it in perl. No, it's not simple and it's also
entirely vaporware today and implies that we're going to develop this
tool, improve it in the future as we realize it needs to be improved,
and maintain it as part of core forever. If we want to actually adopt
and pull in a backup tool to be part of core then we should talk about
things which actually exist, such as the various existing projects that
have been written to specifically work to address all the requirements
which are understood today, not say "well, we can just write a simple
perl script to do it" because it's not actually that simple.
Providing yet another half solution would be doubling-down on the failed
approach to document a "simple" solution and would be a disservice to
our users.
Thanks,
Stephen
On Tue, Jun 15, 2021 at 11:33:10AM -0400, Stephen Frost wrote:
The requirements are things which are learned over years and change
over time. Trying to document them and keep up with them would be a
pretty serious project all on its own. There are external projects who
spend serious time and energy doing their best to provide the tooling
needed here and we should be promoting those, not trying to pretend like
this is a simple thing which anyone could write a short perl script to
accomplish.
The fact that this is such a complex problem is the very reason why we should
spend a lot of energy documenting the various requirements. Otherwise, how
could anyone implement a valid program for that and how could anyone validate
that a solution claiming to do its job actually does its job?
Already tried doing it in perl. No, it's not simple and it's also
entirely vaporware today and implies that we're going to develop this
tool, improve it in the future as we realize it needs to be improved,
and maintain it as part of core forever. If we want to actually adopt
and pull in a backup tool to be part of core then we should talk about
things which actually exist, such as the various existing projects that
have been written to specifically work to address all the requirements
which are understood today, not say "well, we can just write a simple
perl script to do it" because it's not actually that simple.
Adopting a full backup solution seems a bit extreme. On the other hand,
having some real core implementation of an archive_command for the most general
use cases (local copy, distant copy over ssh...) could make sense. This would
remove that burden for some, probably most, of the 3rd party backup tools, and
would also ensure that the various requirements are properly documented since
it would be the implementation reference.
Greetings,
* Julien Rouhaud (rjuju123@gmail.com) wrote:
On Tue, Jun 15, 2021 at 11:33:10AM -0400, Stephen Frost wrote:
The requirements are things which are learned over years and change
over time. Trying to document them and keep up with them would be a
pretty serious project all on its own. There are external projects who
spend serious time and energy doing their best to provide the tooling
needed here and we should be promoting those, not trying to pretend like
this is a simple thing which anyone could write a short perl script to
accomplish.

The fact that this is such a complex problem is the very reason why we should
spend a lot of energy documenting the various requirements. Otherwise, how
could anyone implement a valid program for that and how could anyone validate
that a solution claiming to do its job actually does its job?
Reading the code.
Already tried doing it in perl. No, it's not simple and it's also
entirely vaporware today and implies that we're going to develop this
tool, improve it in the future as we realize it needs to be improved,
and maintain it as part of core forever. If we want to actually adopt
and pull in a backup tool to be part of core then we should talk about
things which actually exist, such as the various existing projects that
have been written to specifically work to address all the requirements
which are understood today, not say "well, we can just write a simple
perl script to do it" because it's not actually that simple.Adopting a full backup solution seems like a bit extreme. On the other hand,
having some real core implementation of an archive_command for the most general
use cases (local copy, distant copy over ssh...) could make sense. This would
remove that burden for some, probably most, of the 3rd party backup tools, and
would also ensure that the various requirements are properly documented since
it would be the implementation reference.
Having a database platform that hasn't got a full backup solution is a
pretty awkward position to be in.
I'd like to see something a bit more specific than handwaving about how
core could provide something in this area which would remove the burden
from other tools. Would also be good to know who is going to write that
and maintain it.
Thanks,
Stephen
On Tue, Jun 15, 2021 at 02:28:04PM -0400, Stephen Frost wrote:
* Julien Rouhaud (rjuju123@gmail.com) wrote:
On Tue, Jun 15, 2021 at 11:33:10AM -0400, Stephen Frost wrote:
The fact that this is such a complex problem is the very reason why we should
spend a lot of energy documenting the various requirements. Otherwise, how
could anyone implement a valid program for that and how could anyone validate
that a solution claiming to do its job actually does its job?

Reading the code.
Oh, if it's as simple as that then surely documenting the various requirements
won't be an issue.
Greetings,
On Tue, Jun 15, 2021 at 21:11 Julien Rouhaud <rjuju123@gmail.com> wrote:
On Tue, Jun 15, 2021 at 02:28:04PM -0400, Stephen Frost wrote:
* Julien Rouhaud (rjuju123@gmail.com) wrote:
On Tue, Jun 15, 2021 at 11:33:10AM -0400, Stephen Frost wrote:
The fact that this is such a complex problem is the very reason why we
should spend a lot of energy documenting the various requirements.
Otherwise, how could anyone implement a valid program for that and how
could anyone validate that a solution claiming to do its job actually
does its job?

Reading the code.

Oh, if it's as simple as that then surely documenting the various
requirements won't be an issue.
As I suggested previously- this is similar to the hooks that we provide. We
don’t extensively document them because if you’re writing an extension
which uses a hook, you’re going to be (or should be..) reading the code too.
Consider that, really, an archive command should refuse to allow archiving
of WAL on a timeline which doesn’t have a corresponding history file in the
archive for that timeline (excluding timeline 1). Also, a backup tool
should compare the result of pg_start_backup to what’s in the control file,
using a fresh read, after start backup returns to make sure that the
storage is sane and not, say, caching pages independently (such as might
happen with a separate NFS mount..). Oh, and if a replica is involved, a
check should be done to see if the replica has changed timelines and an
appropriate message thrown if that happens complaining that the backup was
aborted due to the promotion of the replica…
To be clear- these aren’t checks that pgbackrest has today and I’m not
trying to make it out as if pgbackrest is the only solution and the only
tool that “does everything and is correct” because we aren’t there yet and
I’m not sure we ever will be “all correct” or “done”.
These, however, are ones we have planned to add because of things we’ve
seen and thought of, most of them in just the past few months.
Thanks,
Stephen
Thanks for the opinions.
At Tue, 15 Jun 2021 11:33:10 -0400, Stephen Frost <sfrost@snowman.net> wrote in
Greetings,
* Kyotaro Horiguchi (horikyota.ntt@gmail.com) wrote:
At Fri, 11 Jun 2021 16:08:33 +0900, Michael Paquier <michael@paquier.xyz> wrote in
On Fri, Jun 11, 2021 at 03:32:28PM +0900, Kyotaro Horiguchi wrote:
I think cp can be an example as long as we explain the limitations. (On
the other hand, "test ! -f" cannot, since it actually prevents the server
from working correctly.)

Disagreed. I think that we should not try to change this area until
we can document a reliable solution, and a simple "cp" is not that.

Isn't removing cp from the documentation a change in this area? I
basically agree not to change anything, but the current example
"test ! -f <fn> && cp .." and the relevant description have been known to
be problematic in a certain situation.

[...]
- Write the full (known) requirements and use a pseudo tool-name in
the example?

I'm generally in favor of just using a pseudo tool-name and then perhaps
providing a link to a new place on .Org where people can ask to have
their PG backup solution listed, or something along those lines.
Looks fine.
- provide a minimal implementation of the command?

Having been down this road for a rather long time, I can't accept this
as a serious suggestion. No, not even with Perl. Been there, done
that, not going back.

- recommend some external tools (that we can guarantee that they
conform to the requirements)?

The requirements are things which are learned over years and change
over time. Trying to document them and keep up with them would be a
pretty serious project all on its own. There are external projects who
spend serious time and energy doing their best to provide the tooling
needed here and we should be promoting those, not trying to pretend like
this is a simple thing which anyone could write a short perl script to
accomplish.
I agree that no simple solution could be really perfect. The reason I
think that a simple cp can be a candidate for the example might be
based on the assumption that anyone who is going to build a database
system ought to know their requirements, including the
durability/reliability of archives/backups and the limitations of
the adopted methods/technologies. However, as Julien mentioned, if
there's actually a problem that relatively.. ahem, ill-advised users
(sorry in advance if it's rude) use 'cp' only for the reason that
it is shown in the example, without a thought, and inadvertently lose
archives, it might be better that we don't suggest a concrete command
for archive_command.
- not recommend any tools?
This is the approach that has been tried and it's, objectively, failed
miserably. Our users are ending up with invalid and unusable backups,
corrupted WAL segments, inability to use PITR, and various other issues
because we've been trying to pretend that this isn't a hard problem. We
really need to stop that and accept that it's hard and promote the tools
which have been explicitly written to address that hard problem.
I can sympathize with that, but is there any difference from system backups?
One can just copy $HOME to another directory on the same drive and then
call it a day. Another uses dd to make an image backup. Others need
durability or a guarantee of integrity or even encryption, so they acquire or
purchase a tool that conforms to their requirements. Or someone creates
their own backup solution that meets their requirements.
On the other hand, which OS distributors offer a long list of
requirements or a recipe for perfect backups? (Yeah, I'm saying this
based on nothing, just from prejudice.)
If the system is serious, those who don't know enough about backups ought to
consult professionals instead of building an inadequate backup system and
losing their data.
Hmm. A simple command that could be used as a reference is, for example,
"dd", which flushes the file by itself, or we could just revisit the
discussions about having a pg_copy command, or we could document a
small utility in perl that does the job.

I think we should do that if pg_copy conforms to the mandatory
requirements, but maybe that's for the future. Showing a minimal
implementation in perl looks good.

Already tried doing it in perl. No, it's not simple and it's also
entirely vaporware today and implies that we're going to develop this
tool, improve it in the future as we realize it needs to be improved,
and maintain it as part of core forever. If we want to actually adopt
and pull in a backup tool to be part of core then we should talk about
things which actually exist, such as the various existing projects that
have been written to specifically work to address all the requirements
which are understood today, not say "well, we can just write a simple
perl script to do it" because it's not actually that simple.

Providing yet another half solution would be doubling-down on the failed
approach to document a "simple" solution and would be a disservice to
our users.
Ok, if we follow the direction that we are responsible for ensuring
that every user has reliable backups, I can't come up with a proper
description for that.
We could list several "requirements" like "do sync after copy", "take a
checksum of all files then check it periodically" or other things, but
the more important thing to list here, I think, is "how we run
the archive_command".
Doesn't the following work for now?

(No example)
- "%f is replaced by ..., %p is .., %r is ... in archive_command"
- We call the archive_command for every WAL segment which is finished.
- We may call the archive_command for the same file more than once.
- We may call the archive_command for different files with the same
  name. In this case the server is working incorrectly and needs a
  check. Don't overwrite with the new content.
- We don't offer any durability or integrity for the archived
  files. All of that is up to you. You can use some existing
  solutions for archiving. See the following links.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Wed, 16 Jun 2021 12:04:03 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
Ok, if we follow the direction that we are responsible for ensuring
that every user has reliable backups, I can't come up with a proper
description for that.

We could list several "requirements" like "do sync after copy", "take a
checksum of all files then check it periodically" or other things, but
the more important thing to list here, I think, is "how we run
the archive_command".

Doesn't the following work for now?

(No example)
- "%f is replaced by ..., %p is .., %r is ... in archive_command"
- We call the archive_command for every WAL segment which is finished.
- We may call the archive_command for the same file more than once.
- We may call the archive_command for different files with the same
  name. In this case the server is working incorrectly and needs a
  check. Don't overwrite with the new content.
- We don't offer any durability or integrity for the archived
  files. All of that is up to you. You can use some existing
  solutions for archiving. See the following links.
Of course, there should be some description of error handling
along with that.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Tue, Jun 15, 2021 at 11:00:57PM -0400, Stephen Frost wrote:
As I suggested previously- this is similar to the hooks that we provide. We
don’t extensively document them because if you’re writing an extension
which uses a hook, you’re going to be (or should be..) reading the code too.
I disagree, hooks allow developers to implement some new or additional
behavior which by definition can't be documented. And it's also relying on the
C API which by definition allows you to do anything with the server. There are
also of course some requirements but they're quite obvious (like a planner_hook
should return a valid plan and such).
On the other hand the archive_command is there to do only one clear thing:
safely back up a WAL file. And I don't think that what makes that backup "safe"
is open to discussion. Sure, you can choose to ignore some of it if you think
you can afford to do it, but it doesn't change the fact that it's still a
requirement which should be documented.
Consider that, really, an archive command should refuse to allow archiving
of WAL on a timeline which doesn’t have a corresponding history file in the
archive for that timeline (excluding timeline 1).
Yes, that's a clear requirement that should be documented.
Also, a backup tool
should compare the result of pg_start_backup to what’s in the control file,
using a fresh read, after start backup returns to make sure that the
storage is sane and not, say, caching pages independently (such as might
happen with a separate NFS mount..). Oh, and if a replica is involved, a
check should be done to see if the replica has changed timelines and an
appropriate message thrown if that happens complaining that the backup was
aborted due to the promotion of the replica…
I agree, but unless I'm missing something it's unrelated to what an
archive_command should be in charge of?
At Wed, 16 Jun 2021 11:20:55 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
On Tue, Jun 15, 2021 at 11:00:57PM -0400, Stephen Frost wrote:
As I suggested previously- this is similar to the hooks that we provide. We
don't extensively document them because if you're writing an extension
which uses a hook, you're going to be (or should be..) reading the code too.

I disagree, hooks allow developers to implement some new or additional
behavior which by definition can't be documented. And it's also relying on the
C API which by definition allows you to do anything with the server. There are
also of course some requirements but they're quite obvious (like a planner_hook
should return a valid plan and such).

On the other hand the archive_command is there to do only one clear thing:
safely back up a WAL file. And I don't think that what makes that backup "safe"
is open to discussion. Sure, you can choose to ignore some of it if you think
you can afford to do it, but it doesn't change the fact that it's still a
requirement which should be documented.
I agree with Julien; however, I also want to discuss what to do for
14 now. If we decide not to touch the documentation for that version,
that discussion would end. What do you think about that? I think it's
impossible to write the full documentation of the requirements *for 14*.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wed, Jun 16, 2021 at 01:10:16PM +0900, Kyotaro Horiguchi wrote:
I agree with Julien; however, I also want to discuss what to do for
14 now. If we decide not to touch the documentation for that version,
that discussion would end. What do you think about that? I think it's
impossible to write the full documentation of the requirements *for 14*.
My personal take on that is that this is a bug in the documentation and the
list of requirements should be backported. Hopefully this can be done before
v14 is released, but if not I don't think that it should be a blocker to make
progress.
Greetings,
On Tue, Jun 15, 2021 at 23:21 Julien Rouhaud <rjuju123@gmail.com> wrote:
On Tue, Jun 15, 2021 at 11:00:57PM -0400, Stephen Frost wrote:
As I suggested previously- this is similar to the hooks that we provide.
We don’t extensively document them because if you’re writing an extension
which uses a hook, you’re going to be (or should be..) reading the code too.

I disagree; hooks allow developers to implement some new or additional
behavior which by definition can't be documented. And they also rely on
the C API, which by definition allows you to do anything with the server.
There are of course some requirements, but they're quite obvious (like a
planner_hook should return a valid plan and such).
The archive command is technically invoked using the shell, but the
interpretation of the exit code, for example, is only discussed in the C
code; and it’s far from the only consideration that someone developing an
archive command needs to understand.
On the other hand, the archive_command is there to do only one clear thing:
safely back up a WAL file. And I don't think that what makes that backup
"safe" is open to discussion. Sure, you can choose to ignore some of it if
you think you can afford to, but that doesn't change the fact that it's
still a requirement which should be documented.
The notion that an archive command can be distanced from backups is really
not reasonable in my opinion.
Consider that, really, an archive command should refuse to allow archiving
of WAL on a timeline which doesn’t have a corresponding history file in the
archive for that timeline (excluding timeline 1).
Yes, that's a clear requirement that should be documented.
Is it a clear requirement that pgbackrest and every other organization that
has developed an archive command has missed? Are you able to point to a
tool which has such a check today?
This is not a trivial problem, any more than PG’s use of fsync is trivial;
by that logic, we clearly should have understood how Linux and fsync work
decades ago, made sure to always crash on any fsync failure, and not
believed that a later fsync would return a failure if an earlier one did
and the problem didn’t resolve itself properly.
Also, a backup tool
should compare the result of pg_start_backup to what’s in the control file,
using a fresh read, after start backup returns to make sure that the
storage is sane and not, say, caching pages independently (such as might
happen with a separate NFS mount..). Oh, and if a replica is involved, a
check should be done to see if the replica has changed timelines and an
appropriate message thrown if that happens complaining that the backup was
aborted due to the promotion of the replica…
I agree, but unless I'm missing something it's unrelated to what an
archive_command should be in charge of?
I’m certainly not moved by this argument as it seems to be willfully
missing the point. Further, if we are going to claim that we must document
the archive command to such a level, then surely we need to also document
all the requirements for pg_start_backup and pg_stop_backup too, so this
strikes me as entirely relevant.
Thanks,
Stephen
On Wed, Jun 16, 2021 at 01:17:11AM -0400, Stephen Frost wrote:
The archive command is technically invoked using the shell, but the
interpretation of the exit code, for example, is only discussed in the C
code; and it’s far from the only consideration that someone developing an
archive command needs to understand.
The expectations for the return code are documented. There are some subtleties
for when the command is interrupted by a signal, which are also documented.
Why shouldn't we document the rest of the expectations of what such a command
should do?
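(For reference, a rough sketch of that documented exit-status contract,
using a hypothetical wrapper script; the numeric boundaries follow the
current documentation:)

# archive_command = 'archive_wal.sh %p %f'   (hypothetical wrapper)
#
# Exit status contract, roughly as documented:
#   0        success; the server may then recycle or remove the segment
#   1..125   failure; the server will retry the same segment later
#   >125, or termination by a signal other than the SIGTERM used for
#            server shutdown: the archiver process itself aborts
if ! cp "$1" "/mnt/archive/$2"; then
    exit 1   # report failure so the segment is retried, never recycled
fi
exit 0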
The notion that an archive command can be distanced from backups is really
not reasonable in my opinion.
As far as I know you can use archiving for replication purposes only. In
that case you will certainly use the archived files differently than for
backups, with different verifications. But the requirements on what makes
an archive_command safe won't change.
Consider that, really, an archive command should refuse to allow archiving
of WAL on a timeline which doesn’t have a corresponding history file in the
archive for that timeline (excluding timeline 1).
Yes, that's a clear requirement that should be documented.
Is it a clear requirement that pgbackrest and every other organization that
has developed an archive command has missed? Are you able to point to a
tool which has such a check today?
I don't know, as I don't have any knowledge of what barman, BART, pgbackrest,
pg_probackup or any other backup solution does in any detail. I was only saying
that what you said makes sense and should be part of the documentation,
assuming that this is indeed a requirement.
Also, a backup tool
should compare the result of pg_start_backup to what’s in the control file,
using a fresh read, after start backup returns to make sure that the
storage is sane and not, say, caching pages independently (such as might
happen with a separate NFS mount..). Oh, and if a replica is involved, a
check should be done to see if the replica has changed timelines and an
appropriate message thrown if that happens complaining that the backup was
aborted due to the promotion of the replica…

I agree, but unless I'm missing something it's unrelated to what an
archive_command should be in charge of?

I’m certainly not moved by this argument as it seems to be willfully
missing the point. Further, if we are going to claim that we must document
the archive command to such a level, then surely we need to also document
all the requirements for pg_start_backup and pg_stop_backup too, so this
strikes me as entirely relevant.
So what was the point? I'm not saying that doing backups is trivial and/or
should not be properly documented, nor that we shouldn't improve the
pg_start_backup or pg_stop_backup documentation; I'm just saying that those
don't change what makes an archive_command safe.
At Wed, 16 Jun 2021 12:36:19 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
On Wed, Jun 16, 2021 at 01:10:16PM +0900, Kyotaro Horiguchi wrote:
I agree with Julien; however, I also want to discuss what to do for
14 now. If we decide not to touch the documentation for that version,
that discussion would end. What do you think about that? I think it's
impossible to write the full documentation of the requirements *for 14*.

My personal take on that is that this is a bug in the documentation and the
list of requirements should be backported. Hopefully this can be done before
v14 is released, but if not I don't think that it should be a blocker to make
progress.
I understand; we will discuss how to fix the documentation, and we
won't change it for version 14 if no conclusion can be reached by
the deadline.
Thanks!
--
Kyotaro Horiguchi
NTT Open Source Software Center
Greetings,
* Julien Rouhaud (rjuju123@gmail.com) wrote:
On Wed, Jun 16, 2021 at 01:17:11AM -0400, Stephen Frost wrote:
Consider that, really, an archive command should refuse to allow archiving
of WAL on a timeline which doesn’t have a corresponding history file in the
archive for that timeline (excluding timeline 1).

Yes, that's a clear requirement that should be documented.

Is it a clear requirement that pgbackrest and every other organization that
has developed an archive command has missed? Are you able to point to a
tool which has such a check today?

I don't know, as I don't have any knowledge of what barman, BART, pgbackrest,
pg_probackup or any other backup solution does in any detail. I was only saying
that what you said makes sense and should be part of the documentation,
assuming that this is indeed a requirement.
This is exactly it. I don't agree that we can, or should, treat every
sensible thing that we realize about what the archive command or the
backup tool should be doing as some bug in our documentation that has to
be backpatched.
If you're serious about continuing on this path, it strikes me that the
next step would be to go review all of the above-mentioned tools,
identify all of the things that they do and the checks that they have,
and then craft a documentation patch to add all of those, for both
the archive command and pg_start/stop_backup.
I don't think it'd be as big as the rest of the PG documentation, but
I'm not sure.
Thanks,
Stephen
On Wed, Jun 16, 2021 at 9:19 PM Stephen Frost <sfrost@snowman.net> wrote:
This is exactly it. I don't agree that we can, or should, treat every
sensible thing that we realize about what the archive command or the
backup tool should be doing as some bug in our documentation that has to
be backpatched.
If you're serious about continuing on this path, it strikes me that the
next step would be to go review all of the above-mentioned tools,
identify all of the things that they do and the checks that they have,
and then craft a documentation patch to add all of those, for both
the archive command and pg_start/stop_backup.
1) I'm not saying that every single check that every single tool
currently does is a requirement for a safe command and/or should be
documented
2) I don't think that there are thousands and thousands of
requirements, as you seem to imply
3) I still don't understand why you think that having partial
knowledge of what makes an archive_command safe scattered across the
source code of many third-party tools is a good thing
But what better alternative are you suggesting? Saying that no one
knows what an archive_command should do, and letting people put a link to
their backup solution in the hope that they will eventually converge
on a safe solution that no one will be able to validate?
Greetings,
* Julien Rouhaud (rjuju123@gmail.com) wrote:
On Wed, Jun 16, 2021 at 9:19 PM Stephen Frost <sfrost@snowman.net> wrote:
This is exactly it. I don't agree that we can, or should, treat every
sensible thing that we realize about what the archive command or the
backup tool should be doing as some bug in our documentation that has to
be backpatched.
If you're serious about continuing on this path, it strikes me that the
next step would be to go review all of the above-mentioned tools,
identify all of the things that they do and the checks that they have,
and then craft a documentation patch to add all of those, for both
the archive command and pg_start/stop_backup.

1) I'm not saying that every single check that every single tool
currently does is a requirement for a safe command and/or should be
documented
That's true: you're agreeing that there are even checks beyond those that
are currently implemented which should also be done. That's exactly
what I was responding to.
2) I don't think that there are thousands and thousands of
requirements, as you seem to imply
You've not reviewed any of the tools which have been written, so I'm
not sure what you're basing your belief on. I've done reviews of the
various tools and have been rather involved in the development of one of
them. I do think there are lots of requirements, and it's not some static
list which could just be written down once and then never touched or
thought about again.
Consider pg_dump- do we document everything that a logical export tool
should do? That someone who wants to implement pg_dump should make sure
that the tool runs around and takes out a share lock on all of the
tables to be exported? No, of course we don't, because we provide a
tool to do that and if people actually want to understand how it works,
we point them to the source code. Had we started out with a backup tool
in core, the same would be true for that. Instead, we didn't, and such
tools were developed outside of core (and frankly have largely had to
play catch-up to try and figure out all the things that are needed to do
it well and likely always will be since they aren't part of core).
3) I still don't understand why you think that having partial
knowledge of what makes an archive_command safe scattered across the
source code of many third-party tools is a good thing
Having partial knowledge of what makes an archive_command safe in the
official documentation is somehow better..? What would that lead to:
other people seriously developing a backup solution for PG? No, I
seriously doubt that, as those who are seriously developing such
solutions couldn't trust only what we've got documented anyway, but
would have to go looking through the source code and would need to
develop a deep understanding of how WAL works, what happens when PG is
started up to perform PITR but with archiving disabled and how that
impacts what ends up being archived (hint: the server will switch
timelines but won't actually archive a history file because archiving is
disabled; a restart which then enables archiving will start pushing
WAL on a timeline where there's no history file; do that twice from an
older backup and now you've got the same WAL files trying to be pushed
into the repo which are actually on materially different timelines, even
though the same timeline has been chosen multiple times...), how
timelines work, and all the rest.
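To make that failure mode concrete, a hypothetical sequence (the file
names are illustrative):

1. Take a base backup on timeline 1.
2. Restore it, recover with archiving disabled, promote.
   -> the server switches to timeline 2; 00000002.history is never archived.
3. Restore the same backup again elsewhere and repeat step 2.
   -> this server also picks timeline 2, since no history file in the
      archive told it that the timeline was already taken.
4. Both clusters now generate and try to archive WAL segments named like
   000000020000000A00000001: the same name, materially different contents.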
We already have partial documentation about what should go into
developing an archive_command, and what it's led to is people ignoring
that and instead copying the example that's explicitly called out as not
sufficient. That's the actual problem that needs to be addressed here.
Let's rip out the example and instead promote tools which have been
written to specifically address this and which are actively maintained.
If someone actually comes asking about how to develop their own backup
solution for PG, we should suggest that they review the PG code related
to WAL, timelines, how promotion works, etc, and probably point them at
the OSS projects which already work to tackle this issue, because to
develop a proper tool you need to actually understand all of that.
But what better alternative are you suggesting? Saying that no one
knows what an archive_command should do, and letting people put a link to
their backup solution in the hope that they will eventually converge
on a safe solution that no one will be able to validate?
There are people who do know, today, what an archive command should do
and we should be promoting the tools that they've developed which do, in
fact, implement those checks already, at least the ones we've thought of
so far.
Instead, the suggestion being made here is to write a detailed design
document for how to develop a backup tool for PG (and, no, I don't agree
that we can "just" focus on the archive command), and then to maintain it,
update it, and backpatch every change to it that we think of.
Thanks,
Stephen