Strange path from pgarch_readyXlog()

Started by Thomas Munroover 4 years ago6 messageshackers
Jump to latest
#1Thomas Munro
thomas.munro@gmail.com

Hi,

Isn't this a corrupted pathname?

2021-12-29 03:39:55.708 CST [79851:1] WARNING: removal of orphan
archive status file
"pg_wal/archive_status/000000010000000000000003.00000028.backup000000010000000000000004.ready"
failed too many times, will try again later

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=peripatus&dt=2021-12-29%2009%3A20%3A54

#2Nathan Bossart
nathandbossart@gmail.com
In reply to: Thomas Munro (#1)
Re: Strange path from pgarch_readyXlog()

On 12/29/21, 12:22 PM, "Thomas Munro" <thomas.munro@gmail.com> wrote:

Isn't this a corrupted pathname?

2021-12-29 03:39:55.708 CST [79851:1] WARNING: removal of orphan
archive status file
"pg_wal/archive_status/000000010000000000000003.00000028.backup000000010000000000000004.ready"
failed too many times, will try again later

I bet this was a simple mistake in beb4e9b.

Nathan

diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 434939be9b..b5b0d4e12f 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -113,7 +113,7 @@ static PgArchData *PgArch = NULL;
  * is empty, at which point another directory scan must be performed.
  */
 static binaryheap *arch_heap = NULL;
-static char arch_filenames[NUM_FILES_PER_DIRECTORY_SCAN][MAX_XFN_CHARS];
+static char arch_filenames[NUM_FILES_PER_DIRECTORY_SCAN][MAX_XFN_CHARS + 1];
 static char *arch_files[NUM_FILES_PER_DIRECTORY_SCAN];
 static int arch_files_size = 0;
#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#2)
Re: Strange path from pgarch_readyXlog()

"Bossart, Nathan" <bossartn@amazon.com> writes:

I bet this was a simple mistake in beb4e9b.

-static char arch_filenames[NUM_FILES_PER_DIRECTORY_SCAN][MAX_XFN_CHARS];
+static char arch_filenames[NUM_FILES_PER_DIRECTORY_SCAN][MAX_XFN_CHARS + 1];

Hm, yeah, that looks like a pretty obvious bug.

While we're here, I wonder if we ought to get rid of the static-ness of
these arrays. I realize that they're only eating a few kB, but they're
doing so in every postgres process, when they'll only be used in the
archiver.

regards, tom lane

#4Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#3)
Re: Strange path from pgarch_readyXlog()

On 12/29/21, 1:04 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

"Bossart, Nathan" <bossartn@amazon.com> writes:

I bet this was a simple mistake in beb4e9b.

-static char arch_filenames[NUM_FILES_PER_DIRECTORY_SCAN][MAX_XFN_CHARS];
+static char arch_filenames[NUM_FILES_PER_DIRECTORY_SCAN][MAX_XFN_CHARS + 1];

Hm, yeah, that looks like a pretty obvious bug.

While we're here, I wonder if we ought to get rid of the static-ness of
these arrays. I realize that they're only eating a few kB, but they're
doing so in every postgres process, when they'll only be used in the
archiver.

This crossed my mind, too. I also think one of the arrays can be
eliminated in favor of just using the heap (after rebuilding with a
reversed comparator). Here is a minimally-tested patch that
demonstrates what I'm thinking.

Nathan

Attachments:

improve-beb4e9b.patchapplication/octet-stream; name=improve-beb4e9b.patchDownload+47-36
#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#4)
Re: Strange path from pgarch_readyXlog()

"Bossart, Nathan" <bossartn@amazon.com> writes:

On 12/29/21, 1:04 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

While we're here, I wonder if we ought to get rid of the static-ness of
these arrays. I realize that they're only eating a few kB, but they're
doing so in every postgres process, when they'll only be used in the
archiver.

This crossed my mind, too. I also think one of the arrays can be
eliminated in favor of just using the heap (after rebuilding with a
reversed comparator). Here is a minimally-tested patch that
demonstrates what I'm thinking.

I already pushed a patch that de-static-izes those arrays, so this
needs rebased at least. However, now that you mention it it does
seem like maybe the intermediate arch_files[] array could be dropped
in favor of just pulling the next file from the heap.

The need to reverse the heap's sort order seems like a problem
though. I really dislike the kluge you used here with a static flag
that inverts the comparator's sort order behind the back of the
binary-heap mechanism. It seems quite accidental that that doesn't
fall foul of asserts or optimizations in binaryheap.c. For
instance, if binaryheap_build decided it needn't do anything when
bh_has_heap_property is already true, this code would fail. In any
case, we'd need to spend O(n) time inverting the heap's sort order,
so this'd likely be slower than the current code.

On the whole I'm inclined not to bother trying to optimize this
further. The main thing that concerned me is that somebody would
bump up NUM_FILES_PER_DIRECTORY_SCAN to the point where the static
space consumption becomes really problematic, and we've fixed that.

regards, tom lane

#6Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#5)
Re: Strange path from pgarch_readyXlog()

On 12/29/21, 3:11 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

"Bossart, Nathan" <bossartn@amazon.com> writes:

This crossed my mind, too. I also think one of the arrays can be
eliminated in favor of just using the heap (after rebuilding with a
reversed comparator). Here is a minimally-tested patch that
demonstrates what I'm thinking.

I already pushed a patch that de-static-izes those arrays, so this
needs rebased at least. However, now that you mention it it does
seem like maybe the intermediate arch_files[] array could be dropped
in favor of just pulling the next file from the heap.

The need to reverse the heap's sort order seems like a problem
though. I really dislike the kluge you used here with a static flag
that inverts the comparator's sort order behind the back of the
binary-heap mechanism. It seems quite accidental that that doesn't
fall foul of asserts or optimizations in binaryheap.c. For
instance, if binaryheap_build decided it needn't do anything when
bh_has_heap_property is already true, this code would fail. In any
case, we'd need to spend O(n) time inverting the heap's sort order,
so this'd likely be slower than the current code.

On the whole I'm inclined not to bother trying to optimize this
further. The main thing that concerned me is that somebody would
bump up NUM_FILES_PER_DIRECTORY_SCAN to the point where the static
space consumption becomes really problematic, and we've fixed that.

Your assessment seems reasonable to me. If there was a better way to
adjust the comparator for the heap, maybe there would be a stronger
case for this approach. I certainly don't think it's worth inventing
something for just this use-case.

Thanks for fixing this!

Nathan