Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
Hi,
Currently the server is erroring out when unable to remove/parse a
logical rewrite file in CheckPointLogicalRewriteHeap wasting the
amount of work the checkpoint has done and preventing the checkpoint
from finishing. This is unlike CheckPointSnapBuild does for snapshot
files i.e. it just emits a message at LOG level and continues if it is
unable to parse or remove the file. Attaching a small patch applying
the same idea to the mapping files.
Thoughts?
Regards,
Bharath Rupireddy.
Attachments:
v1-0001-Avoid-erroring-out-when-unable-to-remove-or-parse.patchapplication/octet-stream; name=v1-0001-Avoid-erroring-out-when-unable-to-remove-or-parse.patchDownload
From 0d088ed4c637c6417b665597aece10c4eafd883e Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 31 Dec 2021 12:20:05 +0000
Subject: [PATCH v1] Avoid erroring out when unable to remove or parse logical
rewrite files to save checkpoint work
Currently the server is erroring out when unable to remove/parse a
logical rewrite file in CheckPointLogicalRewriteHeap wasting the
amount of work the checkpoint has done and preventing the checkpoint
from finishing. This is unlike CheckPointSnapBuild does for snapshot
files i.e. it just emits a message at LOG level and continues if it
is unable to parse or remove the file. This patch applies the same
idea to the mapping files.
---
src/backend/access/heap/rewriteheap.c | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 986a776bbd..a53e89210a 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -1234,19 +1234,36 @@ CheckPointLogicalRewriteHeap(void)
if (strncmp(mapping_de->d_name, "map-", 4) != 0)
continue;
+ /*
+ * We just log a message if a file doesn't fit the pattern, it's
+ * probably some editors lock/state file or similar...
+ */
if (sscanf(mapping_de->d_name, LOGICAL_REWRITE_FORMAT,
&dboid, &relid, &hi, &lo, &rewrite_xid, &create_xid) != 6)
- elog(ERROR, "could not parse filename \"%s\"", mapping_de->d_name);
+ {
+ ereport(LOG,
+ (errmsg("could not parse file name \"%s\"", mapping_de->d_name)));
+ continue;
+ }
lsn = ((uint64) hi) << 32 | lo;
if (lsn < cutoff || cutoff == InvalidXLogRecPtr)
{
elog(DEBUG1, "removing logical rewrite file \"%s\"", path);
+
+ /*
+ * It's not particularly harmful, though strange, if we can't
+ * remove the file here. Don't prevent the checkpoint from
+ * completing, that'd be a cure worse than the disease.
+ */
if (unlink(path) < 0)
- ereport(ERROR,
+ {
+ ereport(LOG,
(errcode_for_file_access(),
errmsg("could not remove file \"%s\": %m", path)));
+ continue;
+ }
}
else
{
--
2.25.1
On 12/31/21, 4:44 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
Currently the server is erroring out when unable to remove/parse a
logical rewrite file in CheckPointLogicalRewriteHeap wasting the
amount of work the checkpoint has done and preventing the checkpoint
from finishing. This is unlike CheckPointSnapBuild does for snapshot
files i.e. it just emits a message at LOG level and continues if it is
unable to parse or remove the file. Attaching a small patch applying
the same idea to the mapping files.
This seems reasonable to me. AFAICT moving on to other files after an
error shouldn't cause any problems. In fact, it's probably beneficial
to try to clean up as much as possible so that the files do not
continue to build up.
The only feedback I have for the patch is that I don't think the new
comments are necessary.
Nathan
On Thu, Jan 13, 2022 at 3:47 AM Bossart, Nathan <bossartn@amazon.com> wrote:
On 12/31/21, 4:44 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
Currently the server is erroring out when unable to remove/parse a
logical rewrite file in CheckPointLogicalRewriteHeap wasting the
amount of work the checkpoint has done and preventing the checkpoint
from finishing. This is unlike CheckPointSnapBuild does for snapshot
files i.e. it just emits a message at LOG level and continues if it is
unable to parse or remove the file. Attaching a small patch applying
the same idea to the mapping files.This seems reasonable to me. AFAICT moving on to other files after an
error shouldn't cause any problems. In fact, it's probably beneficial
to try to clean up as much as possible so that the files do not
continue to build up.
Thanks for the review Nathan!
The only feedback I have for the patch is that I don't think the new
comments are necessary.
I borrowed the comments as-is from the CheckPointSnapBuild introduced
by the commit b89e15105. IMO, let the comments be there as they
explain why we are not emitting ERRORs, however I will leave it to the
committer to decide on that.
Regards,
Bharath Rupireddy.
On 1/12/22, 10:03 PM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
On Thu, Jan 13, 2022 at 3:47 AM Bossart, Nathan <bossartn@amazon.com> wrote:
The only feedback I have for the patch is that I don't think the new
comments are necessary.I borrowed the comments as-is from the CheckPointSnapBuild introduced
by the commit b89e15105. IMO, let the comments be there as they
explain why we are not emitting ERRORs, however I will leave it to the
committer to decide on that.
Huh, somehow I missed that when I looked at it yesterday. I'm going
to bump this one to ready-for-committer, then.
Nathan
Hi,
On 2021-12-31 18:12:37 +0530, Bharath Rupireddy wrote:
Currently the server is erroring out when unable to remove/parse a
logical rewrite file in CheckPointLogicalRewriteHeap wasting the
amount of work the checkpoint has done and preventing the checkpoint
from finishing.
This seems like it'd make failures to remove the files practically
invisible. Which'd have it's own set of problems?
What motivated proposing this change?
Greetings,
Andres Freund
On Fri, Jan 14, 2022 at 1:08 AM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2021-12-31 18:12:37 +0530, Bharath Rupireddy wrote:
Currently the server is erroring out when unable to remove/parse a
logical rewrite file in CheckPointLogicalRewriteHeap wasting the
amount of work the checkpoint has done and preventing the checkpoint
from finishing.This seems like it'd make failures to remove the files practically
invisible. Which'd have it's own set of problems?What motivated proposing this change?
We had an issue where there were many mapping files generated during
the crash recovery and end-of-recovery checkpoint was taking a lot of
time. We had to manually intervene and delete some of the mapping
files (although it may not sound sensible) to make end-of-recovery
checkpoint faster. Because of the race condition between manual
deletion and checkpoint deletion, the unlink error occurred which
crashed the server and the server entered the recovery again wasting
the entire earlier recovery work.
In summary, with the changes (emitting LOG-only messages for unlink
failures and continuing with the other files) proposed for
CheckPointLogicalRewriteHeap in this thread and the existing code in
CheckPointSnapBuild, I'm sure it will help not waste the recovery
that's has been done in case unlink fails for any reasons.
Regards,
Bharath Rupireddy.
Hi,
On Sat, Jan 15, 2022 at 02:04:12PM +0530, Bharath Rupireddy wrote:
We had an issue where there were many mapping files generated during
the crash recovery and end-of-recovery checkpoint was taking a lot of
time. We had to manually intervene and delete some of the mapping
files (although it may not sound sensible) to make end-of-recovery
checkpoint faster. Because of the race condition between manual
deletion and checkpoint deletion, the unlink error occurred which
crashed the server and the server entered the recovery again wasting
the entire earlier recovery work.
Maybe I'm missing something but wouldn't
https://commitfest.postgresql.org/36/3448/ better solve the problem?
On Sat, Jan 15, 2022 at 2:59 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
Hi,
On Sat, Jan 15, 2022 at 02:04:12PM +0530, Bharath Rupireddy wrote:
We had an issue where there were many mapping files generated during
the crash recovery and end-of-recovery checkpoint was taking a lot of
time. We had to manually intervene and delete some of the mapping
files (although it may not sound sensible) to make end-of-recovery
checkpoint faster. Because of the race condition between manual
deletion and checkpoint deletion, the unlink error occurred which
crashed the server and the server entered the recovery again wasting
the entire earlier recovery work.Maybe I'm missing something but wouldn't
https://commitfest.postgresql.org/36/3448/ better solve the problem?
The error can cause the new background process proposed there in that
thread to restart, which is again costly. Since we have LOG-only and
continue behavior in CheckPointSnapBuild already, having the same
behavior for CheckPointLogicalRewriteHeap helps a lot.
Regards,
Bharath Rupireddy.
Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
On Sat, Jan 15, 2022 at 2:59 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
Maybe I'm missing something but wouldn't
https://commitfest.postgresql.org/36/3448/ better solve the problem?
The error can cause the new background process proposed there in that
thread to restart, which is again costly. Since we have LOG-only and
continue behavior in CheckPointSnapBuild already, having the same
behavior for CheckPointLogicalRewriteHeap helps a lot.
[ stares at CheckPointLogicalRewriteHeap for awhile ... ]
This code has got more problems than that. It took me awhile to
absorb it, but we don't actually care about the contents of any of
those files; all of the information is encoded in the file *names*.
(This strikes me as probably not a very efficient design, compared
to putting the same data into a single text file; but for now I'll
assume we're not up for a complete rewrite.) That being the case,
I wonder what it is we expect fsync'ing the surviving files to do
exactly. We should be fsync'ing the directory not the files
themselves, no?
Other things that seem poorly thought out:
* Why is the check for "map-" prefix after, rather than before,
the lstat?
* Why is it okay to ignore lstat failure? Seems like we might
as well not even have the lstat.
* The sscanf on the file name would not notice trailing junk,
such as an editor backup marker. Is that okay?
As far as the patch itself goes, I agree that failure to unlink
is noncritical, because such a file would have no further effect
and we can just ignore it. I think I also agree that failure
of the sscanf is noncritical, because the implication of that
is that the file name doesn't conform to our expectations, which
means it's basically just like the check that causes us to
ignore names not starting with "map-". (Actually, isn't the
separate check for "map-" useless, given that sscanf will make
the equivalent check?)
I started out wondering why the patch didn't also change the loop's
other ERROR conditions to LOG. But we do want to ERROR if we're
unable to sync transient state down to disk, and that is what
the other steps (think they) are doing. It might be worth a
comment to point that out though, before someone decides that
if these errors are just LOG level then the others can be too.
Anyway, I think possibly we can drop the bottom half of the loop
(the part trying to fsync non-removed files) in favor of fsync'ing
the directory afterwards. Thoughts?
regards, tom lane
I wrote:
Anyway, I think possibly we can drop the bottom half of the loop
(the part trying to fsync non-removed files) in favor of fsync'ing
the directory afterwards. Thoughts?
Oh, scratch that --- *this* loop doesn't care about the file
contents, but other code does. However, don't we need a directory
fsync too? Or is that handled someplace else?
regards, tom lane
Hi,
On 2022-01-19 13:34:21 -0500, Tom Lane wrote:
Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
On Sat, Jan 15, 2022 at 2:59 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
Maybe I'm missing something but wouldn't
https://commitfest.postgresql.org/36/3448/ better solve the problem?The error can cause the new background process proposed there in that
thread to restart, which is again costly. Since we have LOG-only and
continue behavior in CheckPointSnapBuild already, having the same
behavior for CheckPointLogicalRewriteHeap helps a lot.[ stares at CheckPointLogicalRewriteHeap for awhile ... ]
This code has got more problems than that. It took me awhile to
absorb it, but we don't actually care about the contents of any of
those files; all of the information is encoded in the file *names*.
I'm not following - we *do* need the contents of the files? They're applied
as-needed in ApplyLogicalMappingFile().
(This strikes me as probably not a very efficient design, compared
to putting the same data into a single text file; but for now I'll
assume we're not up for a complete rewrite.) That being the case,
I wonder what it is we expect fsync'ing the surviving files to do
exactly. We should be fsync'ing the directory not the files
themselves, no?
Fsyncing the directory doesn't guarantee anything about the contents of
files. But, you're right, we need an fsync of the directory too.
Other things that seem poorly thought out:
* Why is the check for "map-" prefix after, rather than before,
the lstat?
It doesn't seem to matter much - there shouldn't be a meaningful amount of
other files in there.
* Why is it okay to ignore lstat failure? Seems like we might
as well not even have the lstat.
Yea, that seems odd, not sure why that ended up this way. I guess the aim
might have been to tolerate random files we don't have permissions for or
such?
* The sscanf on the file name would not notice trailing junk,
such as an editor backup marker. Is that okay?
I don't really see a problem with it - there shouldn't be other files matching
the pattern - but it couldn't hurt to check the pattern matches exhaustively.
As far as the patch itself goes, I agree that failure to unlink
is noncritical, because such a file would have no further effect
and we can just ignore it.
I don't agree. We iterate through the directory regularly on systems with
catalog changes + logical decoding. An ever increasing list of gunk will make
that more and more expensive. And I haven't heard a meaningful reason why we
would have map-* files that we can't remove.
Ignoring failures like this just makes problems much harder to debug and they
tend to bite harder for it.
I think I also agree that failure of the sscanf is noncritical, because the
implication of that is that the file name doesn't conform to our
expectations, which means it's basically just like the check that causes us
to ignore names not starting with "map-". (Actually, isn't the separate
check for "map-" useless, given that sscanf will make the equivalent check?)
Well, this way only files starting with "map-" are expected to conform to a
strict format, the rest is ignored?
Anyway, I think possibly we can drop the bottom half of the loop
(the part trying to fsync non-removed files) in favor of fsync'ing
the directory afterwards. Thoughts?
I don't think that'd be correct.
In short: We should add a directory fsync, I'm fine with improving the error
checking, but the rest seems like a net-negative with no convincing reasoning.
Greetings,
Andres Freund
On 1/19/22, 11:08 AM, "Andres Freund" <andres@anarazel.de> wrote:
On 2022-01-19 13:34:21 -0500, Tom Lane wrote:
As far as the patch itself goes, I agree that failure to unlink
is noncritical, because such a file would have no further effect
and we can just ignore it.I don't agree. We iterate through the directory regularly on systems with
catalog changes + logical decoding. An ever increasing list of gunk will make
that more and more expensive. And I haven't heard a meaningful reason why we
would have map-* files that we can't remove.
I think the other side of this is that we don't want checkpointing to
continually fail because of a noncritical failure. That could also
lead to problems down the road.
Ignoring failures like this just makes problems much harder to debug and they
tend to bite harder for it.
If such noncritical failures happened regularly, the server logs will
likely become filled with messages about it. Perhaps users may not
notice for a while, but I don't think the proposed patch would make
debugging excessively difficult.
Nathan
"Bossart, Nathan" <bossartn@amazon.com> writes:
I think the other side of this is that we don't want checkpointing to
continually fail because of a noncritical failure. That could also
lead to problems down the road.
Yeah, a persistent failure to complete checkpoints is very nasty.
Your disk will soon fill with unrecyclable WAL. I don't see how
that's better than a somewhat hypothetical performance issue.
regards, tom lane
I took the liberty of adjusting Bharath's patch based on the latest
feedback.
On 1/19/22, 10:35 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
We should be fsync'ing the directory not the files
themselves, no?
I added a directory sync at the end of CheckPointLogicalRewriteHeap(),
which IIUC is enough.
* Why is the check for "map-" prefix after, rather than before,
the lstat?
I swapped these checks. I stopped short of moving the sscanf() before
the lstat(), though, as 1) I don't think it will help very much and 2)
it seemed weird to start emitting "could not parse filename" logs for
non-regular files we presently skip silently.
* Why is it okay to ignore lstat failure? Seems like we might
as well not even have the lstat.
I added error checking for lstat().
* The sscanf on the file name would not notice trailing junk,
such as an editor backup marker. Is that okay?
I think this is okay. The absolute worst that would happen would be
that the extra file would be deleted. This might eventually become a
problem if files with the same prefix format were created by the
server. However, CheckPointSnapBuild() already has this problem with
temporary files, and it claims not to need any extra handling:
* temporary filenames from SnapBuildSerialize() include the LSN and
* everything but are postfixed by .$pid.tmp. We can just remove them
* the same as other files because there can be none that are
* currently being written that are older than cutoff.
(Actually, isn't the
separate check for "map-" useless, given that sscanf will make
the equivalent check?)
The only benefit I see from the extra "map-" check is that it'll avoid
"could not parse filename" logs for files that clearly aren't related
to the task at hand. I don't know if this is expected during normal
operation at the moment. I've left the "map-" check for now.
I started out wondering why the patch didn't also change the loop's
other ERROR conditions to LOG. But we do want to ERROR if we're
unable to sync transient state down to disk, and that is what
the other steps (think they) are doing. It might be worth a
comment to point that out though, before someone decides that
if these errors are just LOG level then the others can be too.
I added such a comment.
I also updated CheckPointSnapBuild() and UpdateLogicalMappings() with
similar adjustments.
Nathan
Attachments:
v2-0001-fix-up-code-in-directory-scans-for-logical-replic.patchapplication/octet-stream; name=v2-0001-fix-up-code-in-directory-scans-for-logical-replic.patchDownload
From 4148b0756058ad42367b9758284af1cfefbd023e Mon Sep 17 00:00:00 2001
From: Nathan Bossart <bossartn@amazon.com>
Date: Thu, 20 Jan 2022 19:03:03 +0000
Subject: [PATCH v2 1/1] fix up code in directory scans for logical replication
---
src/backend/access/heap/rewriteheap.c | 44 ++++++++++++++++++++-----
src/backend/replication/logical/reorderbuffer.c | 4 ---
src/backend/replication/logical/snapbuild.c | 9 +++--
3 files changed, 41 insertions(+), 16 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index aa265edf60..f1df361e06 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -1222,31 +1222,49 @@ CheckPointLogicalRewriteHeap(void)
uint32 hi,
lo;
- if (strcmp(mapping_de->d_name, ".") == 0 ||
- strcmp(mapping_de->d_name, "..") == 0)
+ /* Skip over files that cannot be ours. */
+ if (strncmp(mapping_de->d_name, "map-", 4) != 0)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
- continue;
-
- /* Skip over files that cannot be ours. */
- if (strncmp(mapping_de->d_name, "map-", 4) != 0)
+ if (lstat(path, &statbuf) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not stat file \"%s\": %m", path)));
+ else if (!S_ISREG(statbuf.st_mode))
continue;
+ /*
+ * We just log a message if a file doesn't fit the pattern, it's
+ * probably some editor's lock/state file or similar...
+ */
if (sscanf(mapping_de->d_name, LOGICAL_REWRITE_FORMAT,
&dboid, &relid, &hi, &lo, &rewrite_xid, &create_xid) != 6)
- elog(ERROR, "could not parse filename \"%s\"", mapping_de->d_name);
+ {
+ ereport(LOG,
+ (errmsg("could not parse file name \"%s\"",
+ mapping_de->d_name)));
+ continue;
+ }
lsn = ((uint64) hi) << 32 | lo;
if (lsn < cutoff || cutoff == InvalidXLogRecPtr)
{
elog(DEBUG1, "removing logical rewrite file \"%s\"", path);
+
+ /*
+ * It's not particularly harmful, though strange, if we can't
+ * remove the file here. Don't prevent the checkpoint from
+ * completing, that'd be a cure worse than the disease.
+ */
if (unlink(path) < 0)
- ereport(ERROR,
+ {
+ ereport(LOG,
(errcode_for_file_access(),
errmsg("could not remove file \"%s\": %m", path)));
+ continue;
+ }
}
else
{
@@ -1267,6 +1285,11 @@ CheckPointLogicalRewriteHeap(void)
* We could try to avoid fsyncing files that either haven't
* changed or have only been created since the checkpoint's start,
* but it's currently not deemed worth the effort.
+ *
+ * Unlike failures to unlink() old logical rewrite files, we must
+ * ERROR if we're unable to sync transient state down to disk. This
+ * allows replay to assume that everything written out before
+ * checkpoint start is persisted.
*/
pgstat_report_wait_start(WAIT_EVENT_LOGICAL_REWRITE_CHECKPOINT_SYNC);
if (pg_fsync(fd) != 0)
@@ -1282,4 +1305,7 @@ CheckPointLogicalRewriteHeap(void)
}
}
FreeDir(mappings_dir);
+
+ /* persist directory entries to disk */
+ fsync_fname("pg_logical/mappings", true);
}
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index d317a626e1..6fb4b35d89 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -5028,10 +5028,6 @@ UpdateLogicalMappings(HTAB *tuplecid_data, Oid relid, Snapshot snapshot)
f_lo;
RewriteMappingFile *f;
- if (strcmp(mapping_de->d_name, ".") == 0 ||
- strcmp(mapping_de->d_name, "..") == 0)
- continue;
-
/* Ignore files that aren't ours */
if (strncmp(mapping_de->d_name, "map-", 4) != 0)
continue;
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index ab04e946bc..f9e26e96c7 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1954,8 +1954,11 @@ CheckPointSnapBuild(void)
continue;
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
-
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (lstat(path, &statbuf) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not stat file \"%s\": %m", path)));
+ else if (!S_ISREG(statbuf.st_mode))
{
elog(DEBUG1, "only regular files expected: %s", path);
continue;
@@ -1968,7 +1971,7 @@ CheckPointSnapBuild(void)
* currently being written that are older than cutoff.
*
* We just log a message if a file doesn't fit the pattern, it's
- * probably some editors lock/state file or similar...
+ * probably some editor's lock/state file or similar...
*/
if (sscanf(snap_de->d_name, "%X-%X.snap", &hi, &lo) != 2)
{
--
2.16.6
Hi,
On 2022-01-20 19:15:08 +0000, Bossart, Nathan wrote:
* Why is it okay to ignore lstat failure? Seems like we might
as well not even have the lstat.I added error checking for lstat().
It seems odd to change a bunch of things to not be errors anymore, but then
add new sources of errors. If we try to deal with concurrent deletions or
permission issues - otherwise what's the point of making unlink() not an error
anymore - why do we expect to be able to lstat()?
I also updated CheckPointSnapBuild() and UpdateLogicalMappings() with
similar adjustments.
FWIW, I still think the ERROR->LOG changes are bad idea. The whole thing of
"oh, let's just ignore stuff that we don't expect and soldier on" has bitten
us over and over again. It makes us less robust, not more robust.
It's also just about impossible to monitor for problems that emit LOG.
I'd be more on board accepting some selective errors. E.g. not erroring on
ENOENT, but continuing to error on others (most likely ENOACCESS). I think we
should *not* change the
+ /* + * We just log a message if a file doesn't fit the pattern, it's + * probably some editor's lock/state file or similar... + */
An editor's lock file that starts with map- would presumably be the whole
filename plus an additional file-ending. But this check won't catch those.
+ * Unlike failures to unlink() old logical rewrite files, we must + * ERROR if we're unable to sync transient state down to disk. This + * allows replay to assume that everything written out before + * checkpoint start is persisted. */
Hm, not quite happy with the second bit. Checkpointed state being durable
isn't about replay directly, it's about the basic property of a checkpoint
being fulfilled?
pgstat_report_wait_start(WAIT_EVENT_LOGICAL_REWRITE_CHECKPOINT_SYNC); if (pg_fsync(fd) != 0) @@ -1282,4 +1305,7 @@ CheckPointLogicalRewriteHeap(void) } } FreeDir(mappings_dir); + + /* persist directory entries to disk */ + fsync_fname("pg_logical/mappings", true); }
This is probably worth backpatching, if you split it out I'll do so.
Greetings,
Andres Freund
Thanks for your feedback.
On 1/20/22, 11:47 AM, "Andres Freund" <andres@anarazel.de> wrote:
It seems odd to change a bunch of things to not be errors anymore, but then
add new sources of errors. If we try to deal with concurrent deletions or
permission issues - otherwise what's the point of making unlink() not an error
anymore - why do we expect to be able to lstat()?
My reasoning for making lstat() an ERROR was that there's a chance we
need to fsync() the file, and if we can't fsync() a file for whatever
reason, we definitely want to ERROR. I suppose we could conditionally
ERROR based on the file name, but I don't know if that's really worth
the complexity.
I'd be more on board accepting some selective errors. E.g. not erroring on
ENOENT, but continuing to error on others (most likely ENOACCESS). I think we
should *not* change the
I think this approach would work for the use-case Bharath mentioned
upthread. In any case, if deleting a file fails because the file was
already deleted, there's no point in ERROR-ing. I think filtering
errors is a bit trickier for lstat(). If we would've fsync'd the file
but lstat() gives us ENOENT, we may have a problem. (However, there's
also a good chance we wouldn't notice such problems if the race didn't
occur.) I'll play around with it.
+ /* + * We just log a message if a file doesn't fit the pattern, it's + * probably some editor's lock/state file or similar... + */An editor's lock file that starts with map- would presumably be the whole
filename plus an additional file-ending. But this check won't catch those.
Right, it will either fsync() or unlink() those. I'll work on the
comment. Or do you think it's worth validating that there are no
trailing characters? I looked into that a bit earlier, and the code
felt excessive to me, but I don't have a strong opinion here.
+ * Unlike failures to unlink() old logical rewrite files, we must + * ERROR if we're unable to sync transient state down to disk. This + * allows replay to assume that everything written out before + * checkpoint start is persisted. */Hm, not quite happy with the second bit. Checkpointed state being durable
isn't about replay directly, it's about the basic property of a checkpoint
being fulfilled?
I'll work on this. FWIW I modeled this based on the comment above the
function. Do you think that should be adjusted as well?
+ /* persist directory entries to disk */ + fsync_fname("pg_logical/mappings", true);This is probably worth backpatching, if you split it out I'll do so.
Sure thing, will do shortly.
Nathan
On 1/20/22, 12:24 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
On 1/20/22, 11:47 AM, "Andres Freund" <andres@anarazel.de> wrote:
+ /* persist directory entries to disk */ + fsync_fname("pg_logical/mappings", true);This is probably worth backpatching, if you split it out I'll do so.
Sure thing, will do shortly.
Here's this part.
Nathan
Attachments:
v3-0001-persist-directory-entries-for-logical-rewrite-fil.patchapplication/octet-stream; name=v3-0001-persist-directory-entries-for-logical-rewrite-fil.patchDownload
From dc9dcd3e8e8b648ef410c1b01d63a6c7bd41f94b Mon Sep 17 00:00:00 2001
From: Nathan Bossart <bossartn@amazon.com>
Date: Thu, 20 Jan 2022 20:32:39 +0000
Subject: [PATCH v3 1/1] persist directory entries for logical rewrite files to
disk during checkpoints
---
src/backend/access/heap/rewriteheap.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index aa265edf60..2a53826736 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -1282,4 +1282,7 @@ CheckPointLogicalRewriteHeap(void)
}
}
FreeDir(mappings_dir);
+
+ /* persist directory entries to disk */
+ fsync_fname("pg_logical/mappings", true);
}
--
2.16.6
On 2022-01-20 20:41:16 +0000, Bossart, Nathan wrote:
Here's this part.
And pushed to all branches. Thanks.
On Fri, Jan 21, 2022 at 11:49:56AM -0800, Andres Freund wrote:
On 2022-01-20 20:41:16 +0000, Bossart, Nathan wrote:
Here's this part.
And pushed to all branches. Thanks.
Thanks!
I spent some time thinking about the right way to proceed here, and I came
up with the attached patches. The first patch just adds error checking for
various lstat() calls in the replication code. If lstat() fails, then it
probably doesn't make sense to try to continue processing the file.
The second patch changes some nearby calls to ereport() to ERROR. If these
failures are truly unexpected, and we don't intend to support use-cases
like concurrent manual deletion, then failing might be the right way to go.
I think it's a shame that such failures could cause checkpointing to
continually fail, but that topic is already being discussed elsewhere [0]/messages/by-id/C1EE64B0-D4DB-40F3-98C8-0CED324D34CB@amazon.com.
[0]: /messages/by-id/C1EE64B0-D4DB-40F3-98C8-0CED324D34CB@amazon.com
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com/
Attachments:
v4-0001-add-error-checking-for-calls-to-lstat-in-replicat.patchtext/x-diff; charset=us-asciiDownload
From df7a07d27c2858ec3ac383335af3e55e5428e3b1 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Wed, 26 Jan 2022 14:03:11 -0800
Subject: [PATCH v4 1/2] add error checking for calls to lstat() in replication
code
---
src/backend/access/heap/rewriteheap.c | 6 +++++-
src/backend/replication/logical/reorderbuffer.c | 6 +++++-
src/backend/replication/logical/snapbuild.c | 6 +++++-
src/backend/replication/slot.c | 6 +++++-
4 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826736..81319e0c78 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -1227,7 +1227,11 @@ CheckPointLogicalRewriteHeap(void)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (lstat(path, &statbuf) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not stat file \"%s\": %m", path)));
+ else if (!S_ISREG(statbuf.st_mode))
continue;
/* Skip over files that cannot be ours. */
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 19b2ba2000..d8d784a42f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4407,7 +4407,11 @@ ReorderBufferCleanupSerializedTXNs(const char *slotname)
sprintf(path, "pg_replslot/%s", slotname);
/* we're only handling directories here, skip if it's not ours */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ if (lstat(path, &statbuf) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not stat file \"%s\": %m", path)));
+ else if (!S_ISDIR(statbuf.st_mode))
return;
spill_dir = AllocateDir(path);
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 83fca8a77d..57f5a5e81f 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1955,7 +1955,11 @@ CheckPointSnapBuild(void)
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (lstat(path, &statbuf) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not stat file \"%s\": %m", path)));
+ else if (!S_ISREG(statbuf.st_mode))
{
elog(DEBUG1, "only regular files expected: %s", path);
continue;
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index e5e0cf8768..21fb7536e2 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1413,7 +1413,11 @@ StartupReplicationSlots(void)
snprintf(path, sizeof(path), "pg_replslot/%s", replication_de->d_name);
/* we're only creating directories here, skip if it's not our's */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ if (lstat(path, &statbuf) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not stat file \"%s\": %m", path)));
+ else if (!S_ISDIR(statbuf.st_mode))
continue;
/* we crashed while a slot was being setup or deleted, clean up */
--
2.25.1
v4-0002-minor-improvements-to-replication-code.patchtext/x-diff; charset=us-asciiDownload
From 28a06a2cb84102a92da3fca0f0055dd67902d7de Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Wed, 26 Jan 2022 16:24:04 -0800
Subject: [PATCH v4 2/2] minor improvements to replication code
---
src/backend/replication/logical/snapbuild.c | 19 ++-----------------
src/backend/replication/slot.c | 5 +----
2 files changed, 3 insertions(+), 21 deletions(-)
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 57f5a5e81f..ce6cb85c1a 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1970,16 +1970,10 @@ CheckPointSnapBuild(void)
* everything but are postfixed by .$pid.tmp. We can just remove them
* the same as other files because there can be none that are
* currently being written that are older than cutoff.
- *
- * We just log a message if a file doesn't fit the pattern, it's
- * probably some editors lock/state file or similar...
*/
if (sscanf(snap_de->d_name, "%X-%X.snap", &hi, &lo) != 2)
- {
- ereport(LOG,
+ ereport(ERROR,
(errmsg("could not parse file name \"%s\"", path)));
- continue;
- }
lsn = ((uint64) hi) << 32 | lo;
@@ -1987,20 +1981,11 @@ CheckPointSnapBuild(void)
if (lsn < cutoff || cutoff == InvalidXLogRecPtr)
{
elog(DEBUG1, "removing snapbuild snapshot %s", path);
-
- /*
- * It's not particularly harmful, though strange, if we can't
- * remove the file here. Don't prevent the checkpoint from
- * completing, that'd be a cure worse than the disease.
- */
if (unlink(path) < 0)
- {
- ereport(LOG,
+ ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not remove file \"%s\": %m",
path)));
- continue;
- }
}
}
FreeDir(snap_dir);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 21fb7536e2..37ec3fa633 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1424,12 +1424,9 @@ StartupReplicationSlots(void)
if (pg_str_endswith(replication_de->d_name, ".tmp"))
{
if (!rmtree(path, true))
- {
- ereport(WARNING,
+ ereport(ERROR,
(errmsg("could not remove directory \"%s\"",
path)));
- continue;
- }
fsync_fname("pg_replslot", true);
continue;
}
--
2.25.1
On Thu, Jan 27, 2022 at 6:31 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
I spent some time thinking about the right way to proceed here, and I came
up with the attached patches. The first patch just adds error checking for
various lstat() calls in the replication code. If lstat() fails, then it
probably doesn't make sense to try to continue processing the file.The second patch changes some nearby calls to ereport() to ERROR. If these
failures are truly unexpected, and we don't intend to support use-cases
like concurrent manual deletion, then failing might be the right way to go.
I think it's a shame that such failures could cause checkpointing to
continually fail, but that topic is already being discussed elsewhere [0].[0] /messages/by-id/C1EE64B0-D4DB-40F3-98C8-0CED324D34CB@amazon.com
After an off-list discussion with Andreas, proposing here a patch that
basically replaces ReadDir call with ReadDirExtended and gets rid of
lstat entirely. With this chance, the checkpoint will only care about
the snapshot and mapping files and not fail if it finds other files in
the directories. Removing lstat enables us to make things faster as we
avoid a bunch of extra system calls - one lstat call per each mapping
or snapshot file.
This patch also includes, converting "could not parse filename" and
"could not remove file" errors to LOG messages in
CheckPointLogicalRewriteHeap. This will enable checkpoint not to waste
the amount of work that it has done in case it is unable to
parse/remove the file, for whatever reasons.
Regards,
Bharath Rupireddy.
Attachments:
v5-0001-Replace-ReadDir-with-ReadDirExtended.patchapplication/x-patch; name=v5-0001-Replace-ReadDir-with-ReadDirExtended.patchDownload
From b5358ad720caa4af5fe8635920ab5853afd5a327 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 31 Jan 2022 05:05:22 +0000
Subject: [PATCH v5] Replace ReadDir with ReadDirExtended
Replace ReadDir with ReadDirExtended and get rid of lstat entirely.
With this change, the checkpoint will only care about the snapshot
and mapping files and not fail if it finds other files in the
directories.
Removing lstat enables us to make things faster as we avoid extra
system calls.
Also, convert "could not parse filename" and "could not remove file"
errors to LOG messages in CheckPointLogicalRewriteHeap. This will
enable checkpoint not to waste the amount of work that it had done.
---
src/backend/access/heap/rewriteheap.c | 25 ++++++++++++++++-----
src/backend/replication/logical/snapbuild.c | 9 +-------
2 files changed, 20 insertions(+), 14 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826736..2cb72a2a5c 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -1211,9 +1211,8 @@ CheckPointLogicalRewriteHeap(void)
cutoff = redo;
mappings_dir = AllocateDir("pg_logical/mappings");
- while ((mapping_de = ReadDir(mappings_dir, "pg_logical/mappings")) != NULL)
+ while ((mapping_de = ReadDirExtended(mappings_dir, "pg_logical/mappings", LOG)) != NULL)
{
- struct stat statbuf;
Oid dboid;
Oid relid;
XLogRecPtr lsn;
@@ -1227,26 +1226,40 @@ CheckPointLogicalRewriteHeap(void)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
- continue;
/* Skip over files that cannot be ours. */
if (strncmp(mapping_de->d_name, "map-", 4) != 0)
continue;
+ /*
+ * We just log a message if a file doesn't fit the pattern, it's
+ * probably some editors lock/state file or similar...
+ */
if (sscanf(mapping_de->d_name, LOGICAL_REWRITE_FORMAT,
&dboid, &relid, &hi, &lo, &rewrite_xid, &create_xid) != 6)
- elog(ERROR, "could not parse filename \"%s\"", mapping_de->d_name);
+ {
+ elog(LOG, "could not parse filename \"%s\"", mapping_de->d_name);
+ continue;
+ }
lsn = ((uint64) hi) << 32 | lo;
if (lsn < cutoff || cutoff == InvalidXLogRecPtr)
{
elog(DEBUG1, "removing logical rewrite file \"%s\"", path);
+
+ /*
+ * It's not particularly harmful, though strange, if we can't
+ * remove the file here. Don't prevent the checkpoint from
+ * completing, that'd be a cure worse than the disease.
+ */
if (unlink(path) < 0)
- ereport(ERROR,
+ {
+ ereport(LOG,
(errcode_for_file_access(),
errmsg("could not remove file \"%s\": %m", path)));
+ continue;
+ }
}
else
{
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 83fca8a77d..a4bf7a7fd9 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1942,12 +1942,11 @@ CheckPointSnapBuild(void)
cutoff = redo;
snap_dir = AllocateDir("pg_logical/snapshots");
- while ((snap_de = ReadDir(snap_dir, "pg_logical/snapshots")) != NULL)
+ while ((snap_de = ReadDirExtended(snap_dir, "pg_logical/snapshots", LOG)) != NULL)
{
uint32 hi;
uint32 lo;
XLogRecPtr lsn;
- struct stat statbuf;
if (strcmp(snap_de->d_name, ".") == 0 ||
strcmp(snap_de->d_name, "..") == 0)
@@ -1955,12 +1954,6 @@ CheckPointSnapBuild(void)
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
- {
- elog(DEBUG1, "only regular files expected: %s", path);
- continue;
- }
-
/*
* temporary filenames from SnapBuildSerialize() include the LSN and
* everything but are postfixed by .$pid.tmp. We can just remove them
--
2.25.1
On Mon, Jan 31, 2022 at 10:42:54AM +0530, Bharath Rupireddy wrote:
After an off-list discussion with Andreas, proposing here a patch that
basically replaces ReadDir call with ReadDirExtended and gets rid of
lstat entirely. With this chance, the checkpoint will only care about
the snapshot and mapping files and not fail if it finds other files in
the directories. Removing lstat enables us to make things faster as we
avoid a bunch of extra system calls - one lstat call per each mapping
or snapshot file.
I think removing the lstat() is probably reasonable. We currently aren't
doing proper error checking, and the chances of a non-regular file matching
the prefix are likely pretty low. In the worst case, we'll LOG or ERROR
when unlinking or fsyncing fails.
However, I'm not sure about the change to ReadDirExtended(). That might be
okay for CheckPointSnapBuild(), which is just trying to remove old files,
but CheckPointLogicalRewriteHeap() is responsible for ensuring that files
are flushed to disk for the checkpoint. If we stop reading the directory
after an error and let the checkpoint continue, isn't it possible that some
mappings files won't be persisted to disk?
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Wed, Feb 2, 2022 at 5:25 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Jan 31, 2022 at 10:42:54AM +0530, Bharath Rupireddy wrote:
After an off-list discussion with Andreas, proposing here a patch that
basically replaces ReadDir call with ReadDirExtended and gets rid of
lstat entirely. With this chance, the checkpoint will only care about
the snapshot and mapping files and not fail if it finds other files in
the directories. Removing lstat enables us to make things faster as we
avoid a bunch of extra system calls - one lstat call per each mapping
or snapshot file.I think removing the lstat() is probably reasonable. We currently aren't
doing proper error checking, and the chances of a non-regular file matching
the prefix are likely pretty low. In the worst case, we'll LOG or ERROR
when unlinking or fsyncing fails.However, I'm not sure about the change to ReadDirExtended(). That might be
okay for CheckPointSnapBuild(), which is just trying to remove old files,
but CheckPointLogicalRewriteHeap() is responsible for ensuring that files
are flushed to disk for the checkpoint. If we stop reading the directory
after an error and let the checkpoint continue, isn't it possible that some
mappings files won't be persisted to disk?
Unless I mis-read your above statement, with LOG level in
ReadDirExtended, I don't think we stop reading the files in
CheckPointLogicalRewriteHeap. Am I missing something here?
Since, we also continue in CheckPointLogicalRewriteHeap if we can't
parse/delete some files with the change of "could not parse
filename"/"could not remove file" messages to LOG level
I'm attaching v6, just changed elog(LOG, to ereport(LOG in
CheckPointLogicalRewriteHeap, other things remain the same.
Regards,
Bharath Rupireddy.
Attachments:
v6-0001-Replace-ReadDir-with-ReadDirExtended.patchapplication/octet-stream; name=v6-0001-Replace-ReadDir-with-ReadDirExtended.patchDownload
From 0716e03efb41a7d99872889cafbed802e326e92a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 2 Feb 2022 11:47:10 +0000
Subject: [PATCH v6] Replace ReadDir with ReadDirExtended
Replace ReadDir with ReadDirExtended and get rid of lstat entirely.
With this change, the checkpoint will only care about the snapshot
and mapping files and not fail if it finds other files in the
directories.
Removing lstat enables us to make things faster as we avoid extra
system calls.
Also, convert "could not parse filename" and "could not remove file"
errors to LOG messages in CheckPointLogicalRewriteHeap. This will
enable checkpoint not to waste the amount of work that it had done.
---
src/backend/access/heap/rewriteheap.c | 29 ++++++++++++++++-----
src/backend/replication/logical/snapbuild.c | 9 +------
2 files changed, 23 insertions(+), 15 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826736..b028b98eaa 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -1211,9 +1211,8 @@ CheckPointLogicalRewriteHeap(void)
cutoff = redo;
mappings_dir = AllocateDir("pg_logical/mappings");
- while ((mapping_de = ReadDir(mappings_dir, "pg_logical/mappings")) != NULL)
+ while ((mapping_de = ReadDirExtended(mappings_dir, "pg_logical/mappings", LOG)) != NULL)
{
- struct stat statbuf;
Oid dboid;
Oid relid;
XLogRecPtr lsn;
@@ -1227,26 +1226,42 @@ CheckPointLogicalRewriteHeap(void)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
- continue;
/* Skip over files that cannot be ours. */
if (strncmp(mapping_de->d_name, "map-", 4) != 0)
continue;
+ /*
+ * We just log a message if a file doesn't fit the pattern, it's
+ * probably some editors lock/state file or similar...
+ */
if (sscanf(mapping_de->d_name, LOGICAL_REWRITE_FORMAT,
&dboid, &relid, &hi, &lo, &rewrite_xid, &create_xid) != 6)
- elog(ERROR, "could not parse filename \"%s\"", mapping_de->d_name);
+ {
+ ereport(LOG,
+ (errmsg("could not parse file name \"%s\"", path)));
+ continue;
+ }
lsn = ((uint64) hi) << 32 | lo;
if (lsn < cutoff || cutoff == InvalidXLogRecPtr)
{
elog(DEBUG1, "removing logical rewrite file \"%s\"", path);
+
+ /*
+ * It's not particularly harmful, though strange, if we can't
+ * remove the file here. Don't prevent the checkpoint from
+ * completing, that'd be a cure worse than the disease.
+ */
if (unlink(path) < 0)
- ereport(ERROR,
+ {
+ ereport(LOG,
(errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
+ errmsg("could not remove file \"%s\": %m",
+ path)));
+ continue;
+ }
}
else
{
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 83fca8a77d..a4bf7a7fd9 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1942,12 +1942,11 @@ CheckPointSnapBuild(void)
cutoff = redo;
snap_dir = AllocateDir("pg_logical/snapshots");
- while ((snap_de = ReadDir(snap_dir, "pg_logical/snapshots")) != NULL)
+ while ((snap_de = ReadDirExtended(snap_dir, "pg_logical/snapshots", LOG)) != NULL)
{
uint32 hi;
uint32 lo;
XLogRecPtr lsn;
- struct stat statbuf;
if (strcmp(snap_de->d_name, ".") == 0 ||
strcmp(snap_de->d_name, "..") == 0)
@@ -1955,12 +1954,6 @@ CheckPointSnapBuild(void)
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
- {
- elog(DEBUG1, "only regular files expected: %s", path);
- continue;
- }
-
/*
* temporary filenames from SnapBuildSerialize() include the LSN and
* everything but are postfixed by .$pid.tmp. We can just remove them
--
2.25.1
On Wed, Feb 02, 2022 at 05:19:26PM +0530, Bharath Rupireddy wrote:
On Wed, Feb 2, 2022 at 5:25 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
However, I'm not sure about the change to ReadDirExtended(). That might be
okay for CheckPointSnapBuild(), which is just trying to remove old files,
but CheckPointLogicalRewriteHeap() is responsible for ensuring that files
are flushed to disk for the checkpoint. If we stop reading the directory
after an error and let the checkpoint continue, isn't it possible that some
mappings files won't be persisted to disk?Unless I mis-read your above statement, with LOG level in
ReadDirExtended, I don't think we stop reading the files in
CheckPointLogicalRewriteHeap. Am I missing something here?
ReadDirExtended() has the following comment:
* If elevel < ERROR, returns NULL after any error. With the normal coding
* pattern, this will result in falling out of the loop immediately as
* though the directory contained no (more) entries.
If there is a problem reading the directory, we will LOG and then exit the
loop. If we didn't scan through all the entries in the directory, there is
a chance that we didn't fsync() all the files that need it.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Thu, Feb 3, 2022 at 12:07 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Wed, Feb 02, 2022 at 05:19:26PM +0530, Bharath Rupireddy wrote:
On Wed, Feb 2, 2022 at 5:25 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
However, I'm not sure about the change to ReadDirExtended(). That might be
okay for CheckPointSnapBuild(), which is just trying to remove old files,
but CheckPointLogicalRewriteHeap() is responsible for ensuring that files
are flushed to disk for the checkpoint. If we stop reading the directory
after an error and let the checkpoint continue, isn't it possible that some
mappings files won't be persisted to disk?Unless I mis-read your above statement, with LOG level in
ReadDirExtended, I don't think we stop reading the files in
CheckPointLogicalRewriteHeap. Am I missing something here?ReadDirExtended() has the following comment:
* If elevel < ERROR, returns NULL after any error. With the normal coding
* pattern, this will result in falling out of the loop immediately as
* though the directory contained no (more) entries.If there is a problem reading the directory, we will LOG and then exit the
loop. If we didn't scan through all the entries in the directory, there is
a chance that we didn't fsync() all the files that need it.
Thanks. I get it. For syncing map files, we don't want to tolerate any
errors, whereas removal of the old map files (lesser than cutoff LSN)
can be tolerated in CheckPointLogicalRewriteHeap.
Here's the v7 version using ReadDir for CheckPointLogicalRewriteHeap.
Regards,
Bharath Rupireddy.
Attachments:
v7-0001-Replace-ReadDir-with-ReadDirExtended.patchapplication/octet-stream; name=v7-0001-Replace-ReadDir-with-ReadDirExtended.patchDownload
From aef9d67db637d9e9bd7e9b7c2381e952a891e141 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 3 Feb 2022 04:13:00 +0000
Subject: [PATCH v7] Replace ReadDir with ReadDirExtended
Replace ReadDir with ReadDirExtended (in CheckPointSnapBuild) and
get rid of lstat entirely. We still use ReadDir in CheckPointLogicalRewriteHeap
because unable to read directory would result a NULL from
ReadDirExtended and we may miss to fsync the remaining map files,
so here let's error out with ReadDir.
With this change, the checkpoint will only care about the snapshot
and mapping files and not fail if it finds other files in the
directories.
Removing lstat enables us to make things faster as we avoid extra
system calls.
Also, convert "could not parse filename" and "could not remove file"
errors to LOG messages in CheckPointLogicalRewriteHeap. This will
enable checkpoint not to waste the amount of work that it had done.
---
src/backend/access/heap/rewriteheap.c | 27 ++++++++++++++++-----
src/backend/replication/logical/snapbuild.c | 9 +------
2 files changed, 22 insertions(+), 14 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826736..24e7a35881 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -1213,7 +1213,6 @@ CheckPointLogicalRewriteHeap(void)
mappings_dir = AllocateDir("pg_logical/mappings");
while ((mapping_de = ReadDir(mappings_dir, "pg_logical/mappings")) != NULL)
{
- struct stat statbuf;
Oid dboid;
Oid relid;
XLogRecPtr lsn;
@@ -1227,26 +1226,42 @@ CheckPointLogicalRewriteHeap(void)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
- continue;
/* Skip over files that cannot be ours. */
if (strncmp(mapping_de->d_name, "map-", 4) != 0)
continue;
+ /*
+ * We just log a message if a file doesn't fit the pattern, it's
+ * probably some editors lock/state file or similar...
+ */
if (sscanf(mapping_de->d_name, LOGICAL_REWRITE_FORMAT,
&dboid, &relid, &hi, &lo, &rewrite_xid, &create_xid) != 6)
- elog(ERROR, "could not parse filename \"%s\"", mapping_de->d_name);
+ {
+ ereport(LOG,
+ (errmsg("could not parse file name \"%s\"", path)));
+ continue;
+ }
lsn = ((uint64) hi) << 32 | lo;
if (lsn < cutoff || cutoff == InvalidXLogRecPtr)
{
elog(DEBUG1, "removing logical rewrite file \"%s\"", path);
+
+ /*
+ * It's not particularly harmful, though strange, if we can't
+ * remove the file here. Don't prevent the checkpoint from
+ * completing, that'd be a cure worse than the disease.
+ */
if (unlink(path) < 0)
- ereport(ERROR,
+ {
+ ereport(LOG,
(errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
+ errmsg("could not remove file \"%s\": %m",
+ path)));
+ continue;
+ }
}
else
{
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 83fca8a77d..a4bf7a7fd9 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1942,12 +1942,11 @@ CheckPointSnapBuild(void)
cutoff = redo;
snap_dir = AllocateDir("pg_logical/snapshots");
- while ((snap_de = ReadDir(snap_dir, "pg_logical/snapshots")) != NULL)
+ while ((snap_de = ReadDirExtended(snap_dir, "pg_logical/snapshots", LOG)) != NULL)
{
uint32 hi;
uint32 lo;
XLogRecPtr lsn;
- struct stat statbuf;
if (strcmp(snap_de->d_name, ".") == 0 ||
strcmp(snap_de->d_name, "..") == 0)
@@ -1955,12 +1954,6 @@ CheckPointSnapBuild(void)
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
- {
- elog(DEBUG1, "only regular files expected: %s", path);
- continue;
- }
-
/*
* temporary filenames from SnapBuildSerialize() include the LSN and
* everything but are postfixed by .$pid.tmp. We can just remove them
--
2.25.1
On Thu, Feb 03, 2022 at 09:45:08AM +0530, Bharath Rupireddy wrote:
On Thu, Feb 3, 2022 at 12:07 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
If there is a problem reading the directory, we will LOG and then exit the
loop. If we didn't scan through all the entries in the directory, there is
a chance that we didn't fsync() all the files that need it.Thanks. I get it. For syncing map files, we don't want to tolerate any
errors, whereas removal of the old map files (lesser than cutoff LSN)
can be tolerated in CheckPointLogicalRewriteHeap.
LGTM. Andres noted upthread [0]/messages/by-id/20220120194618.hmfd4kxkng2cgryh@alap3.anarazel.de that the comment above sscanf() about
skipping editors' lock files might not be accurate. I don't think it's a
huge problem if sscanf() matches those files, but perhaps we can improve
the comment.
[0]: /messages/by-id/20220120194618.hmfd4kxkng2cgryh@alap3.anarazel.de
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Fri, Feb 4, 2022 at 5:33 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
Thanks. I get it. For syncing map files, we don't want to tolerate any
errors, whereas removal of the old map files (lesser than cutoff LSN)
can be tolerated in CheckPointLogicalRewriteHeap.LGTM. Andres noted upthread [0] that the comment above sscanf() about
skipping editors' lock files might not be accurate. I don't think it's a
huge problem if sscanf() matches those files, but perhaps we can improve
the comment.[0] /messages/by-id/20220120194618.hmfd4kxkng2cgryh@alap3.anarazel.de
Andres comment from [0]:
An editor's lock file that starts with map- would presumably be the whole
filename plus an additional file-ending. But this check won't catch those.
Agreed. sscanf checks can't detect the files named "whole filename
plus an additional file-ending". I just checked with vi editor lock
state file .0-14ED3B8.snap.swp [1]-rw------- 1 bharath bharath 12288 Feb 10 15:48 .0-14ED3B8.snap.swp -rw------- 1 bharath bharath 128 Feb 10 15:48 0-14ED518.snap -rw------- 1 bharath bharath 128 Feb 10 15:49 0-14ED518.snap.lockfile -rw------- 1 bharath bharath 128 Feb 10 15:49 0-14ED550.snap -rw------- 1 bharath bharath 128 Feb 10 15:49 0-14ED600.snap, the log generated is [2]2022-02-10 15:48:47.938 UTC [1121678] LOG: could not parse file name "pg_logical/snapshots/.0-14ED3B8.snap.swp". I'm not
sure exactly which editor would create a lockfile like "whole filename
plus an additional file-ending".
In any case, let's remove the editor's lock/state file from those
comments and have just only "We just log a message if a file doesn't
fit the pattern". Attached v8 patch with that change.
[1]: -rw------- 1 bharath bharath 12288 Feb 10 15:48 .0-14ED3B8.snap.swp -rw------- 1 bharath bharath 128 Feb 10 15:48 0-14ED518.snap -rw------- 1 bharath bharath 128 Feb 10 15:49 0-14ED518.snap.lockfile -rw------- 1 bharath bharath 128 Feb 10 15:49 0-14ED550.snap -rw------- 1 bharath bharath 128 Feb 10 15:49 0-14ED600.snap
-rw------- 1 bharath bharath 12288 Feb 10 15:48 .0-14ED3B8.snap.swp
-rw------- 1 bharath bharath 128 Feb 10 15:48 0-14ED518.snap
-rw------- 1 bharath bharath 128 Feb 10 15:49 0-14ED518.snap.lockfile
-rw------- 1 bharath bharath 128 Feb 10 15:49 0-14ED550.snap
-rw------- 1 bharath bharath 128 Feb 10 15:49 0-14ED600.snap
[2]: 2022-02-10 15:48:47.938 UTC [1121678] LOG: could not parse file name "pg_logical/snapshots/.0-14ED3B8.snap.swp"
2022-02-10 15:48:47.938 UTC [1121678] LOG: could not parse file name
"pg_logical/snapshots/.0-14ED3B8.snap.swp"
Regards,
Bharath Rupireddy.
Attachments:
v8-0001-Replace-ReadDir-with-ReadDirExtended.patchapplication/octet-stream; name=v8-0001-Replace-ReadDir-with-ReadDirExtended.patchDownload
From 4801ff2c3b1e7bc7076205b676d4e3bc4a4ed308 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 10 Feb 2022 15:58:58 +0000
Subject: [PATCH v8] Replace ReadDir with ReadDirExtended
Replace ReadDir with ReadDirExtended (in CheckPointSnapBuild) and
get rid of lstat entirely. We still use ReadDir in CheckPointLogicalRewriteHeap
because unable to read directory would result a NULL from
ReadDirExtended and we may miss to fsync the remaining map files,
so here let's error out with ReadDir.
With this change, the checkpoint will only care about the snapshot
and mapping files and not fail if it finds other files in the
directories.
Removing lstat enables us to make things faster as we avoid extra
system calls.
Also, convert "could not parse filename" and "could not remove file"
errors to LOG messages in CheckPointLogicalRewriteHeap. This will
enable checkpoint not to waste the amount of work that it had done.
---
src/backend/access/heap/rewriteheap.c | 24 +++++++++++++++------
src/backend/replication/logical/snapbuild.c | 12 ++---------
2 files changed, 20 insertions(+), 16 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826736..035cd6db70 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -1213,7 +1213,6 @@ CheckPointLogicalRewriteHeap(void)
mappings_dir = AllocateDir("pg_logical/mappings");
while ((mapping_de = ReadDir(mappings_dir, "pg_logical/mappings")) != NULL)
{
- struct stat statbuf;
Oid dboid;
Oid relid;
XLogRecPtr lsn;
@@ -1227,26 +1226,39 @@ CheckPointLogicalRewriteHeap(void)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
- continue;
/* Skip over files that cannot be ours. */
if (strncmp(mapping_de->d_name, "map-", 4) != 0)
continue;
+ /* We just log a message if a file doesn't fit the pattern. */
if (sscanf(mapping_de->d_name, LOGICAL_REWRITE_FORMAT,
&dboid, &relid, &hi, &lo, &rewrite_xid, &create_xid) != 6)
- elog(ERROR, "could not parse filename \"%s\"", mapping_de->d_name);
+ {
+ ereport(LOG,
+ (errmsg("could not parse file name \"%s\"", path)));
+ continue;
+ }
lsn = ((uint64) hi) << 32 | lo;
if (lsn < cutoff || cutoff == InvalidXLogRecPtr)
{
elog(DEBUG1, "removing logical rewrite file \"%s\"", path);
+
+ /*
+ * It's not particularly harmful, though strange, if we can't
+ * remove the file here. Don't prevent the checkpoint from
+ * completing, that'd be a cure worse than the disease.
+ */
if (unlink(path) < 0)
- ereport(ERROR,
+ {
+ ereport(LOG,
(errcode_for_file_access(),
- errmsg("could not remove file \"%s\": %m", path)));
+ errmsg("could not remove file \"%s\": %m",
+ path)));
+ continue;
+ }
}
else
{
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 83fca8a77d..e2cdf17bee 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1942,12 +1942,11 @@ CheckPointSnapBuild(void)
cutoff = redo;
snap_dir = AllocateDir("pg_logical/snapshots");
- while ((snap_de = ReadDir(snap_dir, "pg_logical/snapshots")) != NULL)
+ while ((snap_de = ReadDirExtended(snap_dir, "pg_logical/snapshots", LOG)) != NULL)
{
uint32 hi;
uint32 lo;
XLogRecPtr lsn;
- struct stat statbuf;
if (strcmp(snap_de->d_name, ".") == 0 ||
strcmp(snap_de->d_name, "..") == 0)
@@ -1955,20 +1954,13 @@ CheckPointSnapBuild(void)
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
- {
- elog(DEBUG1, "only regular files expected: %s", path);
- continue;
- }
-
/*
* temporary filenames from SnapBuildSerialize() include the LSN and
* everything but are postfixed by .$pid.tmp. We can just remove them
* the same as other files because there can be none that are
* currently being written that are older than cutoff.
*
- * We just log a message if a file doesn't fit the pattern, it's
- * probably some editors lock/state file or similar...
+ * We just log a message if a file doesn't fit the pattern.
*/
if (sscanf(snap_de->d_name, "%X-%X.snap", &hi, &lo) != 2)
{
--
2.25.1
On Thu, Feb 10, 2022 at 09:30:45PM +0530, Bharath Rupireddy wrote:
In any case, let's remove the editor's lock/state file from those
comments and have just only "We just log a message if a file doesn't
fit the pattern". Attached v8 patch with that change.
I've moved this one to ready-for-committer. I was under the impression
that Andres was firmly against this approach, but you did mention there was
an off-list discussion.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Hi,
On 2022-02-10 21:30:45 +0530, Bharath Rupireddy wrote:
From 4801ff2c3b1e7bc7076205b676d4e3bc4a4ed308 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 10 Feb 2022 15:58:58 +0000
Subject: [PATCH v8] Replace ReadDir with ReadDirExtendedReplace ReadDir with ReadDirExtended (in CheckPointSnapBuild) and
get rid of lstat entirely.
I think this might be based on a slight misunderstanding / bad phrasing on my
part. We can use get_dirent_type() to optimize away the lstat on most
platforms, ReadDirExtended itself doesn't do that automatically. I was trying
to reference removing lstat calls by using get_dirent_type() in more places...
We still use ReadDir in CheckPointLogicalRewriteHeap
because unable to read directory would result a NULL from
ReadDirExtended and we may miss to fsync the remaining map files,
so here let's error out with ReadDir.
Then why is this skipping the lstat?
Also, convert "could not parse filename" and "could not remove file"
errors to LOG messages in CheckPointLogicalRewriteHeap. This will
enable checkpoint not to waste the amount of work that it had done.
I still doubt this is a good idea.
Greetings,
Andres Freund
On Tue, Feb 15, 2022 at 09:09:52AM -0800, Andres Freund wrote:
On 2022-02-10 21:30:45 +0530, Bharath Rupireddy wrote:
Replace ReadDir with ReadDirExtended (in CheckPointSnapBuild) and
get rid of lstat entirely.I think this might be based on a slight misunderstanding / bad phrasing on my
part. We can use get_dirent_type() to optimize away the lstat on most
platforms, ReadDirExtended itself doesn't do that automatically. I was trying
to reference removing lstat calls by using get_dirent_type() in more places...We still use ReadDir in CheckPointLogicalRewriteHeap
because unable to read directory would result a NULL from
ReadDirExtended and we may miss to fsync the remaining map files,
so here let's error out with ReadDir.Then why is this skipping the lstat?
Also, convert "could not parse filename" and "could not remove file"
errors to LOG messages in CheckPointLogicalRewriteHeap. This will
enable checkpoint not to waste the amount of work that it had done.I still doubt this is a good idea.
IIUC you are advocating for something more like the attached patches.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v9-0001-make-use-of-get_dirent_type-in-replication-code.patchtext/x-diff; charset=us-asciiDownload
From 8dddbf0241258e7b52ab664e077193d5008ee6cd Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 15 Feb 2022 09:40:53 -0800
Subject: [PATCH v9 1/3] make use of get_dirent_type() in replication code
---
src/backend/access/heap/rewriteheap.c | 4 ++--
src/backend/replication/logical/snapbuild.c | 5 ++---
src/backend/replication/slot.c | 4 ++--
3 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826736..d64d7aae2e 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -113,6 +113,7 @@
#include "access/xact.h"
#include "access/xloginsert.h"
#include "catalog/catalog.h"
+#include "common/file_utils.h"
#include "lib/ilist.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1213,7 +1214,6 @@ CheckPointLogicalRewriteHeap(void)
mappings_dir = AllocateDir("pg_logical/mappings");
while ((mapping_de = ReadDir(mappings_dir, "pg_logical/mappings")) != NULL)
{
- struct stat statbuf;
Oid dboid;
Oid relid;
XLogRecPtr lsn;
@@ -1227,7 +1227,7 @@ CheckPointLogicalRewriteHeap(void)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, mapping_de, false, ERROR) != PGFILETYPE_REG)
continue;
/* Skip over files that cannot be ours. */
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 83fca8a77d..df50abfd98 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -123,6 +123,7 @@
#include "access/heapam_xlog.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "replication/logical.h"
@@ -1947,15 +1948,13 @@ CheckPointSnapBuild(void)
uint32 hi;
uint32 lo;
XLogRecPtr lsn;
- struct stat statbuf;
if (strcmp(snap_de->d_name, ".") == 0 ||
strcmp(snap_de->d_name, "..") == 0)
continue;
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
-
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, snap_de, false, ERROR) != PGFILETYPE_REG)
{
elog(DEBUG1, "only regular files expected: %s", path);
continue;
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 5da5fa825a..480a26bc7f 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -41,6 +41,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
+#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1428,7 +1429,6 @@ StartupReplicationSlots(void)
replication_dir = AllocateDir("pg_replslot");
while ((replication_de = ReadDir(replication_dir, "pg_replslot")) != NULL)
{
- struct stat statbuf;
char path[MAXPGPATH + 12];
if (strcmp(replication_de->d_name, ".") == 0 ||
@@ -1438,7 +1438,7 @@ StartupReplicationSlots(void)
snprintf(path, sizeof(path), "pg_replslot/%s", replication_de->d_name);
/* we're only creating directories here, skip if it's not our's */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(path, replication_de, false, ERROR) != PGFILETYPE_DIR)
continue;
/* we crashed while a slot was being setup or deleted, clean up */
--
2.25.1
v9-0002-add-error-checking-for-call-to-lstat-in-replicati.patchtext/x-diff; charset=us-asciiDownload
From 1cb3fdc3e7ed323691ec37b76b95055ad9af9757 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 15 Feb 2022 09:46:44 -0800
Subject: [PATCH v9 2/3] add error checking for call to lstat() in replication
code
---
src/backend/replication/logical/reorderbuffer.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index c2d9be81fa..9b9e15858e 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4812,7 +4812,11 @@ ReorderBufferCleanupSerializedTXNs(const char *slotname)
sprintf(path, "pg_replslot/%s", slotname);
/* we're only handling directories here, skip if it's not ours */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ if (lstat(path, &statbuf) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not stat file \"%s\": %m", path)));
+ else if (!S_ISDIR(statbuf.st_mode))
return;
spill_dir = AllocateDir(path);
--
2.25.1
v9-0003-minor-improvements-to-replication-code.patchtext/x-diff; charset=us-asciiDownload
From 7d2bed2e783aab2506bae27d1e4df90cc1b65dd4 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 15 Feb 2022 09:53:58 -0800
Subject: [PATCH v9 3/3] minor improvements to replication code
---
src/backend/replication/logical/snapbuild.c | 18 ++----------------
1 file changed, 2 insertions(+), 16 deletions(-)
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index df50abfd98..6a60e942b7 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1965,16 +1965,10 @@ CheckPointSnapBuild(void)
* everything but are postfixed by .$pid.tmp. We can just remove them
* the same as other files because there can be none that are
* currently being written that are older than cutoff.
- *
- * We just log a message if a file doesn't fit the pattern, it's
- * probably some editors lock/state file or similar...
*/
if (sscanf(snap_de->d_name, "%X-%X.snap", &hi, &lo) != 2)
- {
- ereport(LOG,
+ ereport(ERROR,
(errmsg("could not parse file name \"%s\"", path)));
- continue;
- }
lsn = ((uint64) hi) << 32 | lo;
@@ -1983,19 +1977,11 @@ CheckPointSnapBuild(void)
{
elog(DEBUG1, "removing snapbuild snapshot %s", path);
- /*
- * It's not particularly harmful, though strange, if we can't
- * remove the file here. Don't prevent the checkpoint from
- * completing, that'd be a cure worse than the disease.
- */
if (unlink(path) < 0)
- {
- ereport(LOG,
+ ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not remove file \"%s\": %m",
path)));
- continue;
- }
}
}
FreeDir(snap_dir);
--
2.25.1
Hi,
On 2022-02-15 09:57:53 -0800, Nathan Bossart wrote:
IIUC you are advocating for something more like the attached patches.
At least for patch 1 yes. Don't have the cycles just now to look at the
others.
I generally think it'd be a good exercise to go through an use
get_dirent_type() in nearly all ReadDir() style loops - it's a nice efficiency
win in general, and IIRC a particularly big one on windows.
It'd probably be good to add a reference to get_dirent_type() to
ReadDir[Extended]()'s docs.
Greetings,
Andres Freund
On Tue, Feb 15, 2022 at 10:10:34AM -0800, Andres Freund wrote:
I generally think it'd be a good exercise to go through an use
get_dirent_type() in nearly all ReadDir() style loops - it's a nice efficiency
win in general, and IIRC a particularly big one on windows.It'd probably be good to add a reference to get_dirent_type() to
ReadDir[Extended]()'s docs.
Agreed. I might give this a try.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Tue, Feb 15, 2022 at 10:37:58AM -0800, Nathan Bossart wrote:
On Tue, Feb 15, 2022 at 10:10:34AM -0800, Andres Freund wrote:
I generally think it'd be a good exercise to go through an use
get_dirent_type() in nearly all ReadDir() style loops - it's a nice efficiency
win in general, and IIRC a particularly big one on windows.It'd probably be good to add a reference to get_dirent_type() to
ReadDir[Extended]()'s docs.Agreed. I might give this a try.
Alright, here is a new patch set where I've tried to replace as many
stat()/lstat() calls as possible with get_dirent_type(). 0002 and 0003 are
the same as v9. I noticed a few remaining stat()/lstat() calls that don't
appear to be doing proper error checking, but I haven't had a chance to try
fixing those yet.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v10-0001-make-more-use-of-get_dirent_type.patchtext/x-diff; charset=us-asciiDownload
From 23db5cb9bf04519b9341b6195ca224e0f8bc03cb Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 15 Feb 2022 09:40:53 -0800
Subject: [PATCH v10 1/3] make more use of get_dirent_type()
---
src/backend/access/heap/rewriteheap.c | 4 +-
src/backend/replication/logical/snapbuild.c | 5 +--
src/backend/replication/slot.c | 4 +-
src/backend/storage/file/copydir.c | 21 +++--------
src/backend/storage/file/fd.c | 20 +++++-----
src/backend/utils/misc/guc-file.l | 42 ++++++++-------------
src/timezone/pgtz.c | 8 +---
7 files changed, 41 insertions(+), 63 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826736..d64d7aae2e 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -113,6 +113,7 @@
#include "access/xact.h"
#include "access/xloginsert.h"
#include "catalog/catalog.h"
+#include "common/file_utils.h"
#include "lib/ilist.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1213,7 +1214,6 @@ CheckPointLogicalRewriteHeap(void)
mappings_dir = AllocateDir("pg_logical/mappings");
while ((mapping_de = ReadDir(mappings_dir, "pg_logical/mappings")) != NULL)
{
- struct stat statbuf;
Oid dboid;
Oid relid;
XLogRecPtr lsn;
@@ -1227,7 +1227,7 @@ CheckPointLogicalRewriteHeap(void)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, mapping_de, false, ERROR) != PGFILETYPE_REG)
continue;
/* Skip over files that cannot be ours. */
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 83fca8a77d..df50abfd98 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -123,6 +123,7 @@
#include "access/heapam_xlog.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "replication/logical.h"
@@ -1947,15 +1948,13 @@ CheckPointSnapBuild(void)
uint32 hi;
uint32 lo;
XLogRecPtr lsn;
- struct stat statbuf;
if (strcmp(snap_de->d_name, ".") == 0 ||
strcmp(snap_de->d_name, "..") == 0)
continue;
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
-
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, snap_de, false, ERROR) != PGFILETYPE_REG)
{
elog(DEBUG1, "only regular files expected: %s", path);
continue;
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 5da5fa825a..480a26bc7f 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -41,6 +41,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
+#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1428,7 +1429,6 @@ StartupReplicationSlots(void)
replication_dir = AllocateDir("pg_replslot");
while ((replication_de = ReadDir(replication_dir, "pg_replslot")) != NULL)
{
- struct stat statbuf;
char path[MAXPGPATH + 12];
if (strcmp(replication_de->d_name, ".") == 0 ||
@@ -1438,7 +1438,7 @@ StartupReplicationSlots(void)
snprintf(path, sizeof(path), "pg_replslot/%s", replication_de->d_name);
/* we're only creating directories here, skip if it's not our's */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(path, replication_de, false, ERROR) != PGFILETYPE_DIR)
continue;
/* we crashed while a slot was being setup or deleted, clean up */
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 658fd95ba9..8a866191e1 100644
--- a/src/backend/storage/file/copydir.c
+++ b/src/backend/storage/file/copydir.c
@@ -22,6 +22,7 @@
#include <unistd.h>
#include <sys/stat.h>
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "storage/copydir.h"
@@ -50,7 +51,7 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, fromdir)) != NULL)
{
- struct stat fst;
+ PGFileType xlde_type;
/* If we got a cancel signal during the copy of the directory, quit */
CHECK_FOR_INTERRUPTS();
@@ -62,18 +63,15 @@ copydir(char *fromdir, char *todir, bool recurse)
snprintf(fromfile, sizeof(fromfile), "%s/%s", fromdir, xlde->d_name);
snprintf(tofile, sizeof(tofile), "%s/%s", todir, xlde->d_name);
- if (lstat(fromfile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", fromfile)));
+ xlde_type = get_dirent_type(fromfile, xlde, false, ERROR);
- if (S_ISDIR(fst.st_mode))
+ if (xlde_type == PGFILETYPE_DIR)
{
/* recurse to handle subdirectories */
if (recurse)
copydir(fromfile, tofile, true);
}
- else if (S_ISREG(fst.st_mode))
+ else if (xlde_type == PGFILETYPE_REG)
copy_file(fromfile, tofile);
}
FreeDir(xldir);
@@ -89,8 +87,6 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, todir)) != NULL)
{
- struct stat fst;
-
if (strcmp(xlde->d_name, ".") == 0 ||
strcmp(xlde->d_name, "..") == 0)
continue;
@@ -101,12 +97,7 @@ copydir(char *fromdir, char *todir, bool recurse)
* We don't need to sync subdirectories here since the recursive
* copydir will do it before it returns
*/
- if (lstat(tofile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", tofile)));
-
- if (S_ISREG(fst.st_mode))
+ if (get_dirent_type(tofile, xlde, false, ERROR) == PGFILETYPE_REG)
fsync_fname(tofile, false);
}
FreeDir(xldir);
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 14b77f2861..4a4a79b40a 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -2783,6 +2783,10 @@ TryAgain:
*
* The pathname passed to AllocateDir must be passed to this routine too,
* but it is only used for error reporting.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDir(DIR *dir, const char *dirname)
@@ -2798,6 +2802,10 @@ ReadDir(DIR *dir, const char *dirname)
* If elevel < ERROR, returns NULL after any error. With the normal coding
* pattern, this will result in falling out of the loop immediately as
* though the directory contained no (more) entries.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDirExtended(DIR *dir, const char *dirname, int elevel)
@@ -3234,17 +3242,11 @@ RemovePgTempFilesInDir(const char *tmpdirname, bool missing_ok, bool unlink_all)
PG_TEMP_FILE_PREFIX,
strlen(PG_TEMP_FILE_PREFIX)) == 0)
{
- struct stat statbuf;
+ PGFileType type = get_dirent_type(rm_path, temp_de, false, LOG);
- if (lstat(rm_path, &statbuf) < 0)
- {
- ereport(LOG,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", rm_path)));
+ if (type == PGFILETYPE_ERROR)
continue;
- }
-
- if (S_ISDIR(statbuf.st_mode))
+ else if (type == PGFILETYPE_DIR)
{
/* recursively remove contents, then directory itself */
RemovePgTempFilesInDir(rm_path, false, true);
diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l
index c70543fa74..8f10252fa4 100644
--- a/src/backend/utils/misc/guc-file.l
+++ b/src/backend/utils/misc/guc-file.l
@@ -14,6 +14,7 @@
#include <ctype.h>
#include <unistd.h>
+#include "common/file_utils.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "storage/fd.h"
@@ -1018,7 +1019,7 @@ ParseConfigDirectory(const char *includedir,
while ((de = ReadDir(d, directory)) != NULL)
{
- struct stat st;
+ PGFileType de_type;
char filename[MAXPGPATH];
/*
@@ -1035,32 +1036,9 @@ ParseConfigDirectory(const char *includedir,
join_path_components(filename, directory, de->d_name);
canonicalize_path(filename);
- if (stat(filename, &st) == 0)
+ de_type = get_dirent_type(filename, de, true, elevel);
+ if (de_type == PGFILETYPE_ERROR)
{
- if (!S_ISDIR(st.st_mode))
- {
- /* Add file to array, increasing its size in blocks of 32 */
- if (num_filenames >= size_filenames)
- {
- size_filenames += 32;
- filenames = (char **) repalloc(filenames,
- size_filenames * sizeof(char *));
- }
- filenames[num_filenames] = pstrdup(filename);
- num_filenames++;
- }
- }
- else
- {
- /*
- * stat does not care about permissions, so the most likely reason
- * a file can't be accessed now is if it was removed between the
- * directory listing and now.
- */
- ereport(elevel,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m",
- filename)));
record_config_file_error(psprintf("could not stat file \"%s\"",
filename),
calling_file, calling_lineno,
@@ -1068,6 +1046,18 @@ ParseConfigDirectory(const char *includedir,
status = false;
goto cleanup;
}
+ else if (de_type != PGFILETYPE_DIR)
+ {
+ /* Add file to array, increasing its size in blocks of 32 */
+ if (num_filenames >= size_filenames)
+ {
+ size_filenames += 32;
+ filenames = (char **) repalloc(filenames,
+ size_filenames * sizeof(char *));
+ }
+ filenames[num_filenames] = pstrdup(filename);
+ num_filenames++;
+ }
}
if (num_filenames > 0)
diff --git a/src/timezone/pgtz.c b/src/timezone/pgtz.c
index 3c278dd0f3..72f9e3d5e6 100644
--- a/src/timezone/pgtz.c
+++ b/src/timezone/pgtz.c
@@ -17,6 +17,7 @@
#include <sys/stat.h>
#include <time.h>
+#include "common/file_utils.h"
#include "datatype/timestamp.h"
#include "miscadmin.h"
#include "pgtz.h"
@@ -429,7 +430,6 @@ pg_tzenumerate_next(pg_tzenum *dir)
{
struct dirent *direntry;
char fullname[MAXPGPATH * 2];
- struct stat statbuf;
direntry = ReadDir(dir->dirdesc[dir->depth], dir->dirname[dir->depth]);
@@ -447,12 +447,8 @@ pg_tzenumerate_next(pg_tzenum *dir)
snprintf(fullname, sizeof(fullname), "%s/%s",
dir->dirname[dir->depth], direntry->d_name);
- if (stat(fullname, &statbuf) != 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat \"%s\": %m", fullname)));
- if (S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(fullname, direntry, true, ERROR) == PGFILETYPE_DIR)
{
/* Step into the subdirectory */
if (dir->depth >= MAX_TZDIR_DEPTH - 1)
--
2.25.1
v10-0002-add-error-checking-for-call-to-lstat-in-replicat.patchtext/x-diff; charset=us-asciiDownload
From 70c3ab552e729aeb84b006d0fdfa0df0e524b277 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 15 Feb 2022 09:46:44 -0800
Subject: [PATCH v10 2/3] add error checking for call to lstat() in replication
code
---
src/backend/replication/logical/reorderbuffer.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index c2d9be81fa..9b9e15858e 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4812,7 +4812,11 @@ ReorderBufferCleanupSerializedTXNs(const char *slotname)
sprintf(path, "pg_replslot/%s", slotname);
/* we're only handling directories here, skip if it's not ours */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ if (lstat(path, &statbuf) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not stat file \"%s\": %m", path)));
+ else if (!S_ISDIR(statbuf.st_mode))
return;
spill_dir = AllocateDir(path);
--
2.25.1
v10-0003-minor-improvements-to-replication-code.patchtext/x-diff; charset=us-asciiDownload
From d71585d0baca6c08be499fa7b6d42517e952d488 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 15 Feb 2022 09:53:58 -0800
Subject: [PATCH v10 3/3] minor improvements to replication code
---
src/backend/replication/logical/snapbuild.c | 18 ++----------------
1 file changed, 2 insertions(+), 16 deletions(-)
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index df50abfd98..6a60e942b7 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1965,16 +1965,10 @@ CheckPointSnapBuild(void)
* everything but are postfixed by .$pid.tmp. We can just remove them
* the same as other files because there can be none that are
* currently being written that are older than cutoff.
- *
- * We just log a message if a file doesn't fit the pattern, it's
- * probably some editors lock/state file or similar...
*/
if (sscanf(snap_de->d_name, "%X-%X.snap", &hi, &lo) != 2)
- {
- ereport(LOG,
+ ereport(ERROR,
(errmsg("could not parse file name \"%s\"", path)));
- continue;
- }
lsn = ((uint64) hi) << 32 | lo;
@@ -1983,19 +1977,11 @@ CheckPointSnapBuild(void)
{
elog(DEBUG1, "removing snapbuild snapshot %s", path);
- /*
- * It's not particularly harmful, though strange, if we can't
- * remove the file here. Don't prevent the checkpoint from
- * completing, that'd be a cure worse than the disease.
- */
if (unlink(path) < 0)
- {
- ereport(LOG,
+ ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not remove file \"%s\": %m",
path)));
- continue;
- }
}
}
FreeDir(snap_dir);
--
2.25.1
On Wed, Feb 16, 2022 at 12:11 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:
On Tue, Feb 15, 2022 at 10:37:58AM -0800, Nathan Bossart wrote:
On Tue, Feb 15, 2022 at 10:10:34AM -0800, Andres Freund wrote:
I generally think it'd be a good exercise to go through an use
get_dirent_type() in nearly all ReadDir() style loops - it's a nice efficiency
win in general, and IIRC a particularly big one on windows.It'd probably be good to add a reference to get_dirent_type() to
ReadDir[Extended]()'s docs.Agreed. I might give this a try.
Alright, here is a new patch set where I've tried to replace as many
stat()/lstat() calls as possible with get_dirent_type(). 0002 and 0003 are
the same as v9. I noticed a few remaining stat()/lstat() calls that don't
appear to be doing proper error checking, but I haven't had a chance to try
fixing those yet.
0001: These get_dirent_type() conversions are nice. LGTM.
0002:
/* we're only handling directories here, skip if it's not ours */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ if (lstat(path, &statbuf) != 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not stat file \"%s\": %m", path)));
+ else if (!S_ISDIR(statbuf.st_mode))
return;
Why is this a good place to silently ignore non-directories?
StartupReorderBuffer() is already in charge of skipping random
detritus found in the directory, so would it be better to do "if
(get_dirent_type(...) != PGFILETYPE_DIR) continue" there, and then
drop the lstat() stanza from ReorderBufferCleanupSeralizedTXNs()
completely? Then perhaps its ReadDirExtended() shoud be using ERROR
instead of INFO, so that missing/non-dir/b0rked directories raise an
error. I don't understand why it's reporting readdir() errors at INFO
but unlink() errors at ERROR, and as far as I can see the other paths
that reach this code shouldn't be sending in paths to non-directories
here unless something is seriously busted and that's ERROR-worthy.
0003: I haven't studied this cure vs disease thing... no opinion today.
Thanks for taking a look!
On Thu, Mar 24, 2022 at 01:17:01PM +1300, Thomas Munro wrote:
/* we're only handling directories here, skip if it's not ours */ - if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode)) + if (lstat(path, &statbuf) != 0) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not stat file \"%s\": %m", path))); + else if (!S_ISDIR(statbuf.st_mode)) return;Why is this a good place to silently ignore non-directories?
StartupReorderBuffer() is already in charge of skipping random
detritus found in the directory, so would it be better to do "if
(get_dirent_type(...) != PGFILETYPE_DIR) continue" there, and then
drop the lstat() stanza from ReorderBufferCleanupSeralizedTXNs()
completely? Then perhaps its ReadDirExtended() shoud be using ERROR
instead of INFO, so that missing/non-dir/b0rked directories raise an
error.
My guess is that this was done because ReorderBufferCleanupSerializedTXNs()
is also called from ReorderBufferAllocate() and ReorderBufferFree().
However, it is odd that we just silently return if the slot path isn't a
directory in those cases. I think we could use get_dirent_type() in
StartupReorderBuffer() as you suggested, and then we could let ReadDir()
ERROR for non-directories for the other callers of
ReorderBufferCleanupSerializedTXNs(). WDYT?
I don't understand why it's reporting readdir() errors at INFO
but unlink() errors at ERROR, and as far as I can see the other paths
that reach this code shouldn't be sending in paths to non-directories
here unless something is seriously busted and that's ERROR-worthy.
I agree. I'll switch it to ReadDir() in the next revision so that we ERROR
instead of INFO.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 29, 2022 at 03:48:32PM -0700, Nathan Bossart wrote:
On Thu, Mar 24, 2022 at 01:17:01PM +1300, Thomas Munro wrote:
/* we're only handling directories here, skip if it's not ours */ - if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode)) + if (lstat(path, &statbuf) != 0) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not stat file \"%s\": %m", path))); + else if (!S_ISDIR(statbuf.st_mode)) return;Why is this a good place to silently ignore non-directories?
StartupReorderBuffer() is already in charge of skipping random
detritus found in the directory, so would it be better to do "if
(get_dirent_type(...) != PGFILETYPE_DIR) continue" there, and then
drop the lstat() stanza from ReorderBufferCleanupSeralizedTXNs()
completely? Then perhaps its ReadDirExtended() shoud be using ERROR
instead of INFO, so that missing/non-dir/b0rked directories raise an
error.My guess is that this was done because ReorderBufferCleanupSerializedTXNs()
is also called from ReorderBufferAllocate() and ReorderBufferFree().
However, it is odd that we just silently return if the slot path isn't a
directory in those cases. I think we could use get_dirent_type() in
StartupReorderBuffer() as you suggested, and then we could let ReadDir()
ERROR for non-directories for the other callers of
ReorderBufferCleanupSerializedTXNs(). WDYT?I don't understand why it's reporting readdir() errors at INFO
but unlink() errors at ERROR, and as far as I can see the other paths
that reach this code shouldn't be sending in paths to non-directories
here unless something is seriously busted and that's ERROR-worthy.I agree. I'll switch it to ReadDir() in the next revision so that we ERROR
instead of INFO.
Here is an updated patch set.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v11-0001-make-more-use-of-get_dirent_type.patchtext/x-diff; charset=us-asciiDownload
From 784419919fca8f157a1c7304b79567b3ba29f3bf Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 15 Feb 2022 09:40:53 -0800
Subject: [PATCH v11 1/2] make more use of get_dirent_type()
---
src/backend/access/heap/rewriteheap.c | 4 +-
.../replication/logical/reorderbuffer.c | 12 +++---
src/backend/replication/logical/snapbuild.c | 5 +--
src/backend/replication/slot.c | 4 +-
src/backend/storage/file/copydir.c | 21 +++-------
src/backend/storage/file/fd.c | 20 +++++----
src/backend/utils/misc/guc-file.l | 42 +++++++------------
src/timezone/pgtz.c | 8 +---
8 files changed, 48 insertions(+), 68 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826736..d64d7aae2e 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -113,6 +113,7 @@
#include "access/xact.h"
#include "access/xloginsert.h"
#include "catalog/catalog.h"
+#include "common/file_utils.h"
#include "lib/ilist.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1213,7 +1214,6 @@ CheckPointLogicalRewriteHeap(void)
mappings_dir = AllocateDir("pg_logical/mappings");
while ((mapping_de = ReadDir(mappings_dir, "pg_logical/mappings")) != NULL)
{
- struct stat statbuf;
Oid dboid;
Oid relid;
XLogRecPtr lsn;
@@ -1227,7 +1227,7 @@ CheckPointLogicalRewriteHeap(void)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, mapping_de, false, ERROR) != PGFILETYPE_REG)
continue;
/* Skip over files that cannot be ours. */
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index c2d9be81fa..489a8300fa 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -126,6 +126,7 @@
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
#include "commands/sequence.h"
+#include "common/file_utils.h"
#include "lib/binaryheap.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -4806,15 +4807,10 @@ ReorderBufferCleanupSerializedTXNs(const char *slotname)
{
DIR *spill_dir;
struct dirent *spill_de;
- struct stat statbuf;
char path[MAXPGPATH * 2 + 12];
sprintf(path, "pg_replslot/%s", slotname);
- /* we're only handling directories here, skip if it's not ours */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
- return;
-
spill_dir = AllocateDir(path);
while ((spill_de = ReadDirExtended(spill_dir, path, INFO)) != NULL)
{
@@ -4862,6 +4858,7 @@ StartupReorderBuffer(void)
{
DIR *logical_dir;
struct dirent *logical_de;
+ char path[MAXPGPATH * 2 + 12];
logical_dir = AllocateDir("pg_replslot");
while ((logical_de = ReadDir(logical_dir, "pg_replslot")) != NULL)
@@ -4874,6 +4871,11 @@ StartupReorderBuffer(void)
if (!ReplicationSlotValidateName(logical_de->d_name, DEBUG2))
continue;
+ /* we're only handling directories here, skip if it's not ours */
+ sprintf(path, "pg_replslot/%s", logical_de->d_name);
+ if (get_dirent_type(path, logical_de, false, ERROR) != PGFILETYPE_DIR)
+ continue;
+
/*
* ok, has to be a surviving logical slot, iterate and delete
* everything starting with xid-*
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 83fca8a77d..df50abfd98 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -123,6 +123,7 @@
#include "access/heapam_xlog.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "replication/logical.h"
@@ -1947,15 +1948,13 @@ CheckPointSnapBuild(void)
uint32 hi;
uint32 lo;
XLogRecPtr lsn;
- struct stat statbuf;
if (strcmp(snap_de->d_name, ".") == 0 ||
strcmp(snap_de->d_name, "..") == 0)
continue;
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
-
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, snap_de, false, ERROR) != PGFILETYPE_REG)
{
elog(DEBUG1, "only regular files expected: %s", path);
continue;
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index ed4c8b3ad5..3748a5f314 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -41,6 +41,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
+#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1471,7 +1472,6 @@ StartupReplicationSlots(void)
replication_dir = AllocateDir("pg_replslot");
while ((replication_de = ReadDir(replication_dir, "pg_replslot")) != NULL)
{
- struct stat statbuf;
char path[MAXPGPATH + 12];
if (strcmp(replication_de->d_name, ".") == 0 ||
@@ -1481,7 +1481,7 @@ StartupReplicationSlots(void)
snprintf(path, sizeof(path), "pg_replslot/%s", replication_de->d_name);
/* we're only creating directories here, skip if it's not our's */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(path, replication_de, false, ERROR) != PGFILETYPE_DIR)
continue;
/* we crashed while a slot was being setup or deleted, clean up */
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 658fd95ba9..8a866191e1 100644
--- a/src/backend/storage/file/copydir.c
+++ b/src/backend/storage/file/copydir.c
@@ -22,6 +22,7 @@
#include <unistd.h>
#include <sys/stat.h>
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "storage/copydir.h"
@@ -50,7 +51,7 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, fromdir)) != NULL)
{
- struct stat fst;
+ PGFileType xlde_type;
/* If we got a cancel signal during the copy of the directory, quit */
CHECK_FOR_INTERRUPTS();
@@ -62,18 +63,15 @@ copydir(char *fromdir, char *todir, bool recurse)
snprintf(fromfile, sizeof(fromfile), "%s/%s", fromdir, xlde->d_name);
snprintf(tofile, sizeof(tofile), "%s/%s", todir, xlde->d_name);
- if (lstat(fromfile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", fromfile)));
+ xlde_type = get_dirent_type(fromfile, xlde, false, ERROR);
- if (S_ISDIR(fst.st_mode))
+ if (xlde_type == PGFILETYPE_DIR)
{
/* recurse to handle subdirectories */
if (recurse)
copydir(fromfile, tofile, true);
}
- else if (S_ISREG(fst.st_mode))
+ else if (xlde_type == PGFILETYPE_REG)
copy_file(fromfile, tofile);
}
FreeDir(xldir);
@@ -89,8 +87,6 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, todir)) != NULL)
{
- struct stat fst;
-
if (strcmp(xlde->d_name, ".") == 0 ||
strcmp(xlde->d_name, "..") == 0)
continue;
@@ -101,12 +97,7 @@ copydir(char *fromdir, char *todir, bool recurse)
* We don't need to sync subdirectories here since the recursive
* copydir will do it before it returns
*/
- if (lstat(tofile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", tofile)));
-
- if (S_ISREG(fst.st_mode))
+ if (get_dirent_type(tofile, xlde, false, ERROR) == PGFILETYPE_REG)
fsync_fname(tofile, false);
}
FreeDir(xldir);
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 14b77f2861..4a4a79b40a 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -2783,6 +2783,10 @@ TryAgain:
*
* The pathname passed to AllocateDir must be passed to this routine too,
* but it is only used for error reporting.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDir(DIR *dir, const char *dirname)
@@ -2798,6 +2802,10 @@ ReadDir(DIR *dir, const char *dirname)
* If elevel < ERROR, returns NULL after any error. With the normal coding
* pattern, this will result in falling out of the loop immediately as
* though the directory contained no (more) entries.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDirExtended(DIR *dir, const char *dirname, int elevel)
@@ -3234,17 +3242,11 @@ RemovePgTempFilesInDir(const char *tmpdirname, bool missing_ok, bool unlink_all)
PG_TEMP_FILE_PREFIX,
strlen(PG_TEMP_FILE_PREFIX)) == 0)
{
- struct stat statbuf;
+ PGFileType type = get_dirent_type(rm_path, temp_de, false, LOG);
- if (lstat(rm_path, &statbuf) < 0)
- {
- ereport(LOG,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", rm_path)));
+ if (type == PGFILETYPE_ERROR)
continue;
- }
-
- if (S_ISDIR(statbuf.st_mode))
+ else if (type == PGFILETYPE_DIR)
{
/* recursively remove contents, then directory itself */
RemovePgTempFilesInDir(rm_path, false, true);
diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l
index c70543fa74..8f10252fa4 100644
--- a/src/backend/utils/misc/guc-file.l
+++ b/src/backend/utils/misc/guc-file.l
@@ -14,6 +14,7 @@
#include <ctype.h>
#include <unistd.h>
+#include "common/file_utils.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "storage/fd.h"
@@ -1018,7 +1019,7 @@ ParseConfigDirectory(const char *includedir,
while ((de = ReadDir(d, directory)) != NULL)
{
- struct stat st;
+ PGFileType de_type;
char filename[MAXPGPATH];
/*
@@ -1035,32 +1036,9 @@ ParseConfigDirectory(const char *includedir,
join_path_components(filename, directory, de->d_name);
canonicalize_path(filename);
- if (stat(filename, &st) == 0)
+ de_type = get_dirent_type(filename, de, true, elevel);
+ if (de_type == PGFILETYPE_ERROR)
{
- if (!S_ISDIR(st.st_mode))
- {
- /* Add file to array, increasing its size in blocks of 32 */
- if (num_filenames >= size_filenames)
- {
- size_filenames += 32;
- filenames = (char **) repalloc(filenames,
- size_filenames * sizeof(char *));
- }
- filenames[num_filenames] = pstrdup(filename);
- num_filenames++;
- }
- }
- else
- {
- /*
- * stat does not care about permissions, so the most likely reason
- * a file can't be accessed now is if it was removed between the
- * directory listing and now.
- */
- ereport(elevel,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m",
- filename)));
record_config_file_error(psprintf("could not stat file \"%s\"",
filename),
calling_file, calling_lineno,
@@ -1068,6 +1046,18 @@ ParseConfigDirectory(const char *includedir,
status = false;
goto cleanup;
}
+ else if (de_type != PGFILETYPE_DIR)
+ {
+ /* Add file to array, increasing its size in blocks of 32 */
+ if (num_filenames >= size_filenames)
+ {
+ size_filenames += 32;
+ filenames = (char **) repalloc(filenames,
+ size_filenames * sizeof(char *));
+ }
+ filenames[num_filenames] = pstrdup(filename);
+ num_filenames++;
+ }
}
if (num_filenames > 0)
diff --git a/src/timezone/pgtz.c b/src/timezone/pgtz.c
index 3c278dd0f3..72f9e3d5e6 100644
--- a/src/timezone/pgtz.c
+++ b/src/timezone/pgtz.c
@@ -17,6 +17,7 @@
#include <sys/stat.h>
#include <time.h>
+#include "common/file_utils.h"
#include "datatype/timestamp.h"
#include "miscadmin.h"
#include "pgtz.h"
@@ -429,7 +430,6 @@ pg_tzenumerate_next(pg_tzenum *dir)
{
struct dirent *direntry;
char fullname[MAXPGPATH * 2];
- struct stat statbuf;
direntry = ReadDir(dir->dirdesc[dir->depth], dir->dirname[dir->depth]);
@@ -447,12 +447,8 @@ pg_tzenumerate_next(pg_tzenum *dir)
snprintf(fullname, sizeof(fullname), "%s/%s",
dir->dirname[dir->depth], direntry->d_name);
- if (stat(fullname, &statbuf) != 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat \"%s\": %m", fullname)));
- if (S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(fullname, direntry, true, ERROR) == PGFILETYPE_DIR)
{
/* Step into the subdirectory */
if (dir->depth >= MAX_TZDIR_DEPTH - 1)
--
2.25.1
v11-0002-minor-improvements-to-replication-code.patchtext/x-diff; charset=us-asciiDownload
From 5f8a5562664a56fdd16e2983ba4f86692baa5217 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 15 Feb 2022 09:53:58 -0800
Subject: [PATCH v11 2/2] minor improvements to replication code
---
.../replication/logical/reorderbuffer.c | 2 +-
src/backend/replication/logical/snapbuild.c | 18 ++----------------
2 files changed, 3 insertions(+), 17 deletions(-)
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 489a8300fa..87c4ed3560 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4812,7 +4812,7 @@ ReorderBufferCleanupSerializedTXNs(const char *slotname)
sprintf(path, "pg_replslot/%s", slotname);
spill_dir = AllocateDir(path);
- while ((spill_de = ReadDirExtended(spill_dir, path, INFO)) != NULL)
+ while ((spill_de = ReadDir(spill_dir, path)) != NULL)
{
/* only look at names that can be ours */
if (strncmp(spill_de->d_name, "xid", 3) == 0)
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index df50abfd98..6a60e942b7 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1965,16 +1965,10 @@ CheckPointSnapBuild(void)
* everything but are postfixed by .$pid.tmp. We can just remove them
* the same as other files because there can be none that are
* currently being written that are older than cutoff.
- *
- * We just log a message if a file doesn't fit the pattern, it's
- * probably some editors lock/state file or similar...
*/
if (sscanf(snap_de->d_name, "%X-%X.snap", &hi, &lo) != 2)
- {
- ereport(LOG,
+ ereport(ERROR,
(errmsg("could not parse file name \"%s\"", path)));
- continue;
- }
lsn = ((uint64) hi) << 32 | lo;
@@ -1983,19 +1977,11 @@ CheckPointSnapBuild(void)
{
elog(DEBUG1, "removing snapbuild snapshot %s", path);
- /*
- * It's not particularly harmful, though strange, if we can't
- * remove the file here. Don't prevent the checkpoint from
- * completing, that'd be a cure worse than the disease.
- */
if (unlink(path) < 0)
- {
- ereport(LOG,
+ ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not remove file \"%s\": %m",
path)));
- continue;
- }
}
}
FreeDir(snap_dir);
--
2.25.1
On Wed, Mar 30, 2022 at 09:21:30AM -0700, Nathan Bossart wrote:
Here is an updated patch set.
rebased
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v12-0001-make-more-use-of-get_dirent_type.patchtext/x-diff; charset=us-asciiDownload
From a92ccf47c9c334eea5b8d07b8fcab7031181c37e Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 15 Feb 2022 09:40:53 -0800
Subject: [PATCH v12 1/2] make more use of get_dirent_type()
---
src/backend/access/heap/rewriteheap.c | 4 +-
.../replication/logical/reorderbuffer.c | 12 +++---
src/backend/replication/logical/snapbuild.c | 5 +--
src/backend/replication/slot.c | 4 +-
src/backend/storage/file/copydir.c | 21 +++-------
src/backend/storage/file/fd.c | 20 +++++----
src/backend/utils/misc/guc-file.l | 42 +++++++------------
src/timezone/pgtz.c | 8 +---
8 files changed, 48 insertions(+), 68 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 2a53826736..d64d7aae2e 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -113,6 +113,7 @@
#include "access/xact.h"
#include "access/xloginsert.h"
#include "catalog/catalog.h"
+#include "common/file_utils.h"
#include "lib/ilist.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1213,7 +1214,6 @@ CheckPointLogicalRewriteHeap(void)
mappings_dir = AllocateDir("pg_logical/mappings");
while ((mapping_de = ReadDir(mappings_dir, "pg_logical/mappings")) != NULL)
{
- struct stat statbuf;
Oid dboid;
Oid relid;
XLogRecPtr lsn;
@@ -1227,7 +1227,7 @@ CheckPointLogicalRewriteHeap(void)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, mapping_de, false, ERROR) != PGFILETYPE_REG)
continue;
/* Skip over files that cannot be ours. */
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 5adc016d44..63ef55f3f7 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -91,6 +91,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "common/file_utils.h"
#include "lib/binaryheap.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -4408,15 +4409,10 @@ ReorderBufferCleanupSerializedTXNs(const char *slotname)
{
DIR *spill_dir;
struct dirent *spill_de;
- struct stat statbuf;
char path[MAXPGPATH * 2 + 12];
sprintf(path, "pg_replslot/%s", slotname);
- /* we're only handling directories here, skip if it's not ours */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
- return;
-
spill_dir = AllocateDir(path);
while ((spill_de = ReadDirExtended(spill_dir, path, INFO)) != NULL)
{
@@ -4464,6 +4460,7 @@ StartupReorderBuffer(void)
{
DIR *logical_dir;
struct dirent *logical_de;
+ char path[MAXPGPATH * 2 + 12];
logical_dir = AllocateDir("pg_replslot");
while ((logical_de = ReadDir(logical_dir, "pg_replslot")) != NULL)
@@ -4476,6 +4473,11 @@ StartupReorderBuffer(void)
if (!ReplicationSlotValidateName(logical_de->d_name, DEBUG2))
continue;
+ /* we're only handling directories here, skip if it's not ours */
+ sprintf(path, "pg_replslot/%s", logical_de->d_name);
+ if (get_dirent_type(path, logical_de, false, ERROR) != PGFILETYPE_DIR)
+ continue;
+
/*
* ok, has to be a surviving logical slot, iterate and delete
* everything starting with xid-*
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 83fca8a77d..df50abfd98 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -123,6 +123,7 @@
#include "access/heapam_xlog.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "replication/logical.h"
@@ -1947,15 +1948,13 @@ CheckPointSnapBuild(void)
uint32 hi;
uint32 lo;
XLogRecPtr lsn;
- struct stat statbuf;
if (strcmp(snap_de->d_name, ".") == 0 ||
strcmp(snap_de->d_name, "..") == 0)
continue;
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
-
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, snap_de, false, ERROR) != PGFILETYPE_REG)
{
elog(DEBUG1, "only regular files expected: %s", path);
continue;
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index c35ea7c35b..5ee68e71b8 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -41,6 +41,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
+#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1485,7 +1486,6 @@ StartupReplicationSlots(void)
replication_dir = AllocateDir("pg_replslot");
while ((replication_de = ReadDir(replication_dir, "pg_replslot")) != NULL)
{
- struct stat statbuf;
char path[MAXPGPATH + 12];
if (strcmp(replication_de->d_name, ".") == 0 ||
@@ -1495,7 +1495,7 @@ StartupReplicationSlots(void)
snprintf(path, sizeof(path), "pg_replslot/%s", replication_de->d_name);
/* we're only creating directories here, skip if it's not our's */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(path, replication_de, false, ERROR) != PGFILETYPE_DIR)
continue;
/* we crashed while a slot was being setup or deleted, clean up */
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 658fd95ba9..8a866191e1 100644
--- a/src/backend/storage/file/copydir.c
+++ b/src/backend/storage/file/copydir.c
@@ -22,6 +22,7 @@
#include <unistd.h>
#include <sys/stat.h>
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "storage/copydir.h"
@@ -50,7 +51,7 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, fromdir)) != NULL)
{
- struct stat fst;
+ PGFileType xlde_type;
/* If we got a cancel signal during the copy of the directory, quit */
CHECK_FOR_INTERRUPTS();
@@ -62,18 +63,15 @@ copydir(char *fromdir, char *todir, bool recurse)
snprintf(fromfile, sizeof(fromfile), "%s/%s", fromdir, xlde->d_name);
snprintf(tofile, sizeof(tofile), "%s/%s", todir, xlde->d_name);
- if (lstat(fromfile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", fromfile)));
+ xlde_type = get_dirent_type(fromfile, xlde, false, ERROR);
- if (S_ISDIR(fst.st_mode))
+ if (xlde_type == PGFILETYPE_DIR)
{
/* recurse to handle subdirectories */
if (recurse)
copydir(fromfile, tofile, true);
}
- else if (S_ISREG(fst.st_mode))
+ else if (xlde_type == PGFILETYPE_REG)
copy_file(fromfile, tofile);
}
FreeDir(xldir);
@@ -89,8 +87,6 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, todir)) != NULL)
{
- struct stat fst;
-
if (strcmp(xlde->d_name, ".") == 0 ||
strcmp(xlde->d_name, "..") == 0)
continue;
@@ -101,12 +97,7 @@ copydir(char *fromdir, char *todir, bool recurse)
* We don't need to sync subdirectories here since the recursive
* copydir will do it before it returns
*/
- if (lstat(tofile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", tofile)));
-
- if (S_ISREG(fst.st_mode))
+ if (get_dirent_type(tofile, xlde, false, ERROR) == PGFILETYPE_REG)
fsync_fname(tofile, false);
}
FreeDir(xldir);
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 14b77f2861..4a4a79b40a 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -2783,6 +2783,10 @@ TryAgain:
*
* The pathname passed to AllocateDir must be passed to this routine too,
* but it is only used for error reporting.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDir(DIR *dir, const char *dirname)
@@ -2798,6 +2802,10 @@ ReadDir(DIR *dir, const char *dirname)
* If elevel < ERROR, returns NULL after any error. With the normal coding
* pattern, this will result in falling out of the loop immediately as
* though the directory contained no (more) entries.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDirExtended(DIR *dir, const char *dirname, int elevel)
@@ -3234,17 +3242,11 @@ RemovePgTempFilesInDir(const char *tmpdirname, bool missing_ok, bool unlink_all)
PG_TEMP_FILE_PREFIX,
strlen(PG_TEMP_FILE_PREFIX)) == 0)
{
- struct stat statbuf;
+ PGFileType type = get_dirent_type(rm_path, temp_de, false, LOG);
- if (lstat(rm_path, &statbuf) < 0)
- {
- ereport(LOG,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", rm_path)));
+ if (type == PGFILETYPE_ERROR)
continue;
- }
-
- if (S_ISDIR(statbuf.st_mode))
+ else if (type == PGFILETYPE_DIR)
{
/* recursively remove contents, then directory itself */
RemovePgTempFilesInDir(rm_path, false, true);
diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l
index c70543fa74..8f10252fa4 100644
--- a/src/backend/utils/misc/guc-file.l
+++ b/src/backend/utils/misc/guc-file.l
@@ -14,6 +14,7 @@
#include <ctype.h>
#include <unistd.h>
+#include "common/file_utils.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "storage/fd.h"
@@ -1018,7 +1019,7 @@ ParseConfigDirectory(const char *includedir,
while ((de = ReadDir(d, directory)) != NULL)
{
- struct stat st;
+ PGFileType de_type;
char filename[MAXPGPATH];
/*
@@ -1035,32 +1036,9 @@ ParseConfigDirectory(const char *includedir,
join_path_components(filename, directory, de->d_name);
canonicalize_path(filename);
- if (stat(filename, &st) == 0)
+ de_type = get_dirent_type(filename, de, true, elevel);
+ if (de_type == PGFILETYPE_ERROR)
{
- if (!S_ISDIR(st.st_mode))
- {
- /* Add file to array, increasing its size in blocks of 32 */
- if (num_filenames >= size_filenames)
- {
- size_filenames += 32;
- filenames = (char **) repalloc(filenames,
- size_filenames * sizeof(char *));
- }
- filenames[num_filenames] = pstrdup(filename);
- num_filenames++;
- }
- }
- else
- {
- /*
- * stat does not care about permissions, so the most likely reason
- * a file can't be accessed now is if it was removed between the
- * directory listing and now.
- */
- ereport(elevel,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m",
- filename)));
record_config_file_error(psprintf("could not stat file \"%s\"",
filename),
calling_file, calling_lineno,
@@ -1068,6 +1046,18 @@ ParseConfigDirectory(const char *includedir,
status = false;
goto cleanup;
}
+ else if (de_type != PGFILETYPE_DIR)
+ {
+ /* Add file to array, increasing its size in blocks of 32 */
+ if (num_filenames >= size_filenames)
+ {
+ size_filenames += 32;
+ filenames = (char **) repalloc(filenames,
+ size_filenames * sizeof(char *));
+ }
+ filenames[num_filenames] = pstrdup(filename);
+ num_filenames++;
+ }
}
if (num_filenames > 0)
diff --git a/src/timezone/pgtz.c b/src/timezone/pgtz.c
index 3c278dd0f3..72f9e3d5e6 100644
--- a/src/timezone/pgtz.c
+++ b/src/timezone/pgtz.c
@@ -17,6 +17,7 @@
#include <sys/stat.h>
#include <time.h>
+#include "common/file_utils.h"
#include "datatype/timestamp.h"
#include "miscadmin.h"
#include "pgtz.h"
@@ -429,7 +430,6 @@ pg_tzenumerate_next(pg_tzenum *dir)
{
struct dirent *direntry;
char fullname[MAXPGPATH * 2];
- struct stat statbuf;
direntry = ReadDir(dir->dirdesc[dir->depth], dir->dirname[dir->depth]);
@@ -447,12 +447,8 @@ pg_tzenumerate_next(pg_tzenum *dir)
snprintf(fullname, sizeof(fullname), "%s/%s",
dir->dirname[dir->depth], direntry->d_name);
- if (stat(fullname, &statbuf) != 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat \"%s\": %m", fullname)));
- if (S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(fullname, direntry, true, ERROR) == PGFILETYPE_DIR)
{
/* Step into the subdirectory */
if (dir->depth >= MAX_TZDIR_DEPTH - 1)
--
2.25.1
v12-0002-minor-improvements-to-replication-code.patchtext/x-diff; charset=us-asciiDownload
From 96b3c44bf9850b106751facd1e7d2d77543fe9d4 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 15 Feb 2022 09:53:58 -0800
Subject: [PATCH v12 2/2] minor improvements to replication code
---
.../replication/logical/reorderbuffer.c | 2 +-
src/backend/replication/logical/snapbuild.c | 18 ++----------------
2 files changed, 3 insertions(+), 17 deletions(-)
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 63ef55f3f7..532c3e4630 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4414,7 +4414,7 @@ ReorderBufferCleanupSerializedTXNs(const char *slotname)
sprintf(path, "pg_replslot/%s", slotname);
spill_dir = AllocateDir(path);
- while ((spill_de = ReadDirExtended(spill_dir, path, INFO)) != NULL)
+ while ((spill_de = ReadDir(spill_dir, path)) != NULL)
{
/* only look at names that can be ours */
if (strncmp(spill_de->d_name, "xid", 3) == 0)
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index df50abfd98..6a60e942b7 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1965,16 +1965,10 @@ CheckPointSnapBuild(void)
* everything but are postfixed by .$pid.tmp. We can just remove them
* the same as other files because there can be none that are
* currently being written that are older than cutoff.
- *
- * We just log a message if a file doesn't fit the pattern, it's
- * probably some editors lock/state file or similar...
*/
if (sscanf(snap_de->d_name, "%X-%X.snap", &hi, &lo) != 2)
- {
- ereport(LOG,
+ ereport(ERROR,
(errmsg("could not parse file name \"%s\"", path)));
- continue;
- }
lsn = ((uint64) hi) << 32 | lo;
@@ -1983,19 +1977,11 @@ CheckPointSnapBuild(void)
{
elog(DEBUG1, "removing snapbuild snapshot %s", path);
- /*
- * It's not particularly harmful, though strange, if we can't
- * remove the file here. Don't prevent the checkpoint from
- * completing, that'd be a cure worse than the disease.
- */
if (unlink(path) < 0)
- {
- ereport(LOG,
+ ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not remove file \"%s\": %m",
path)));
- continue;
- }
}
}
FreeDir(snap_dir);
--
2.25.1
On Sat, Apr 9, 2022 at 1:49 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Wed, Mar 30, 2022 at 09:21:30AM -0700, Nathan Bossart wrote:
Here is an updated patch set.
rebased
Thanks.
0001 - there are many places where lstat/stat is being used - don't we
need to replace all or most of them with get_dirent_type?
0002 - I'm not quite happy with this patch, with the change,
checkpoint errors out, if it can't remove just a file - the comments
there says it all. Is there any strong reason for this change?
- /*
- * It's not particularly harmful, though strange, if we can't
- * remove the file here. Don't prevent the checkpoint from
- * completing, that'd be a cure worse than the disease.
- */
if (unlink(path) < 0)
- {
- ereport(LOG,
+ ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not remove file \"%s\": %m",
path)));
- continue;
- }
Regards,
Bharath Rupireddy.
On Fri, Jul 08, 2022 at 09:39:10PM +0530, Bharath Rupireddy wrote:
0001 - there are many places where lstat/stat is being used - don't we
need to replace all or most of them with get_dirent_type?
It's been a while since I wrote this one, but I believe my intent was to
replace as many [l]stat() calls in ReadDir()-style loops as possible with
get_dirent_type(). Are there any that I've missed?
0002 - I'm not quite happy with this patch, with the change,
checkpoint errors out, if it can't remove just a file - the comments
there says it all. Is there any strong reason for this change?
Andres noted several concerns upthread. In short, ignoring unexpected
errors makes them harder to debug and likely masks bugs.
FWIW I agree that it is unfortunate that a relatively non-critical error
here leads to checkpoint failures, which can cause much worse problems down
the road. I think this is one of the reasons for moving tasks like this
out of the checkpointer, as I'm trying to do with the proposed custodian
process [0]https://commitfest.postgresql.org/38/3448/.
[0]: https://commitfest.postgresql.org/38/3448/
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Hi,
On 2022-04-08 13:18:57 -0700, Nathan Bossart wrote:
@@ -1035,32 +1036,9 @@ ParseConfigDirectory(const char *includedir,
join_path_components(filename, directory, de->d_name); canonicalize_path(filename); - if (stat(filename, &st) == 0) + de_type = get_dirent_type(filename, de, true, elevel); + if (de_type == PGFILETYPE_ERROR) { - if (!S_ISDIR(st.st_mode)) - { - /* Add file to array, increasing its size in blocks of 32 */ - if (num_filenames >= size_filenames) - { - size_filenames += 32; - filenames = (char **) repalloc(filenames, - size_filenames * sizeof(char *)); - } - filenames[num_filenames] = pstrdup(filename); - num_filenames++; - } - } - else - { - /* - * stat does not care about permissions, so the most likely reason - * a file can't be accessed now is if it was removed between the - * directory listing and now. - */ - ereport(elevel, - (errcode_for_file_access(), - errmsg("could not stat file \"%s\": %m", - filename))); record_config_file_error(psprintf("could not stat file \"%s\"", filename), calling_file, calling_lineno, @@ -1068,6 +1046,18 @@ ParseConfigDirectory(const char *includedir, status = false; goto cleanup; } + else if (de_type != PGFILETYPE_DIR) + { + /* Add file to array, increasing its size in blocks of 32 */ + if (num_filenames >= size_filenames) + { + size_filenames += 32; + filenames = (char **) repalloc(filenames, + size_filenames * sizeof(char *)); + } + filenames[num_filenames] = pstrdup(filename); + num_filenames++; + } }if (num_filenames > 0)
Seems like the diff would be easier to read if it didn't move code around as
much?
Looks pretty reasonable, I'd be happy to commit it, I think.
Greetings,
Andres Freund
On Mon, Jul 11, 2022 at 01:25:33PM -0700, Andres Freund wrote:
On 2022-04-08 13:18:57 -0700, Nathan Bossart wrote:
@@ -1035,32 +1036,9 @@ ParseConfigDirectory(const char *includedir,
join_path_components(filename, directory, de->d_name); canonicalize_path(filename); - if (stat(filename, &st) == 0) + de_type = get_dirent_type(filename, de, true, elevel); + if (de_type == PGFILETYPE_ERROR) { - if (!S_ISDIR(st.st_mode)) - { - /* Add file to array, increasing its size in blocks of 32 */ - if (num_filenames >= size_filenames) - { - size_filenames += 32; - filenames = (char **) repalloc(filenames, - size_filenames * sizeof(char *)); - } - filenames[num_filenames] = pstrdup(filename); - num_filenames++; - } - } - else - { - /* - * stat does not care about permissions, so the most likely reason - * a file can't be accessed now is if it was removed between the - * directory listing and now. - */ - ereport(elevel, - (errcode_for_file_access(), - errmsg("could not stat file \"%s\": %m", - filename))); record_config_file_error(psprintf("could not stat file \"%s\"", filename), calling_file, calling_lineno, @@ -1068,6 +1046,18 @@ ParseConfigDirectory(const char *includedir, status = false; goto cleanup; } + else if (de_type != PGFILETYPE_DIR) + { + /* Add file to array, increasing its size in blocks of 32 */ + if (num_filenames >= size_filenames) + { + size_filenames += 32; + filenames = (char **) repalloc(filenames, + size_filenames * sizeof(char *)); + } + filenames[num_filenames] = pstrdup(filename); + num_filenames++; + } }if (num_filenames > 0)
Seems like the diff would be easier to read if it didn't move code around as
much?
I opted to reorganize in order save an indent and simplify the conditions a
bit. The alternative is something like this:
if (de_type != PGFILETYPE_ERROR)
{
if (de_type != PGTFILETYPE_DIR)
{
...
}
}
else
{
...
}
I don't feel strongly one way or another, and I'll change it if you think
it's worth it to simplify the diff.
Looks pretty reasonable, I'd be happy to commit it, I think.
Appreciate it.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Fri, Jul 8, 2022 at 10:44 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
0002 - I'm not quite happy with this patch, with the change,
checkpoint errors out, if it can't remove just a file - the comments
there says it all. Is there any strong reason for this change?Andres noted several concerns upthread. In short, ignoring unexpected
errors makes them harder to debug and likely masks bugs.FWIW I agree that it is unfortunate that a relatively non-critical error
here leads to checkpoint failures, which can cause much worse problems down
the road. I think this is one of the reasons for moving tasks like this
out of the checkpointer, as I'm trying to do with the proposed custodian
process [0].
IMHO, we can keep it as-is and if required can be changed as part of
the patch set [0], as this change without [0] can cause a checkpoint
to fail. Furthermore, I would like it if we convert "could not parse
filename" and "could not remove file" ERRORs of
CheckPointLogicalRewriteHeap to LOGs until [0] gets in - others may
have different opinions though.
Just wondering - do we ever have a problem if we can't remove the
snapshot or mapping file?
May be unrelated, RemoveTempXlogFiles doesn't even bother to check if
the temp wal file is removed.
Regards,
Bharath Rupireddy.
On Mon, Jul 18, 2022 at 04:53:18PM +0530, Bharath Rupireddy wrote:
Just wondering - do we ever have a problem if we can't remove the
snapshot or mapping file?
Besides running out of disk space, there appears to be a transaction ID
wraparound risk with the mappings files.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Working on get_dirent_type() reminded me of this thread. I was going
to commit the first of these patches, but then I noticed Andres said
he was planning to, so I'll wait another day. Here they are, with
commit messages but otherwise unchanged from Nathan's v12 except for a
slight comment tweak:
- /* we're only handling directories here, skip if it's
not ours */
+ /* we're only handling directories here, skip if it's not one */
The only hunk I'm having second thoughts about is the following, which
makes unexpected stray files break checkpoints:
- * We just log a message if a file doesn't fit the pattern, it's
- * probably some editors lock/state file or similar...
*/
if (sscanf(snap_de->d_name, "%X-%X.snap", &hi, &lo) != 2)
- {
- ereport(LOG,
+ ereport(ERROR,
(errmsg("could not parse file
name \"%s\"", path)));
Bharath mentioned other places that loop over stat(), but I think
those are places that want stuff we don't already have, like st_size.
Attachments:
v13-0001-Make-more-use-of-get_dirent_type.patchtext/x-patch; charset=US-ASCII; name=v13-0001-Make-more-use-of-get_dirent_type.patchDownload
From f0be5188830655b9ce9375b6d13b1ca728dbfcdd Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmunro@postgresql.org>
Date: Tue, 9 Aug 2022 13:11:08 +1200
Subject: [PATCH v13 1/2] Make more use of get_dirent_type().
We don't need to call stat() or lstat() in several ReadDir() loops, if
d_type is available, as it usually is on most systems. If not,
get_dirent_type() will do that under the covers.
In a couple of places this will now raise an ERROR if an internal
stat/lstat fails, where previously errors were silently ignored. ENOENT
was not expected in these locations, because the loop in question is the
one responsible for unlinking files during checkpoint, or runs at
startup.
Author: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220215231123.GA2584239%40nathanxps13
---
src/backend/access/heap/rewriteheap.c | 4 +-
.../replication/logical/reorderbuffer.c | 12 +++---
src/backend/replication/logical/snapbuild.c | 5 +--
src/backend/replication/slot.c | 4 +-
src/backend/storage/file/copydir.c | 21 +++-------
src/backend/storage/file/fd.c | 20 +++++----
src/backend/utils/misc/guc-file.l | 42 +++++++------------
src/timezone/pgtz.c | 8 +---
8 files changed, 48 insertions(+), 68 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 9dd885d936..525039d5b2 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -113,6 +113,7 @@
#include "access/xact.h"
#include "access/xloginsert.h"
#include "catalog/catalog.h"
+#include "common/file_utils.h"
#include "lib/ilist.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1213,7 +1214,6 @@ CheckPointLogicalRewriteHeap(void)
mappings_dir = AllocateDir("pg_logical/mappings");
while ((mapping_de = ReadDir(mappings_dir, "pg_logical/mappings")) != NULL)
{
- struct stat statbuf;
Oid dboid;
Oid relid;
XLogRecPtr lsn;
@@ -1227,7 +1227,7 @@ CheckPointLogicalRewriteHeap(void)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, mapping_de, false, ERROR) != PGFILETYPE_REG)
continue;
/* Skip over files that cannot be ours. */
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 88a37fde72..f2b6cdf473 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -91,6 +91,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "common/file_utils.h"
#include "lib/binaryheap.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -4407,15 +4408,10 @@ ReorderBufferCleanupSerializedTXNs(const char *slotname)
{
DIR *spill_dir;
struct dirent *spill_de;
- struct stat statbuf;
char path[MAXPGPATH * 2 + 12];
sprintf(path, "pg_replslot/%s", slotname);
- /* we're only handling directories here, skip if it's not ours */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
- return;
-
spill_dir = AllocateDir(path);
while ((spill_de = ReadDirExtended(spill_dir, path, INFO)) != NULL)
{
@@ -4463,6 +4459,7 @@ StartupReorderBuffer(void)
{
DIR *logical_dir;
struct dirent *logical_de;
+ char path[MAXPGPATH * 2 + 12];
logical_dir = AllocateDir("pg_replslot");
while ((logical_de = ReadDir(logical_dir, "pg_replslot")) != NULL)
@@ -4475,6 +4472,11 @@ StartupReorderBuffer(void)
if (!ReplicationSlotValidateName(logical_de->d_name, DEBUG2))
continue;
+ /* we're only handling directories here, skip if it's not one */
+ sprintf(path, "pg_replslot/%s", logical_de->d_name);
+ if (get_dirent_type(path, logical_de, false, ERROR) != PGFILETYPE_DIR)
+ continue;
+
/*
* ok, has to be a surviving logical slot, iterate and delete
* everything starting with xid-*
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 73c0f15214..8587e2de43 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -123,6 +123,7 @@
#include "access/heapam_xlog.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "replication/logical.h"
@@ -1946,15 +1947,13 @@ CheckPointSnapBuild(void)
uint32 hi;
uint32 lo;
XLogRecPtr lsn;
- struct stat statbuf;
if (strcmp(snap_de->d_name, ".") == 0 ||
strcmp(snap_de->d_name, "..") == 0)
continue;
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
-
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, snap_de, false, ERROR) != PGFILETYPE_REG)
{
elog(DEBUG1, "only regular files expected: %s", path);
continue;
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 850b74936f..bc51e816f9 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -41,6 +41,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
+#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1442,7 +1443,6 @@ StartupReplicationSlots(void)
replication_dir = AllocateDir("pg_replslot");
while ((replication_de = ReadDir(replication_dir, "pg_replslot")) != NULL)
{
- struct stat statbuf;
char path[MAXPGPATH + 12];
if (strcmp(replication_de->d_name, ".") == 0 ||
@@ -1452,7 +1452,7 @@ StartupReplicationSlots(void)
snprintf(path, sizeof(path), "pg_replslot/%s", replication_de->d_name);
/* we're only creating directories here, skip if it's not our's */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(path, replication_de, false, ERROR) != PGFILETYPE_DIR)
continue;
/* we crashed while a slot was being setup or deleted, clean up */
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 658fd95ba9..8a866191e1 100644
--- a/src/backend/storage/file/copydir.c
+++ b/src/backend/storage/file/copydir.c
@@ -22,6 +22,7 @@
#include <unistd.h>
#include <sys/stat.h>
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "storage/copydir.h"
@@ -50,7 +51,7 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, fromdir)) != NULL)
{
- struct stat fst;
+ PGFileType xlde_type;
/* If we got a cancel signal during the copy of the directory, quit */
CHECK_FOR_INTERRUPTS();
@@ -62,18 +63,15 @@ copydir(char *fromdir, char *todir, bool recurse)
snprintf(fromfile, sizeof(fromfile), "%s/%s", fromdir, xlde->d_name);
snprintf(tofile, sizeof(tofile), "%s/%s", todir, xlde->d_name);
- if (lstat(fromfile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", fromfile)));
+ xlde_type = get_dirent_type(fromfile, xlde, false, ERROR);
- if (S_ISDIR(fst.st_mode))
+ if (xlde_type == PGFILETYPE_DIR)
{
/* recurse to handle subdirectories */
if (recurse)
copydir(fromfile, tofile, true);
}
- else if (S_ISREG(fst.st_mode))
+ else if (xlde_type == PGFILETYPE_REG)
copy_file(fromfile, tofile);
}
FreeDir(xldir);
@@ -89,8 +87,6 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, todir)) != NULL)
{
- struct stat fst;
-
if (strcmp(xlde->d_name, ".") == 0 ||
strcmp(xlde->d_name, "..") == 0)
continue;
@@ -101,12 +97,7 @@ copydir(char *fromdir, char *todir, bool recurse)
* We don't need to sync subdirectories here since the recursive
* copydir will do it before it returns
*/
- if (lstat(tofile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", tofile)));
-
- if (S_ISREG(fst.st_mode))
+ if (get_dirent_type(tofile, xlde, false, ERROR) == PGFILETYPE_REG)
fsync_fname(tofile, false);
}
FreeDir(xldir);
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index efb34d4dcb..64764a287d 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -2708,6 +2708,10 @@ TryAgain:
*
* The pathname passed to AllocateDir must be passed to this routine too,
* but it is only used for error reporting.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDir(DIR *dir, const char *dirname)
@@ -2723,6 +2727,10 @@ ReadDir(DIR *dir, const char *dirname)
* If elevel < ERROR, returns NULL after any error. With the normal coding
* pattern, this will result in falling out of the loop immediately as
* though the directory contained no (more) entries.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDirExtended(DIR *dir, const char *dirname, int elevel)
@@ -3159,17 +3167,11 @@ RemovePgTempFilesInDir(const char *tmpdirname, bool missing_ok, bool unlink_all)
PG_TEMP_FILE_PREFIX,
strlen(PG_TEMP_FILE_PREFIX)) == 0)
{
- struct stat statbuf;
+ PGFileType type = get_dirent_type(rm_path, temp_de, false, LOG);
- if (lstat(rm_path, &statbuf) < 0)
- {
- ereport(LOG,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", rm_path)));
+ if (type == PGFILETYPE_ERROR)
continue;
- }
-
- if (S_ISDIR(statbuf.st_mode))
+ else if (type == PGFILETYPE_DIR)
{
/* recursively remove contents, then directory itself */
RemovePgTempFilesInDir(rm_path, false, true);
diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l
index ce5633844c..88460422dd 100644
--- a/src/backend/utils/misc/guc-file.l
+++ b/src/backend/utils/misc/guc-file.l
@@ -14,6 +14,7 @@
#include <ctype.h>
#include <unistd.h>
+#include "common/file_utils.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "storage/fd.h"
@@ -1017,7 +1018,7 @@ ParseConfigDirectory(const char *includedir,
while ((de = ReadDir(d, directory)) != NULL)
{
- struct stat st;
+ PGFileType de_type;
char filename[MAXPGPATH];
/*
@@ -1034,32 +1035,9 @@ ParseConfigDirectory(const char *includedir,
join_path_components(filename, directory, de->d_name);
canonicalize_path(filename);
- if (stat(filename, &st) == 0)
+ de_type = get_dirent_type(filename, de, true, elevel);
+ if (de_type == PGFILETYPE_ERROR)
{
- if (!S_ISDIR(st.st_mode))
- {
- /* Add file to array, increasing its size in blocks of 32 */
- if (num_filenames >= size_filenames)
- {
- size_filenames += 32;
- filenames = (char **) repalloc(filenames,
- size_filenames * sizeof(char *));
- }
- filenames[num_filenames] = pstrdup(filename);
- num_filenames++;
- }
- }
- else
- {
- /*
- * stat does not care about permissions, so the most likely reason
- * a file can't be accessed now is if it was removed between the
- * directory listing and now.
- */
- ereport(elevel,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m",
- filename)));
record_config_file_error(psprintf("could not stat file \"%s\"",
filename),
calling_file, calling_lineno,
@@ -1067,6 +1045,18 @@ ParseConfigDirectory(const char *includedir,
status = false;
goto cleanup;
}
+ else if (de_type != PGFILETYPE_DIR)
+ {
+ /* Add file to array, increasing its size in blocks of 32 */
+ if (num_filenames >= size_filenames)
+ {
+ size_filenames += 32;
+ filenames = (char **) repalloc(filenames,
+ size_filenames * sizeof(char *));
+ }
+ filenames[num_filenames] = pstrdup(filename);
+ num_filenames++;
+ }
}
if (num_filenames > 0)
diff --git a/src/timezone/pgtz.c b/src/timezone/pgtz.c
index 3c278dd0f3..72f9e3d5e6 100644
--- a/src/timezone/pgtz.c
+++ b/src/timezone/pgtz.c
@@ -17,6 +17,7 @@
#include <sys/stat.h>
#include <time.h>
+#include "common/file_utils.h"
#include "datatype/timestamp.h"
#include "miscadmin.h"
#include "pgtz.h"
@@ -429,7 +430,6 @@ pg_tzenumerate_next(pg_tzenum *dir)
{
struct dirent *direntry;
char fullname[MAXPGPATH * 2];
- struct stat statbuf;
direntry = ReadDir(dir->dirdesc[dir->depth], dir->dirname[dir->depth]);
@@ -447,12 +447,8 @@ pg_tzenumerate_next(pg_tzenum *dir)
snprintf(fullname, sizeof(fullname), "%s/%s",
dir->dirname[dir->depth], direntry->d_name);
- if (stat(fullname, &statbuf) != 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat \"%s\": %m", fullname)));
- if (S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(fullname, direntry, true, ERROR) == PGFILETYPE_DIR)
{
/* Step into the subdirectory */
if (dir->depth >= MAX_TZDIR_DEPTH - 1)
--
2.35.1
v13-0002-Harden-errors-on-some-unexpected-filesystem-cond.patchtext/x-patch; charset=US-ASCII; name=v13-0002-Harden-errors-on-some-unexpected-filesystem-cond.patchDownload
From 3e60b8362eaa5fc50093c3fc59bc16773dda550a Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmunro@postgresql.org>
Date: Tue, 9 Aug 2022 14:42:10 +1200
Subject: [PATCH v13 2/2] Harden errors on some unexpected filesystem
conditions.
CheckPointSnapBuild() shouldn't fail to unlink() files. Raise an ERROR,
there's something wrong with the filesystem. There is also no reason to
tolerate random stray files under pg_logical/snapshots.
ReorderBufferCleanupSerializedTXNs() shouldn't be called for a
non-directory, so raise an ERROR if opendir() fails, there's something
wrong with the filesystem.
Author: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220215231123.GA2584239%40nathanxps13
---
.../replication/logical/reorderbuffer.c | 2 +-
src/backend/replication/logical/snapbuild.c | 18 ++----------------
2 files changed, 3 insertions(+), 17 deletions(-)
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index f2b6cdf473..a80cbacd9b 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4413,7 +4413,7 @@ ReorderBufferCleanupSerializedTXNs(const char *slotname)
sprintf(path, "pg_replslot/%s", slotname);
spill_dir = AllocateDir(path);
- while ((spill_de = ReadDirExtended(spill_dir, path, INFO)) != NULL)
+ while ((spill_de = ReadDir(spill_dir, path)) != NULL)
{
/* only look at names that can be ours */
if (strncmp(spill_de->d_name, "xid", 3) == 0)
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 8587e2de43..bb788e1adc 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1964,16 +1964,10 @@ CheckPointSnapBuild(void)
* everything but are postfixed by .$pid.tmp. We can just remove them
* the same as other files because there can be none that are
* currently being written that are older than cutoff.
- *
- * We just log a message if a file doesn't fit the pattern, it's
- * probably some editors lock/state file or similar...
*/
if (sscanf(snap_de->d_name, "%X-%X.snap", &hi, &lo) != 2)
- {
- ereport(LOG,
+ ereport(ERROR,
(errmsg("could not parse file name \"%s\"", path)));
- continue;
- }
lsn = ((uint64) hi) << 32 | lo;
@@ -1982,19 +1976,11 @@ CheckPointSnapBuild(void)
{
elog(DEBUG1, "removing snapbuild snapshot %s", path);
- /*
- * It's not particularly harmful, though strange, if we can't
- * remove the file here. Don't prevent the checkpoint from
- * completing, that'd be a cure worse than the disease.
- */
if (unlink(path) < 0)
- {
- ereport(LOG,
+ ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not remove file \"%s\": %m",
path)));
- continue;
- }
}
}
FreeDir(snap_dir);
--
2.35.1
Thomas Munro <thomas.munro@gmail.com> writes:
The only hunk I'm having second thoughts about is the following, which
makes unexpected stray files break checkpoints:
Sounds like a pretty bad idea. What's the upside?
regards, tom lane
I wrote:
Thomas Munro <thomas.munro@gmail.com> writes:
The only hunk I'm having second thoughts about is the following, which
makes unexpected stray files break checkpoints:
Sounds like a pretty bad idea. What's the upside?
Actually, having now read the patch, I don't think there is any
part of 0002 that is a good idea. It's blithely removing the
comments that explain why the existing coding is the way it is,
and not providing a shred of justification for making checkpoints
more brittle.
I have not tried to analyze the error-handling properties of 0001,
but if it's being equally cavalier then it shouldn't be committed
either. Most of this behavior is the result of decades of hard-won
experience; discarding it because it doesn't fit conveniently
into some refactoring plan isn't smart.
regards, tom lane
On Tue, Aug 9, 2022 at 3:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Actually, having now read the patch, I don't think there is any
part of 0002 that is a good idea. It's blithely removing the
comments that explain why the existing coding is the way it is,
and not providing a shred of justification for making checkpoints
more brittle.
0002 also contradicts the original $SUBJECT and goal of this thread,
which is possibly why it was kept separate. I was only thinking of
committing 0001 myself, which is the one I'd reviewed an earlier
version of.
I have not tried to analyze the error-handling properties of 0001,
but if it's being equally cavalier then it shouldn't be committed
either. Most of this behavior is the result of decades of hard-won
experience; discarding it because it doesn't fit conveniently
into some refactoring plan isn't smart.
0001 does introduce new errors, as mentioned in the commit message, in
the form of elevel ERROR passed into get_dirent_type(), which might be
thrown if your OS has no d_type and lstat() fails (also if you asked
to follow symlinks, but in those cases errors were already thrown).
But in those cases, it seems at least a little fishy that we ignored
errors from the existing lstat(). I wondered if that was because they
expected that any failure meant ENOENT and they wanted to tolerate
that, but that does not seem to be the case, so I considered the error
to be an improvement.
Thomas Munro <thomas.munro@gmail.com> writes:
On Tue, Aug 9, 2022 at 3:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I have not tried to analyze the error-handling properties of 0001,
but if it's being equally cavalier then it shouldn't be committed
either.
0001 does introduce new errors, as mentioned in the commit message, in
the form of elevel ERROR passed into get_dirent_type(), which might be
thrown if your OS has no d_type and lstat() fails (also if you asked
to follow symlinks, but in those cases errors were already thrown).
But in those cases, it seems at least a little fishy that we ignored
errors from the existing lstat().
Hmmm ... I'll grant that ignoring lstat errors altogether isn't great.
But should the replacement behavior be elog-LOG-and-press-on,
or elog-ERROR-and-fail-the-surrounding-operation? I'm not in any
hurry to believe that the latter is more appropriate without some
analysis of what the callers are doing.
The bottom line here is that I'm distrustful of behavioral changes
introduced to simplify refactoring rather than to solve a live
problem.
regards, tom lane
On Tue, Aug 9, 2022 at 9:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Thomas Munro <thomas.munro@gmail.com> writes:
On Tue, Aug 9, 2022 at 3:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I have not tried to analyze the error-handling properties of 0001,
but if it's being equally cavalier then it shouldn't be committed
either.0001 does introduce new errors, as mentioned in the commit message, in
the form of elevel ERROR passed into get_dirent_type(), which might be
thrown if your OS has no d_type and lstat() fails (also if you asked
to follow symlinks, but in those cases errors were already thrown).
But in those cases, it seems at least a little fishy that we ignored
errors from the existing lstat().Hmmm ... I'll grant that ignoring lstat errors altogether isn't great.
But should the replacement behavior be elog-LOG-and-press-on,
or elog-ERROR-and-fail-the-surrounding-operation? I'm not in any
hurry to believe that the latter is more appropriate without some
analysis of what the callers are doing.The bottom line here is that I'm distrustful of behavioral changes
introduced to simplify refactoring rather than to solve a live
problem.
+1. I agree with Tom not to change elog-LOG to elog-ERROR and fail the
checkpoint operation. Because the checkpoint is more important than
why a single snapshot file (out thousands or even million files) isn't
removed at that moment. Also, I originally proposed to change
elog-ERROR to elog-LOG in CheckPointLogicalRewriteHeap for unlink()
failures for the same reason.
--
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
On 2022-08-09 15:10:41 +1200, Thomas Munro wrote:
I was going to commit the first of these patches, but then I noticed Andres
said he was planning to, so I'll wait another day.
I'd be happy for you to take this on...
On Tue, Aug 09, 2022 at 09:29:16AM +0530, Bharath Rupireddy wrote:
On Tue, Aug 9, 2022 at 9:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hmmm ... I'll grant that ignoring lstat errors altogether isn't great.
But should the replacement behavior be elog-LOG-and-press-on,
or elog-ERROR-and-fail-the-surrounding-operation? I'm not in any
hurry to believe that the latter is more appropriate without some
analysis of what the callers are doing.The bottom line here is that I'm distrustful of behavioral changes
introduced to simplify refactoring rather than to solve a live
problem.+1. I agree with Tom not to change elog-LOG to elog-ERROR and fail the
checkpoint operation. Because the checkpoint is more important than
why a single snapshot file (out thousands or even million files) isn't
removed at that moment. Also, I originally proposed to change
elog-ERROR to elog-LOG in CheckPointLogicalRewriteHeap for unlink()
failures for the same reason.
This was my initial instinct as well, but this thread has received
contradictory feedback during the months since.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Tue, Aug 9, 2022 at 4:28 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Tue, Aug 09, 2022 at 09:29:16AM +0530, Bharath Rupireddy wrote:
On Tue, Aug 9, 2022 at 9:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hmmm ... I'll grant that ignoring lstat errors altogether isn't great.
But should the replacement behavior be elog-LOG-and-press-on,
or elog-ERROR-and-fail-the-surrounding-operation? I'm not in any
hurry to believe that the latter is more appropriate without some
analysis of what the callers are doing.The bottom line here is that I'm distrustful of behavioral changes
introduced to simplify refactoring rather than to solve a live
problem.+1. I agree with Tom not to change elog-LOG to elog-ERROR and fail the
checkpoint operation. Because the checkpoint is more important than
The changes were not from elog-LOG to elog-ERROR, they were changes
from silently eating errors, but I take your (plural) general point.
why a single snapshot file (out thousands or even million files) isn't
removed at that moment. Also, I originally proposed to change
elog-ERROR to elog-LOG in CheckPointLogicalRewriteHeap for unlink()
failures for the same reason.This was my initial instinct as well, but this thread has received
contradictory feedback during the months since.
OK, so there aren't many places in 0001 that error out where we
previously did not.
For the one in CheckPointLogicalRewriteHeap(), if it fails, you don't
just want to skip -- it might be in the range that we need to fsync().
So what if we just deleted that get_dirent_type() != PGFILETYPE_REG
check altogether? Depending on the name range check, we'd either
attempt to unlink() and LOG if that fails, or we'd attempt to
open()-or-ERROR and then fsync()-or-PANIC. What functionality have we
lost without the DT_REG check? Well, now if someone ill-advisedly
puts an important character device, socket, symlink (ie any kind of
non-DT_REG) into that directory with a name conforming to the
unlinkable range, we'll try to unlink it and possibly succeed, where
previously we'd have skipped, and if it's in the checkpointable range,
we'd try to fsync() it and likely fail, which is good because this is
a supposed to be a checkpoint.
That's already what happens if lstat() fails in master. We fall back
to treating it like a DT_REG anyway:
if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
continue;
For the one in CheckSnapBuild(), it's similar but unlink only.
For the one in StartupReplicationSlots(), you could possibly take the
same line: if it's not a directory, we'll try an rmdir() it anyway and
then emit a WARNING if that fails. Again, that's exactly what master
would do if that lstat() failed.
In other words, if we're going to LOG and press on, maybe we should
just press on anyway. Would the error message be any less
informative? Would the risk of unlinking a character device that you
accidentally stored in
/clusters/pg16-main/pgdata/pg_logical/mappings/map-1234-5678 be a
problem for anyone?
Separately from that topic, it would be nice to have a comment in each
place where we tolerate unlink() failure to explain why that is
harmless except for wasted disk.
On Tue, Aug 09, 2022 at 05:26:39PM +1200, Thomas Munro wrote:
The changes were not from elog-LOG to elog-ERROR, they were changes
from silently eating errors, but I take your (plural) general point.
I think this was in reference to 0002. My comment was, at least.
OK, so there aren't many places in 0001 that error out where we
previously did not.For the one in CheckPointLogicalRewriteHeap(), if it fails, you don't
just want to skip -- it might be in the range that we need to fsync().
So what if we just deleted that get_dirent_type() != PGFILETYPE_REG
check altogether? Depending on the name range check, we'd either
attempt to unlink() and LOG if that fails, or we'd attempt to
open()-or-ERROR and then fsync()-or-PANIC.
It looks like CheckPointLogicalRewriteHeap() presently ERRORs when unlink()
fails. However, CheckPointSnapBuild() only LOGs when unlink() fails.
Since mappings files contain transaction IDs, persistent unlink() failures
might lead to wraparound, but I don't think failing checkpoints will help
improve matters.
Bharath's original patch changed CheckPointLogicalRewriteHeap() to LOG on
sscanf() and unlink() failures. IIUC things would work the way you suggest
if that was applied.
What functionality have we
lost without the DT_REG check? Well, now if someone ill-advisedly
puts an important character device, socket, symlink (ie any kind of
non-DT_REG) into that directory with a name conforming to the
unlinkable range, we'll try to unlink it and possibly succeed, where
previously we'd have skipped, and if it's in the checkpointable range,
we'd try to fsync() it and likely fail, which is good because this is
a supposed to be a checkpoint.
I don't know that I agree that the last part is good. Presumably we don't
want checkpointing to fail forever because there's a random non-regular
file that can't be fsync'd. But it's not clear why such files would exist
in the first place. Presumably this isn't the sort of thing that Postgres
should be expected to support.
In other words, if we're going to LOG and press on, maybe we should
just press on anyway. Would the error message be any less
informative? Would the risk of unlinking a character device that you
accidentally stored in
/clusters/pg16-main/pgdata/pg_logical/mappings/map-1234-5678 be a
problem for anyone?
Probably not.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Tue, Aug 9, 2022 at 10:57 AM Thomas Munro <thomas.munro@gmail.com> wrote:
OK, so there aren't many places in 0001 that error out where we
previously did not.
Well, that's not true I believe. The 0001 patch introduces
get_dirent_type() with elevel ERROR which means that lstat failures
are now reported as ERRORs with ereport(ERROR, as opposed to
continuing on the HEAD.
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, mapping_de, false, ERROR) != PGFILETYPE_REG)
continue;
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, snap_de, false, ERROR) != PGFILETYPE_REG)
{
so on.
The main idea of using get_dirent_type() instead of lstat or stat is
to benefit from avoiding system calls on platforms where the directory
entry type is stored in dirent structure, but not to cause errors for
lstat or stat system calls failures where we previously didn't. I
think 0001 must be just replacing lstat or stat with get_dirent_type()
with elevel ERROR if lstat or stat failure is treated as ERROR
previously, otherwise with elevel LOG. I modified it as attached. Once
this patch gets committed, then we can continue the discussion as to
whether we make elog-LOG to elog-ERROR or vice-versa.
I'm tempted to use get_dirent_type() in RemoveXlogFile() but that
requires us to pass struct dirent * from RemoveOldXlogFiles() (as
attached 0002 for now), If done, this avoids one lstat() system call
per WAL file. IMO, that's okay to pass an extra function parameter to
RemoveXlogFile() as it is a static function and there can be many
(thousands of) WAL files at times, so saving system calls (lstat() *
number of WAL files) is definitely an advantage.
Thoughts?
--
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
Attachments:
v14-0001-Make-more-use-of-get_dirent_type.patchapplication/octet-stream; name=v14-0001-Make-more-use-of-get_dirent_type.patchDownload
From 4665da8af8af1d0eefeacf3626d3651443323b12 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 10 Aug 2022 07:13:02 +0000
Subject: [PATCH v14] Make more use of get_dirent_type().
We don't need to call stat() or lstat() in several ReadDir() loops, if
d_type is available, as it usually is on most systems. If not,
get_dirent_type() will do that under the covers.
In a couple of places this will now raise an ERROR if an internal
stat/lstat fails, where previously errors were silently ignored. ENOENT
was not expected in these locations, because the loop in question is the
one responsible for unlinking files during checkpoint, or runs at
startup.
Author: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220215231123.GA2584239%40nathanxps13
---
src/backend/access/heap/rewriteheap.c | 4 +-
.../replication/logical/reorderbuffer.c | 12 +++---
src/backend/replication/logical/snapbuild.c | 5 +--
src/backend/replication/slot.c | 4 +-
src/backend/storage/file/copydir.c | 21 +++-------
src/backend/storage/file/fd.c | 20 +++++----
src/backend/utils/misc/guc-file.l | 42 +++++++------------
src/timezone/pgtz.c | 8 +---
8 files changed, 48 insertions(+), 68 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 9dd885d936..a00874f0f7 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -113,6 +113,7 @@
#include "access/xact.h"
#include "access/xloginsert.h"
#include "catalog/catalog.h"
+#include "common/file_utils.h"
#include "lib/ilist.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1213,7 +1214,6 @@ CheckPointLogicalRewriteHeap(void)
mappings_dir = AllocateDir("pg_logical/mappings");
while ((mapping_de = ReadDir(mappings_dir, "pg_logical/mappings")) != NULL)
{
- struct stat statbuf;
Oid dboid;
Oid relid;
XLogRecPtr lsn;
@@ -1227,7 +1227,7 @@ CheckPointLogicalRewriteHeap(void)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, mapping_de, false, LOG) != PGFILETYPE_REG)
continue;
/* Skip over files that cannot be ours. */
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 88a37fde72..c37e970de5 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -91,6 +91,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "common/file_utils.h"
#include "lib/binaryheap.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -4407,15 +4408,10 @@ ReorderBufferCleanupSerializedTXNs(const char *slotname)
{
DIR *spill_dir;
struct dirent *spill_de;
- struct stat statbuf;
char path[MAXPGPATH * 2 + 12];
sprintf(path, "pg_replslot/%s", slotname);
- /* we're only handling directories here, skip if it's not ours */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
- return;
-
spill_dir = AllocateDir(path);
while ((spill_de = ReadDirExtended(spill_dir, path, INFO)) != NULL)
{
@@ -4463,6 +4459,7 @@ StartupReorderBuffer(void)
{
DIR *logical_dir;
struct dirent *logical_de;
+ char path[MAXPGPATH * 2 + 12];
logical_dir = AllocateDir("pg_replslot");
while ((logical_de = ReadDir(logical_dir, "pg_replslot")) != NULL)
@@ -4475,6 +4472,11 @@ StartupReorderBuffer(void)
if (!ReplicationSlotValidateName(logical_de->d_name, DEBUG2))
continue;
+ /* we're only handling directories here, skip if it's not one */
+ sprintf(path, "pg_replslot/%s", logical_de->d_name);
+ if (get_dirent_type(path, logical_de, false, LOG) != PGFILETYPE_DIR)
+ continue;
+
/*
* ok, has to be a surviving logical slot, iterate and delete
* everything starting with xid-*
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 73c0f15214..5ba185fa0d 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -123,6 +123,7 @@
#include "access/heapam_xlog.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "replication/logical.h"
@@ -1946,15 +1947,13 @@ CheckPointSnapBuild(void)
uint32 hi;
uint32 lo;
XLogRecPtr lsn;
- struct stat statbuf;
if (strcmp(snap_de->d_name, ".") == 0 ||
strcmp(snap_de->d_name, "..") == 0)
continue;
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
-
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, snap_de, false, LOG) != PGFILETYPE_REG)
{
elog(DEBUG1, "only regular files expected: %s", path);
continue;
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 850b74936f..dffd87b335 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -41,6 +41,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
+#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1442,7 +1443,6 @@ StartupReplicationSlots(void)
replication_dir = AllocateDir("pg_replslot");
while ((replication_de = ReadDir(replication_dir, "pg_replslot")) != NULL)
{
- struct stat statbuf;
char path[MAXPGPATH + 12];
if (strcmp(replication_de->d_name, ".") == 0 ||
@@ -1452,7 +1452,7 @@ StartupReplicationSlots(void)
snprintf(path, sizeof(path), "pg_replslot/%s", replication_de->d_name);
/* we're only creating directories here, skip if it's not our's */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(path, replication_de, false, LOG) != PGFILETYPE_DIR)
continue;
/* we crashed while a slot was being setup or deleted, clean up */
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 658fd95ba9..8a866191e1 100644
--- a/src/backend/storage/file/copydir.c
+++ b/src/backend/storage/file/copydir.c
@@ -22,6 +22,7 @@
#include <unistd.h>
#include <sys/stat.h>
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "storage/copydir.h"
@@ -50,7 +51,7 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, fromdir)) != NULL)
{
- struct stat fst;
+ PGFileType xlde_type;
/* If we got a cancel signal during the copy of the directory, quit */
CHECK_FOR_INTERRUPTS();
@@ -62,18 +63,15 @@ copydir(char *fromdir, char *todir, bool recurse)
snprintf(fromfile, sizeof(fromfile), "%s/%s", fromdir, xlde->d_name);
snprintf(tofile, sizeof(tofile), "%s/%s", todir, xlde->d_name);
- if (lstat(fromfile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", fromfile)));
+ xlde_type = get_dirent_type(fromfile, xlde, false, ERROR);
- if (S_ISDIR(fst.st_mode))
+ if (xlde_type == PGFILETYPE_DIR)
{
/* recurse to handle subdirectories */
if (recurse)
copydir(fromfile, tofile, true);
}
- else if (S_ISREG(fst.st_mode))
+ else if (xlde_type == PGFILETYPE_REG)
copy_file(fromfile, tofile);
}
FreeDir(xldir);
@@ -89,8 +87,6 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, todir)) != NULL)
{
- struct stat fst;
-
if (strcmp(xlde->d_name, ".") == 0 ||
strcmp(xlde->d_name, "..") == 0)
continue;
@@ -101,12 +97,7 @@ copydir(char *fromdir, char *todir, bool recurse)
* We don't need to sync subdirectories here since the recursive
* copydir will do it before it returns
*/
- if (lstat(tofile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", tofile)));
-
- if (S_ISREG(fst.st_mode))
+ if (get_dirent_type(tofile, xlde, false, ERROR) == PGFILETYPE_REG)
fsync_fname(tofile, false);
}
FreeDir(xldir);
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index efb34d4dcb..64764a287d 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -2708,6 +2708,10 @@ TryAgain:
*
* The pathname passed to AllocateDir must be passed to this routine too,
* but it is only used for error reporting.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDir(DIR *dir, const char *dirname)
@@ -2723,6 +2727,10 @@ ReadDir(DIR *dir, const char *dirname)
* If elevel < ERROR, returns NULL after any error. With the normal coding
* pattern, this will result in falling out of the loop immediately as
* though the directory contained no (more) entries.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDirExtended(DIR *dir, const char *dirname, int elevel)
@@ -3159,17 +3167,11 @@ RemovePgTempFilesInDir(const char *tmpdirname, bool missing_ok, bool unlink_all)
PG_TEMP_FILE_PREFIX,
strlen(PG_TEMP_FILE_PREFIX)) == 0)
{
- struct stat statbuf;
+ PGFileType type = get_dirent_type(rm_path, temp_de, false, LOG);
- if (lstat(rm_path, &statbuf) < 0)
- {
- ereport(LOG,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", rm_path)));
+ if (type == PGFILETYPE_ERROR)
continue;
- }
-
- if (S_ISDIR(statbuf.st_mode))
+ else if (type == PGFILETYPE_DIR)
{
/* recursively remove contents, then directory itself */
RemovePgTempFilesInDir(rm_path, false, true);
diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l
index ce5633844c..88460422dd 100644
--- a/src/backend/utils/misc/guc-file.l
+++ b/src/backend/utils/misc/guc-file.l
@@ -14,6 +14,7 @@
#include <ctype.h>
#include <unistd.h>
+#include "common/file_utils.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "storage/fd.h"
@@ -1017,7 +1018,7 @@ ParseConfigDirectory(const char *includedir,
while ((de = ReadDir(d, directory)) != NULL)
{
- struct stat st;
+ PGFileType de_type;
char filename[MAXPGPATH];
/*
@@ -1034,32 +1035,9 @@ ParseConfigDirectory(const char *includedir,
join_path_components(filename, directory, de->d_name);
canonicalize_path(filename);
- if (stat(filename, &st) == 0)
+ de_type = get_dirent_type(filename, de, true, elevel);
+ if (de_type == PGFILETYPE_ERROR)
{
- if (!S_ISDIR(st.st_mode))
- {
- /* Add file to array, increasing its size in blocks of 32 */
- if (num_filenames >= size_filenames)
- {
- size_filenames += 32;
- filenames = (char **) repalloc(filenames,
- size_filenames * sizeof(char *));
- }
- filenames[num_filenames] = pstrdup(filename);
- num_filenames++;
- }
- }
- else
- {
- /*
- * stat does not care about permissions, so the most likely reason
- * a file can't be accessed now is if it was removed between the
- * directory listing and now.
- */
- ereport(elevel,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m",
- filename)));
record_config_file_error(psprintf("could not stat file \"%s\"",
filename),
calling_file, calling_lineno,
@@ -1067,6 +1045,18 @@ ParseConfigDirectory(const char *includedir,
status = false;
goto cleanup;
}
+ else if (de_type != PGFILETYPE_DIR)
+ {
+ /* Add file to array, increasing its size in blocks of 32 */
+ if (num_filenames >= size_filenames)
+ {
+ size_filenames += 32;
+ filenames = (char **) repalloc(filenames,
+ size_filenames * sizeof(char *));
+ }
+ filenames[num_filenames] = pstrdup(filename);
+ num_filenames++;
+ }
}
if (num_filenames > 0)
diff --git a/src/timezone/pgtz.c b/src/timezone/pgtz.c
index 3c278dd0f3..72f9e3d5e6 100644
--- a/src/timezone/pgtz.c
+++ b/src/timezone/pgtz.c
@@ -17,6 +17,7 @@
#include <sys/stat.h>
#include <time.h>
+#include "common/file_utils.h"
#include "datatype/timestamp.h"
#include "miscadmin.h"
#include "pgtz.h"
@@ -429,7 +430,6 @@ pg_tzenumerate_next(pg_tzenum *dir)
{
struct dirent *direntry;
char fullname[MAXPGPATH * 2];
- struct stat statbuf;
direntry = ReadDir(dir->dirdesc[dir->depth], dir->dirname[dir->depth]);
@@ -447,12 +447,8 @@ pg_tzenumerate_next(pg_tzenum *dir)
snprintf(fullname, sizeof(fullname), "%s/%s",
dir->dirname[dir->depth], direntry->d_name);
- if (stat(fullname, &statbuf) != 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat \"%s\": %m", fullname)));
- if (S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(fullname, direntry, true, ERROR) == PGFILETYPE_DIR)
{
/* Step into the subdirectory */
if (dir->depth >= MAX_TZDIR_DEPTH - 1)
--
2.34.1
v14-0002-Replace-lstat-with-get_dirent_type-in-xlog.c.patchapplication/octet-stream; name=v14-0002-Replace-lstat-with-get_dirent_type-in-xlog.c.patchDownload
From a09666d6ca3d3071b16832033b3deec6a04fd75c Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 10 Aug 2022 07:12:15 +0000
Subject: [PATCH v14] Replace lstat() with get_dirent_type() in xlog.c
---
src/backend/access/transam/xlog.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 34f0150d1e..2b39b6f3a9 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -657,8 +657,9 @@ static void PreallocXlogFiles(XLogRecPtr endptr, TimeLineID tli);
static void RemoveTempXlogFiles(void);
static void RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr lastredoptr,
XLogRecPtr endptr, TimeLineID insertTLI);
-static void RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
- XLogSegNo *endlogSegNo, TimeLineID insertTLI);
+static void RemoveXlogFile(const char *segname, const struct dirent *xlde,
+ XLogSegNo recycleSegNo, XLogSegNo *endlogSegNo,
+ TimeLineID insertTLI);
static void UpdateLastRemovedPtr(char *filename);
static void ValidateXLOGDirectoryStructure(void);
static void CleanupBackupHistory(void);
@@ -3597,7 +3598,7 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr lastredoptr, XLogRecPtr endptr,
/* Update the last removed location in shared memory first */
UpdateLastRemovedPtr(xlde->d_name);
- RemoveXlogFile(xlde->d_name, recycleSegNo, &endlogSegNo,
+ RemoveXlogFile(xlde->d_name, xlde, recycleSegNo, &endlogSegNo,
insertTLI);
}
}
@@ -3670,7 +3671,7 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI)
* - but seems safer to let them be archived and removed later.
*/
if (!XLogArchiveIsReady(xlde->d_name))
- RemoveXlogFile(xlde->d_name, recycleSegNo, &endLogSegNo,
+ RemoveXlogFile(xlde->d_name, xlde, recycleSegNo, &endLogSegNo,
newTLI);
}
}
@@ -3692,14 +3693,14 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI)
* should be used for this timeline.
*/
static void
-RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
- XLogSegNo *endlogSegNo, TimeLineID insertTLI)
+RemoveXlogFile(const char *segname, const struct dirent *xlde,
+ XLogSegNo recycleSegNo, XLogSegNo *endlogSegNo,
+ TimeLineID insertTLI)
{
char path[MAXPGPATH];
#ifdef WIN32
char newpath[MAXPGPATH];
#endif
- struct stat statbuf;
snprintf(path, MAXPGPATH, XLOGDIR "/%s", segname);
@@ -3711,7 +3712,7 @@ RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
if (wal_recycle &&
*endlogSegNo <= recycleSegNo &&
XLogCtl->InstallXLogFileSegmentActive && /* callee rechecks this */
- lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) &&
+ get_dirent_type(path, xlde, false, LOG) != PGFILETYPE_REG &&
InstallXLogFileSegment(endlogSegNo, path,
true, recycleSegNo, insertTLI))
{
--
2.34.1
On Wed, Aug 10, 2022 at 7:15 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
The main idea of using get_dirent_type() instead of lstat or stat is
to benefit from avoiding system calls on platforms where the directory
entry type is stored in dirent structure, but not to cause errors for
lstat or stat system calls failures where we previously didn't.
Yeah, I get it... I'm OK with separating mechanical changes like
lstat() -> get_dirent_type() from policy changes on error handling. I
initially thought the ERROR was surely better than what we're doing
now (making extra system calls to avoid unlinking unexpected things,
for which our only strategy on failure is to try to unlink the thing
anyway), but it's better to discuss that separately.
I
think 0001 must be just replacing lstat or stat with get_dirent_type()
with elevel ERROR if lstat or stat failure is treated as ERROR
previously, otherwise with elevel LOG. I modified it as attached. Once
this patch gets committed, then we can continue the discussion as to
whether we make elog-LOG to elog-ERROR or vice-versa.
Agreed, will look at this version tomorrow.
I'm tempted to use get_dirent_type() in RemoveXlogFile() but that
requires us to pass struct dirent * from RemoveOldXlogFiles() (as
attached 0002 for now), If done, this avoids one lstat() system call
per WAL file. IMO, that's okay to pass an extra function parameter to
RemoveXlogFile() as it is a static function and there can be many
(thousands of) WAL files at times, so saving system calls (lstat() *
number of WAL files) is definitely an advantage.
- lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) &&
+ get_dirent_type(path, xlde, false, LOG) != PGFILETYPE_REG &&
I think you wanted ==, but maybe it'd be nicer to pass in a
"recyclable" boolean to RemoveXlogFile()?
If you're looking for more, rmtree() looks like another.
On Wed, Aug 10, 2022 at 2:11 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Wed, Aug 10, 2022 at 7:15 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:The main idea of using get_dirent_type() instead of lstat or stat is
to benefit from avoiding system calls on platforms where the directory
entry type is stored in dirent structure, but not to cause errors for
lstat or stat system calls failures where we previously didn't.Yeah, I get it... I'm OK with separating mechanical changes like
lstat() -> get_dirent_type() from policy changes on error handling. I
initially thought the ERROR was surely better than what we're doing
now (making extra system calls to avoid unlinking unexpected things,
for which our only strategy on failure is to try to unlink the thing
anyway), but it's better to discuss that separately.I
think 0001 must be just replacing lstat or stat with get_dirent_type()
with elevel ERROR if lstat or stat failure is treated as ERROR
previously, otherwise with elevel LOG. I modified it as attached. Once
this patch gets committed, then we can continue the discussion as to
whether we make elog-LOG to elog-ERROR or vice-versa.Agreed, will look at this version tomorrow.
Thanks.
I'm tempted to use get_dirent_type() in RemoveXlogFile() but that
requires us to pass struct dirent * from RemoveOldXlogFiles() (as
attached 0002 for now), If done, this avoids one lstat() system call
per WAL file. IMO, that's okay to pass an extra function parameter to
RemoveXlogFile() as it is a static function and there can be many
(thousands of) WAL files at times, so saving system calls (lstat() *
number of WAL files) is definitely an advantage.- lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) && + get_dirent_type(path, xlde, false, LOG) != PGFILETYPE_REG &&I think you wanted ==, but maybe it'd be nicer to pass in a
"recyclable" boolean to RemoveXlogFile()?
Ah, my bad, I corrected it now.
If you mean to do "bool recyclable = (get_dirent_type(path, xlde,
false, LOG) == PGFILETYPE_REG)"" and pass it as parameter to
RemoveXlogFile(), then we have to do get_dirent_type calls for every
WAL file even when any of (wal_recycle && *endlogSegNo <= recycleSegNo
&& XLogCtl->InstallXLogFileSegmentActive) is false which is
unnecessary and it's better to pass dirent structure as a parameter.
If you're looking for more, rmtree() looks like another.
Are you suggesting to not use pgfnames() in rmtree() and do
opendir()-readdir()-get_dirent_type() directly? If yes, I don't think
it's a great idea because rmtree() has recursive calls and holding
opendir()-readdir() for all rmtree() recursive calls without
closedir() may eat up dynamic memory if there are deeper level
directories. I would like to leave it that way. Do you have any other
ideas in mind?
Please find the v15 patch, I merged the RemoveXlogFile() changes into 0001.
--
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
Attachments:
v15-0001-Make-more-use-of-get_dirent_type-in-place-of-sta.patchapplication/octet-stream; name=v15-0001-Make-more-use-of-get_dirent_type-in-place-of-sta.patchDownload
From a6bd634321d9b42714f4e24105fe4662d45844a4 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 10 Aug 2022 09:36:07 +0000
Subject: [PATCH v15] Make more use of get_dirent_type() in place of stat() or
lstat().
We don't need to call stat() or lstat() in several ReadDir() loops, if
d_type is available, as it usually is on most systems. If not,
get_dirent_type() will do that under the covers.
Authors: Nathan Bossart <nathandbossart@gmail.com>, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220215231123.GA2584239%40nathanxps13
---
src/backend/access/heap/rewriteheap.c | 4 +-
src/backend/access/transam/xlog.c | 17 ++++----
.../replication/logical/reorderbuffer.c | 12 +++---
src/backend/replication/logical/snapbuild.c | 5 +--
src/backend/replication/slot.c | 4 +-
src/backend/storage/file/copydir.c | 21 +++-------
src/backend/storage/file/fd.c | 20 +++++----
src/backend/utils/misc/guc-file.l | 42 +++++++------------
src/timezone/pgtz.c | 8 +---
9 files changed, 57 insertions(+), 76 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 9dd885d936..a00874f0f7 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -113,6 +113,7 @@
#include "access/xact.h"
#include "access/xloginsert.h"
#include "catalog/catalog.h"
+#include "common/file_utils.h"
#include "lib/ilist.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1213,7 +1214,6 @@ CheckPointLogicalRewriteHeap(void)
mappings_dir = AllocateDir("pg_logical/mappings");
while ((mapping_de = ReadDir(mappings_dir, "pg_logical/mappings")) != NULL)
{
- struct stat statbuf;
Oid dboid;
Oid relid;
XLogRecPtr lsn;
@@ -1227,7 +1227,7 @@ CheckPointLogicalRewriteHeap(void)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, mapping_de, false, LOG) != PGFILETYPE_REG)
continue;
/* Skip over files that cannot be ours. */
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 34f0150d1e..073c6f5378 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -657,8 +657,9 @@ static void PreallocXlogFiles(XLogRecPtr endptr, TimeLineID tli);
static void RemoveTempXlogFiles(void);
static void RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr lastredoptr,
XLogRecPtr endptr, TimeLineID insertTLI);
-static void RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
- XLogSegNo *endlogSegNo, TimeLineID insertTLI);
+static void RemoveXlogFile(const char *segname, const struct dirent *xlde,
+ XLogSegNo recycleSegNo, XLogSegNo *endlogSegNo,
+ TimeLineID insertTLI);
static void UpdateLastRemovedPtr(char *filename);
static void ValidateXLOGDirectoryStructure(void);
static void CleanupBackupHistory(void);
@@ -3597,7 +3598,7 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr lastredoptr, XLogRecPtr endptr,
/* Update the last removed location in shared memory first */
UpdateLastRemovedPtr(xlde->d_name);
- RemoveXlogFile(xlde->d_name, recycleSegNo, &endlogSegNo,
+ RemoveXlogFile(xlde->d_name, xlde, recycleSegNo, &endlogSegNo,
insertTLI);
}
}
@@ -3670,7 +3671,7 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI)
* - but seems safer to let them be archived and removed later.
*/
if (!XLogArchiveIsReady(xlde->d_name))
- RemoveXlogFile(xlde->d_name, recycleSegNo, &endLogSegNo,
+ RemoveXlogFile(xlde->d_name, xlde, recycleSegNo, &endLogSegNo,
newTLI);
}
}
@@ -3692,14 +3693,14 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI)
* should be used for this timeline.
*/
static void
-RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
- XLogSegNo *endlogSegNo, TimeLineID insertTLI)
+RemoveXlogFile(const char *segname, const struct dirent *xlde,
+ XLogSegNo recycleSegNo, XLogSegNo *endlogSegNo,
+ TimeLineID insertTLI)
{
char path[MAXPGPATH];
#ifdef WIN32
char newpath[MAXPGPATH];
#endif
- struct stat statbuf;
snprintf(path, MAXPGPATH, XLOGDIR "/%s", segname);
@@ -3711,7 +3712,7 @@ RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
if (wal_recycle &&
*endlogSegNo <= recycleSegNo &&
XLogCtl->InstallXLogFileSegmentActive && /* callee rechecks this */
- lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) &&
+ get_dirent_type(path, xlde, false, LOG) == PGFILETYPE_REG &&
InstallXLogFileSegment(endlogSegNo, path,
true, recycleSegNo, insertTLI))
{
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 88a37fde72..c37e970de5 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -91,6 +91,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "common/file_utils.h"
#include "lib/binaryheap.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -4407,15 +4408,10 @@ ReorderBufferCleanupSerializedTXNs(const char *slotname)
{
DIR *spill_dir;
struct dirent *spill_de;
- struct stat statbuf;
char path[MAXPGPATH * 2 + 12];
sprintf(path, "pg_replslot/%s", slotname);
- /* we're only handling directories here, skip if it's not ours */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
- return;
-
spill_dir = AllocateDir(path);
while ((spill_de = ReadDirExtended(spill_dir, path, INFO)) != NULL)
{
@@ -4463,6 +4459,7 @@ StartupReorderBuffer(void)
{
DIR *logical_dir;
struct dirent *logical_de;
+ char path[MAXPGPATH * 2 + 12];
logical_dir = AllocateDir("pg_replslot");
while ((logical_de = ReadDir(logical_dir, "pg_replslot")) != NULL)
@@ -4475,6 +4472,11 @@ StartupReorderBuffer(void)
if (!ReplicationSlotValidateName(logical_de->d_name, DEBUG2))
continue;
+ /* we're only handling directories here, skip if it's not one */
+ sprintf(path, "pg_replslot/%s", logical_de->d_name);
+ if (get_dirent_type(path, logical_de, false, LOG) != PGFILETYPE_DIR)
+ continue;
+
/*
* ok, has to be a surviving logical slot, iterate and delete
* everything starting with xid-*
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 73c0f15214..5ba185fa0d 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -123,6 +123,7 @@
#include "access/heapam_xlog.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "replication/logical.h"
@@ -1946,15 +1947,13 @@ CheckPointSnapBuild(void)
uint32 hi;
uint32 lo;
XLogRecPtr lsn;
- struct stat statbuf;
if (strcmp(snap_de->d_name, ".") == 0 ||
strcmp(snap_de->d_name, "..") == 0)
continue;
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
-
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (get_dirent_type(path, snap_de, false, LOG) != PGFILETYPE_REG)
{
elog(DEBUG1, "only regular files expected: %s", path);
continue;
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 850b74936f..dffd87b335 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -41,6 +41,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
+#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1442,7 +1443,6 @@ StartupReplicationSlots(void)
replication_dir = AllocateDir("pg_replslot");
while ((replication_de = ReadDir(replication_dir, "pg_replslot")) != NULL)
{
- struct stat statbuf;
char path[MAXPGPATH + 12];
if (strcmp(replication_de->d_name, ".") == 0 ||
@@ -1452,7 +1452,7 @@ StartupReplicationSlots(void)
snprintf(path, sizeof(path), "pg_replslot/%s", replication_de->d_name);
/* we're only creating directories here, skip if it's not our's */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(path, replication_de, false, LOG) != PGFILETYPE_DIR)
continue;
/* we crashed while a slot was being setup or deleted, clean up */
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 658fd95ba9..8a866191e1 100644
--- a/src/backend/storage/file/copydir.c
+++ b/src/backend/storage/file/copydir.c
@@ -22,6 +22,7 @@
#include <unistd.h>
#include <sys/stat.h>
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "storage/copydir.h"
@@ -50,7 +51,7 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, fromdir)) != NULL)
{
- struct stat fst;
+ PGFileType xlde_type;
/* If we got a cancel signal during the copy of the directory, quit */
CHECK_FOR_INTERRUPTS();
@@ -62,18 +63,15 @@ copydir(char *fromdir, char *todir, bool recurse)
snprintf(fromfile, sizeof(fromfile), "%s/%s", fromdir, xlde->d_name);
snprintf(tofile, sizeof(tofile), "%s/%s", todir, xlde->d_name);
- if (lstat(fromfile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", fromfile)));
+ xlde_type = get_dirent_type(fromfile, xlde, false, ERROR);
- if (S_ISDIR(fst.st_mode))
+ if (xlde_type == PGFILETYPE_DIR)
{
/* recurse to handle subdirectories */
if (recurse)
copydir(fromfile, tofile, true);
}
- else if (S_ISREG(fst.st_mode))
+ else if (xlde_type == PGFILETYPE_REG)
copy_file(fromfile, tofile);
}
FreeDir(xldir);
@@ -89,8 +87,6 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, todir)) != NULL)
{
- struct stat fst;
-
if (strcmp(xlde->d_name, ".") == 0 ||
strcmp(xlde->d_name, "..") == 0)
continue;
@@ -101,12 +97,7 @@ copydir(char *fromdir, char *todir, bool recurse)
* We don't need to sync subdirectories here since the recursive
* copydir will do it before it returns
*/
- if (lstat(tofile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", tofile)));
-
- if (S_ISREG(fst.st_mode))
+ if (get_dirent_type(tofile, xlde, false, ERROR) == PGFILETYPE_REG)
fsync_fname(tofile, false);
}
FreeDir(xldir);
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index efb34d4dcb..64764a287d 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -2708,6 +2708,10 @@ TryAgain:
*
* The pathname passed to AllocateDir must be passed to this routine too,
* but it is only used for error reporting.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDir(DIR *dir, const char *dirname)
@@ -2723,6 +2727,10 @@ ReadDir(DIR *dir, const char *dirname)
* If elevel < ERROR, returns NULL after any error. With the normal coding
* pattern, this will result in falling out of the loop immediately as
* though the directory contained no (more) entries.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDirExtended(DIR *dir, const char *dirname, int elevel)
@@ -3159,17 +3167,11 @@ RemovePgTempFilesInDir(const char *tmpdirname, bool missing_ok, bool unlink_all)
PG_TEMP_FILE_PREFIX,
strlen(PG_TEMP_FILE_PREFIX)) == 0)
{
- struct stat statbuf;
+ PGFileType type = get_dirent_type(rm_path, temp_de, false, LOG);
- if (lstat(rm_path, &statbuf) < 0)
- {
- ereport(LOG,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", rm_path)));
+ if (type == PGFILETYPE_ERROR)
continue;
- }
-
- if (S_ISDIR(statbuf.st_mode))
+ else if (type == PGFILETYPE_DIR)
{
/* recursively remove contents, then directory itself */
RemovePgTempFilesInDir(rm_path, false, true);
diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l
index ce5633844c..88460422dd 100644
--- a/src/backend/utils/misc/guc-file.l
+++ b/src/backend/utils/misc/guc-file.l
@@ -14,6 +14,7 @@
#include <ctype.h>
#include <unistd.h>
+#include "common/file_utils.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "storage/fd.h"
@@ -1017,7 +1018,7 @@ ParseConfigDirectory(const char *includedir,
while ((de = ReadDir(d, directory)) != NULL)
{
- struct stat st;
+ PGFileType de_type;
char filename[MAXPGPATH];
/*
@@ -1034,32 +1035,9 @@ ParseConfigDirectory(const char *includedir,
join_path_components(filename, directory, de->d_name);
canonicalize_path(filename);
- if (stat(filename, &st) == 0)
+ de_type = get_dirent_type(filename, de, true, elevel);
+ if (de_type == PGFILETYPE_ERROR)
{
- if (!S_ISDIR(st.st_mode))
- {
- /* Add file to array, increasing its size in blocks of 32 */
- if (num_filenames >= size_filenames)
- {
- size_filenames += 32;
- filenames = (char **) repalloc(filenames,
- size_filenames * sizeof(char *));
- }
- filenames[num_filenames] = pstrdup(filename);
- num_filenames++;
- }
- }
- else
- {
- /*
- * stat does not care about permissions, so the most likely reason
- * a file can't be accessed now is if it was removed between the
- * directory listing and now.
- */
- ereport(elevel,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m",
- filename)));
record_config_file_error(psprintf("could not stat file \"%s\"",
filename),
calling_file, calling_lineno,
@@ -1067,6 +1045,18 @@ ParseConfigDirectory(const char *includedir,
status = false;
goto cleanup;
}
+ else if (de_type != PGFILETYPE_DIR)
+ {
+ /* Add file to array, increasing its size in blocks of 32 */
+ if (num_filenames >= size_filenames)
+ {
+ size_filenames += 32;
+ filenames = (char **) repalloc(filenames,
+ size_filenames * sizeof(char *));
+ }
+ filenames[num_filenames] = pstrdup(filename);
+ num_filenames++;
+ }
}
if (num_filenames > 0)
diff --git a/src/timezone/pgtz.c b/src/timezone/pgtz.c
index 3c278dd0f3..72f9e3d5e6 100644
--- a/src/timezone/pgtz.c
+++ b/src/timezone/pgtz.c
@@ -17,6 +17,7 @@
#include <sys/stat.h>
#include <time.h>
+#include "common/file_utils.h"
#include "datatype/timestamp.h"
#include "miscadmin.h"
#include "pgtz.h"
@@ -429,7 +430,6 @@ pg_tzenumerate_next(pg_tzenum *dir)
{
struct dirent *direntry;
char fullname[MAXPGPATH * 2];
- struct stat statbuf;
direntry = ReadDir(dir->dirdesc[dir->depth], dir->dirname[dir->depth]);
@@ -447,12 +447,8 @@ pg_tzenumerate_next(pg_tzenum *dir)
snprintf(fullname, sizeof(fullname), "%s/%s",
dir->dirname[dir->depth], direntry->d_name);
- if (stat(fullname, &statbuf) != 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat \"%s\": %m", fullname)));
- if (S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(fullname, direntry, true, ERROR) == PGFILETYPE_DIR)
{
/* Step into the subdirectory */
if (dir->depth >= MAX_TZDIR_DEPTH - 1)
--
2.34.1
On Wed, Aug 10, 2022 at 03:28:25PM +0530, Bharath Rupireddy wrote:
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name); - if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode)) + if (get_dirent_type(path, mapping_de, false, LOG) != PGFILETYPE_REG) continue;
Previously, failure to lstat() wouldn't lead to skipping the entry. With
this patch, a failure to determine the file type will cause the entry to be
skipped. This might be okay in some places (e.g., CheckPointSnapBuild())
but not in others. For example, in CheckPointLogicalRewriteHeap(), this
could cause us to skip fsync-ing a file due to a get_dirent_type() failure,
which seems bad.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Wed, Aug 17, 2022 at 2:02 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Wed, Aug 10, 2022 at 03:28:25PM +0530, Bharath Rupireddy wrote:
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name); - if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode)) + if (get_dirent_type(path, mapping_de, false, LOG) != PGFILETYPE_REG) continue;Previously, failure to lstat() wouldn't lead to skipping the entry. With
this patch, a failure to determine the file type will cause the entry to be
skipped. This might be okay in some places (e.g., CheckPointSnapBuild())
but not in others. For example, in CheckPointLogicalRewriteHeap(), this
could cause us to skip fsync-ing a file due to a get_dirent_type() failure,
which seems bad.
Hm. I corrected it in the v16 patch, please review.
--
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
Attachments:
v16-0001-Make-more-use-of-get_dirent_type-in-place-of-sta.patchapplication/octet-stream; name=v16-0001-Make-more-use-of-get_dirent_type-in-place-of-sta.patchDownload
From d5dd101c7c360706fb28f977703ceb1a3aadf201 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 17 Aug 2022 07:05:01 +0000
Subject: [PATCH v16] Make more use of get_dirent_type() in place of stat() or
lstat()
We don't need to call stat() or lstat() in several ReadDir() loops, if
d_type is available, as it usually is on most systems. If not,
get_dirent_type() will do that under the covers.
Authors: Nathan Bossart, Bharath Rupireddy
Reviewed-by: Thomas Munro
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/20220215231123.GA2584239%40nathanxps13
---
src/backend/access/heap/rewriteheap.c | 7 +++-
src/backend/access/transam/xlog.c | 17 ++++----
.../replication/logical/reorderbuffer.c | 28 +++++++++----
src/backend/replication/logical/snapbuild.c | 6 ++-
src/backend/replication/slot.c | 6 ++-
src/backend/storage/file/copydir.c | 21 +++-------
src/backend/storage/file/fd.c | 20 +++++----
src/backend/utils/misc/guc-file.l | 42 +++++++------------
src/timezone/pgtz.c | 8 +---
9 files changed, 78 insertions(+), 77 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 9dd885d936..5b20d1e1d5 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -113,6 +113,7 @@
#include "access/xact.h"
#include "access/xloginsert.h"
#include "catalog/catalog.h"
+#include "common/file_utils.h"
#include "lib/ilist.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1213,7 +1214,6 @@ CheckPointLogicalRewriteHeap(void)
mappings_dir = AllocateDir("pg_logical/mappings");
while ((mapping_de = ReadDir(mappings_dir, "pg_logical/mappings")) != NULL)
{
- struct stat statbuf;
Oid dboid;
Oid relid;
XLogRecPtr lsn;
@@ -1221,13 +1221,16 @@ CheckPointLogicalRewriteHeap(void)
TransactionId create_xid;
uint32 hi,
lo;
+ PGFileType de_type;
if (strcmp(mapping_de->d_name, ".") == 0 ||
strcmp(mapping_de->d_name, "..") == 0)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ de_type = get_dirent_type(path, mapping_de, false, LOG);
+
+ if (de_type != PGFILETYPE_ERROR && de_type != PGFILETYPE_REG)
continue;
/* Skip over files that cannot be ours. */
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 87b243e0d4..16363a8f84 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -657,8 +657,9 @@ static void PreallocXlogFiles(XLogRecPtr endptr, TimeLineID tli);
static void RemoveTempXlogFiles(void);
static void RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr lastredoptr,
XLogRecPtr endptr, TimeLineID insertTLI);
-static void RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
- XLogSegNo *endlogSegNo, TimeLineID insertTLI);
+static void RemoveXlogFile(const char *segname, const struct dirent *xlde,
+ XLogSegNo recycleSegNo, XLogSegNo *endlogSegNo,
+ TimeLineID insertTLI);
static void UpdateLastRemovedPtr(char *filename);
static void ValidateXLOGDirectoryStructure(void);
static void CleanupBackupHistory(void);
@@ -3597,7 +3598,7 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr lastredoptr, XLogRecPtr endptr,
/* Update the last removed location in shared memory first */
UpdateLastRemovedPtr(xlde->d_name);
- RemoveXlogFile(xlde->d_name, recycleSegNo, &endlogSegNo,
+ RemoveXlogFile(xlde->d_name, xlde, recycleSegNo, &endlogSegNo,
insertTLI);
}
}
@@ -3670,7 +3671,7 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI)
* - but seems safer to let them be archived and removed later.
*/
if (!XLogArchiveIsReady(xlde->d_name))
- RemoveXlogFile(xlde->d_name, recycleSegNo, &endLogSegNo,
+ RemoveXlogFile(xlde->d_name, xlde, recycleSegNo, &endLogSegNo,
newTLI);
}
}
@@ -3692,14 +3693,14 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI)
* should be used for this timeline.
*/
static void
-RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
- XLogSegNo *endlogSegNo, TimeLineID insertTLI)
+RemoveXlogFile(const char *segname, const struct dirent *xlde,
+ XLogSegNo recycleSegNo, XLogSegNo *endlogSegNo,
+ TimeLineID insertTLI)
{
char path[MAXPGPATH];
#ifdef WIN32
char newpath[MAXPGPATH];
#endif
- struct stat statbuf;
snprintf(path, MAXPGPATH, XLOGDIR "/%s", segname);
@@ -3711,7 +3712,7 @@ RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
if (wal_recycle &&
*endlogSegNo <= recycleSegNo &&
XLogCtl->InstallXLogFileSegmentActive && /* callee rechecks this */
- lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) &&
+ get_dirent_type(path, xlde, false, LOG) == PGFILETYPE_REG &&
InstallXLogFileSegment(endlogSegNo, path,
true, recycleSegNo, insertTLI))
{
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 1c21a1d14b..f61c5b747e 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -91,6 +91,7 @@
#include "access/xact.h"
#include "access/xlog_internal.h"
#include "catalog/catalog.h"
+#include "common/file_utils.h"
#include "lib/binaryheap.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -254,7 +255,8 @@ static void ReorderBufferRestoreChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
static void ReorderBufferRestoreCleanup(ReorderBuffer *rb, ReorderBufferTXN *txn);
static void ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
bool txn_prepared);
-static void ReorderBufferCleanupSerializedTXNs(const char *slotname);
+static void ReorderBufferCleanupSerializedTXNs(const char *slotname,
+ bool slot_dir_checked);
static void ReorderBufferSerializedPath(char *path, ReplicationSlot *slot,
TransactionId xid, XLogSegNo segno);
@@ -375,7 +377,8 @@ ReorderBufferAllocate(void)
* prior exit avoided calling ReorderBufferFree. Failure to do this can
* produce duplicated txns, and it's very cheap if there's nothing there.
*/
- ReorderBufferCleanupSerializedTXNs(NameStr(MyReplicationSlot->data.name));
+ ReorderBufferCleanupSerializedTXNs(NameStr(MyReplicationSlot->data.name),
+ false);
return buffer;
}
@@ -395,7 +398,8 @@ ReorderBufferFree(ReorderBuffer *rb)
MemoryContextDelete(context);
/* Free disk space used by unconsumed reorder buffers */
- ReorderBufferCleanupSerializedTXNs(NameStr(MyReplicationSlot->data.name));
+ ReorderBufferCleanupSerializedTXNs(NameStr(MyReplicationSlot->data.name),
+ false);
}
/*
@@ -4464,7 +4468,7 @@ ReorderBufferRestoreCleanup(ReorderBuffer *rb, ReorderBufferTXN *txn)
* prior crash or decoding session exit.
*/
static void
-ReorderBufferCleanupSerializedTXNs(const char *slotname)
+ReorderBufferCleanupSerializedTXNs(const char *slotname, bool slot_dir_checked)
{
DIR *spill_dir;
struct dirent *spill_de;
@@ -4473,8 +4477,12 @@ ReorderBufferCleanupSerializedTXNs(const char *slotname)
sprintf(path, "pg_replslot/%s", slotname);
- /* we're only handling directories here, skip if it's not ours */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ /*
+ * We're only handling directories here, skip if it's not ours. Also, skip
+ * if the caller has already performed this check.
+ */
+ if (!slot_dir_checked &&
+ lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
return;
spill_dir = AllocateDir(path);
@@ -4524,6 +4532,7 @@ StartupReorderBuffer(void)
{
DIR *logical_dir;
struct dirent *logical_de;
+ char path[MAXPGPATH * 2 + 12];
logical_dir = AllocateDir("pg_replslot");
while ((logical_de = ReadDir(logical_dir, "pg_replslot")) != NULL)
@@ -4536,11 +4545,16 @@ StartupReorderBuffer(void)
if (!ReplicationSlotValidateName(logical_de->d_name, DEBUG2))
continue;
+ /* we're only handling directories here, skip if it's not one */
+ sprintf(path, "pg_replslot/%s", logical_de->d_name);
+ if (get_dirent_type(path, logical_de, false, LOG) != PGFILETYPE_DIR)
+ continue;
+
/*
* ok, has to be a surviving logical slot, iterate and delete
* everything starting with xid-*
*/
- ReorderBufferCleanupSerializedTXNs(logical_de->d_name);
+ ReorderBufferCleanupSerializedTXNs(logical_de->d_name, true);
}
FreeDir(logical_dir);
}
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 1ff2c12240..f90059bb4e 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -123,6 +123,7 @@
#include "access/heapam_xlog.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "replication/logical.h"
@@ -2049,15 +2050,16 @@ CheckPointSnapBuild(void)
uint32 hi;
uint32 lo;
XLogRecPtr lsn;
- struct stat statbuf;
+ PGFileType de_type;
if (strcmp(snap_de->d_name, ".") == 0 ||
strcmp(snap_de->d_name, "..") == 0)
continue;
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
+ de_type = get_dirent_type(path, snap_de, false, LOG);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (de_type != PGFILETYPE_ERROR && de_type != PGFILETYPE_REG)
{
elog(DEBUG1, "only regular files expected: %s", path);
continue;
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 850b74936f..e9b926d7de 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -41,6 +41,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
+#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1442,17 +1443,18 @@ StartupReplicationSlots(void)
replication_dir = AllocateDir("pg_replslot");
while ((replication_de = ReadDir(replication_dir, "pg_replslot")) != NULL)
{
- struct stat statbuf;
char path[MAXPGPATH + 12];
+ PGFileType de_type;
if (strcmp(replication_de->d_name, ".") == 0 ||
strcmp(replication_de->d_name, "..") == 0)
continue;
snprintf(path, sizeof(path), "pg_replslot/%s", replication_de->d_name);
+ de_type = get_dirent_type(path, replication_de, false, LOG);
/* we're only creating directories here, skip if it's not our's */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ if (de_type != PGFILETYPE_ERROR && de_type != PGFILETYPE_DIR)
continue;
/* we crashed while a slot was being setup or deleted, clean up */
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 658fd95ba9..8a866191e1 100644
--- a/src/backend/storage/file/copydir.c
+++ b/src/backend/storage/file/copydir.c
@@ -22,6 +22,7 @@
#include <unistd.h>
#include <sys/stat.h>
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "storage/copydir.h"
@@ -50,7 +51,7 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, fromdir)) != NULL)
{
- struct stat fst;
+ PGFileType xlde_type;
/* If we got a cancel signal during the copy of the directory, quit */
CHECK_FOR_INTERRUPTS();
@@ -62,18 +63,15 @@ copydir(char *fromdir, char *todir, bool recurse)
snprintf(fromfile, sizeof(fromfile), "%s/%s", fromdir, xlde->d_name);
snprintf(tofile, sizeof(tofile), "%s/%s", todir, xlde->d_name);
- if (lstat(fromfile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", fromfile)));
+ xlde_type = get_dirent_type(fromfile, xlde, false, ERROR);
- if (S_ISDIR(fst.st_mode))
+ if (xlde_type == PGFILETYPE_DIR)
{
/* recurse to handle subdirectories */
if (recurse)
copydir(fromfile, tofile, true);
}
- else if (S_ISREG(fst.st_mode))
+ else if (xlde_type == PGFILETYPE_REG)
copy_file(fromfile, tofile);
}
FreeDir(xldir);
@@ -89,8 +87,6 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, todir)) != NULL)
{
- struct stat fst;
-
if (strcmp(xlde->d_name, ".") == 0 ||
strcmp(xlde->d_name, "..") == 0)
continue;
@@ -101,12 +97,7 @@ copydir(char *fromdir, char *todir, bool recurse)
* We don't need to sync subdirectories here since the recursive
* copydir will do it before it returns
*/
- if (lstat(tofile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", tofile)));
-
- if (S_ISREG(fst.st_mode))
+ if (get_dirent_type(tofile, xlde, false, ERROR) == PGFILETYPE_REG)
fsync_fname(tofile, false);
}
FreeDir(xldir);
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index e3b19ca1ed..bd6f1270f9 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -2706,6 +2706,10 @@ TryAgain:
*
* The pathname passed to AllocateDir must be passed to this routine too,
* but it is only used for error reporting.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDir(DIR *dir, const char *dirname)
@@ -2721,6 +2725,10 @@ ReadDir(DIR *dir, const char *dirname)
* If elevel < ERROR, returns NULL after any error. With the normal coding
* pattern, this will result in falling out of the loop immediately as
* though the directory contained no (more) entries.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDirExtended(DIR *dir, const char *dirname, int elevel)
@@ -3157,17 +3165,11 @@ RemovePgTempFilesInDir(const char *tmpdirname, bool missing_ok, bool unlink_all)
PG_TEMP_FILE_PREFIX,
strlen(PG_TEMP_FILE_PREFIX)) == 0)
{
- struct stat statbuf;
+ PGFileType type = get_dirent_type(rm_path, temp_de, false, LOG);
- if (lstat(rm_path, &statbuf) < 0)
- {
- ereport(LOG,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", rm_path)));
+ if (type == PGFILETYPE_ERROR)
continue;
- }
-
- if (S_ISDIR(statbuf.st_mode))
+ else if (type == PGFILETYPE_DIR)
{
/* recursively remove contents, then directory itself */
RemovePgTempFilesInDir(rm_path, false, true);
diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l
index ce5633844c..88460422dd 100644
--- a/src/backend/utils/misc/guc-file.l
+++ b/src/backend/utils/misc/guc-file.l
@@ -14,6 +14,7 @@
#include <ctype.h>
#include <unistd.h>
+#include "common/file_utils.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "storage/fd.h"
@@ -1017,7 +1018,7 @@ ParseConfigDirectory(const char *includedir,
while ((de = ReadDir(d, directory)) != NULL)
{
- struct stat st;
+ PGFileType de_type;
char filename[MAXPGPATH];
/*
@@ -1034,32 +1035,9 @@ ParseConfigDirectory(const char *includedir,
join_path_components(filename, directory, de->d_name);
canonicalize_path(filename);
- if (stat(filename, &st) == 0)
+ de_type = get_dirent_type(filename, de, true, elevel);
+ if (de_type == PGFILETYPE_ERROR)
{
- if (!S_ISDIR(st.st_mode))
- {
- /* Add file to array, increasing its size in blocks of 32 */
- if (num_filenames >= size_filenames)
- {
- size_filenames += 32;
- filenames = (char **) repalloc(filenames,
- size_filenames * sizeof(char *));
- }
- filenames[num_filenames] = pstrdup(filename);
- num_filenames++;
- }
- }
- else
- {
- /*
- * stat does not care about permissions, so the most likely reason
- * a file can't be accessed now is if it was removed between the
- * directory listing and now.
- */
- ereport(elevel,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m",
- filename)));
record_config_file_error(psprintf("could not stat file \"%s\"",
filename),
calling_file, calling_lineno,
@@ -1067,6 +1045,18 @@ ParseConfigDirectory(const char *includedir,
status = false;
goto cleanup;
}
+ else if (de_type != PGFILETYPE_DIR)
+ {
+ /* Add file to array, increasing its size in blocks of 32 */
+ if (num_filenames >= size_filenames)
+ {
+ size_filenames += 32;
+ filenames = (char **) repalloc(filenames,
+ size_filenames * sizeof(char *));
+ }
+ filenames[num_filenames] = pstrdup(filename);
+ num_filenames++;
+ }
}
if (num_filenames > 0)
diff --git a/src/timezone/pgtz.c b/src/timezone/pgtz.c
index 3c278dd0f3..72f9e3d5e6 100644
--- a/src/timezone/pgtz.c
+++ b/src/timezone/pgtz.c
@@ -17,6 +17,7 @@
#include <sys/stat.h>
#include <time.h>
+#include "common/file_utils.h"
#include "datatype/timestamp.h"
#include "miscadmin.h"
#include "pgtz.h"
@@ -429,7 +430,6 @@ pg_tzenumerate_next(pg_tzenum *dir)
{
struct dirent *direntry;
char fullname[MAXPGPATH * 2];
- struct stat statbuf;
direntry = ReadDir(dir->dirdesc[dir->depth], dir->dirname[dir->depth]);
@@ -447,12 +447,8 @@ pg_tzenumerate_next(pg_tzenum *dir)
snprintf(fullname, sizeof(fullname), "%s/%s",
dir->dirname[dir->depth], direntry->d_name);
- if (stat(fullname, &statbuf) != 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat \"%s\": %m", fullname)));
- if (S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(fullname, direntry, true, ERROR) == PGFILETYPE_DIR)
{
/* Step into the subdirectory */
if (dir->depth >= MAX_TZDIR_DEPTH - 1)
--
2.34.1
On Wed, Aug 17, 2022 at 12:39:26PM +0530, Bharath Rupireddy wrote:
+ /* + * We're only handling directories here, skip if it's not ours. Also, skip + * if the caller has already performed this check. + */ + if (!slot_dir_checked && + lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode)) return;
There was previous discussion about removing this stanza from
ReorderBufferCleanupSeralizedTXNs() completely [0]/messages/by-id/20220329224832.GA560657@nathanxps13. If that is still
possible, I think I would choose that over having callers specify whether
to do the directory check.
+ /* we're only handling directories here, skip if it's not one */ + sprintf(path, "pg_replslot/%s", logical_de->d_name); + if (get_dirent_type(path, logical_de, false, LOG) != PGFILETYPE_DIR) + continue;
IIUC an error in get_dirent_type() could cause slots to be skipped here,
which is a behavior change.
[0]: /messages/by-id/20220329224832.GA560657@nathanxps13
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Tue, Aug 23, 2022 at 11:37 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:
On Wed, Aug 17, 2022 at 12:39:26PM +0530, Bharath Rupireddy wrote:
+ /* + * We're only handling directories here, skip if it's not ours. Also, skip + * if the caller has already performed this check. + */ + if (!slot_dir_checked && + lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode)) return;There was previous discussion about removing this stanza from
ReorderBufferCleanupSeralizedTXNs() completely [0]. If that is still
possible, I think I would choose that over having callers specify whether
to do the directory check.
From [0]:
My guess is that this was done because ReorderBufferCleanupSerializedTXNs()
is also called from ReorderBufferAllocate() and ReorderBufferFree().
However, it is odd that we just silently return if the slot path isn't a
directory in those cases. I think we could use get_dirent_type() in
StartupReorderBuffer() as you suggested, and then we could let ReadDir()
ERROR for non-directories for the other callers of
ReorderBufferCleanupSerializedTXNs(). WDYT?
Firstly, removing lstat() completely from
ReorderBufferCleanupSerializedTXNs() is a behavioural change.
ReorderBufferCleanupSerializedTXNs() returning silently to callers
ReorderBufferAllocate() and ReorderBufferFree(), when the slot path
isn't a directory completely makes sense to me because the callers are
trying to clean something and if it doesn't exist that's okay, they
can go ahead. Also, the usage of the ReorderBufferAllocate() and
ReorderBufferFree() is pretty wide and I don't want to risk the new
behaviour.
+ /* we're only handling directories here, skip if it's not one */ + sprintf(path, "pg_replslot/%s", logical_de->d_name); + if (get_dirent_type(path, logical_de, false, LOG) != PGFILETYPE_DIR) + continue;IIUC an error in get_dirent_type() could cause slots to be skipped here,
which is a behavior change.
That behaviour hasn't changed, no? Currently, if lstat() fails in
ReorderBufferCleanupSerializedTXNs() it returns to
StartupReorderBuffer()'s while loop which is in a way continuing with
the other slots, this patch does nothing new.
So, no new patch, please find the latest v16 patch at [1]/messages/by-id/CALj2ACXZPMYXrPZ+CUE0_5DKDnxyqDGRftSgPPx--PWhWB3RtQ@mail.gmail.com.
[1]: /messages/by-id/CALj2ACXZPMYXrPZ+CUE0_5DKDnxyqDGRftSgPPx--PWhWB3RtQ@mail.gmail.com
--
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
On Thu, Aug 25, 2022 at 10:48:08AM +0530, Bharath Rupireddy wrote:
On Tue, Aug 23, 2022 at 11:37 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
IIUC an error in get_dirent_type() could cause slots to be skipped here,
which is a behavior change.That behaviour hasn't changed, no? Currently, if lstat() fails in
ReorderBufferCleanupSerializedTXNs() it returns to
StartupReorderBuffer()'s while loop which is in a way continuing with
the other slots, this patch does nothing new.
Are you sure? FWIW, the changes in reorderbuffer.c for
ReorderBufferCleanupSerializedTXNs() reduce the code readability, in
my opinion, so that's one less argument in favor of this change.
The gain in ParseConfigDirectory() is kind of cool.
pg_tzenumerate_next(), copydir(), RemoveXlogFile()
StartupReplicationSlots(), CheckPointLogicalRewriteHeap() and
RemovePgTempFilesInDir() seem fine, as well. At least these avoid
extra lstat() calls when the file type is unknown, which would be only
a limited number of users where some of the three DT_* are missing
(right?).
--
Michael
On Thu, Aug 25, 2022 at 5:51 PM Michael Paquier <michael@paquier.xyz> wrote:
FWIW, the changes in reorderbuffer.c for
ReorderBufferCleanupSerializedTXNs() reduce the code readability, in
my opinion, so that's one less argument in favor of this change.
Agreed. I reverted the changes.
The gain in ParseConfigDirectory() is kind of cool.
pg_tzenumerate_next(), copydir(), RemoveXlogFile()
StartupReplicationSlots(), CheckPointLogicalRewriteHeap() and
RemovePgTempFilesInDir() seem fine, as well. At least these avoid
extra lstat() calls when the file type is unknown, which would be only
a limited number of users where some of the three DT_* are missing
(right?).
Yes, the idea is to avoid lstat() system calls when possible.
PSA v17 patch with reorderbuffer.c changes reverted.
--
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
Attachments:
v17-0001-Make-more-use-of-get_dirent_type-in-place-of-sta.patchapplication/octet-stream; name=v17-0001-Make-more-use-of-get_dirent_type-in-place-of-sta.patchDownload
From 3ce317e9eb4b37f79e3f822fd1ae53cb5d48b12e Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 27 Aug 2022 08:29:09 +0000
Subject: [PATCH v17] Make more use of get_dirent_type() in place of stat() or
lstat()
We don't need to call stat() or lstat() in several ReadDir() loops, if
d_type is available, as it usually is on most systems. If not,
get_dirent_type() will do that under the covers.
Authors: Nathan Bossart, Bharath Rupireddy
Reviewed-by: Thomas Munro, Andres Freund
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/20220215231123.GA2584239%40nathanxps13
---
src/backend/access/heap/rewriteheap.c | 7 +++-
src/backend/access/transam/xlog.c | 17 +++++----
src/backend/replication/logical/snapbuild.c | 6 ++-
src/backend/replication/slot.c | 6 ++-
src/backend/storage/file/copydir.c | 21 +++--------
src/backend/storage/file/fd.c | 20 +++++-----
src/backend/utils/misc/guc-file.l | 42 ++++++++-------------
src/timezone/pgtz.c | 8 +---
8 files changed, 57 insertions(+), 70 deletions(-)
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 9dd885d936..5b20d1e1d5 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -113,6 +113,7 @@
#include "access/xact.h"
#include "access/xloginsert.h"
#include "catalog/catalog.h"
+#include "common/file_utils.h"
#include "lib/ilist.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1213,7 +1214,6 @@ CheckPointLogicalRewriteHeap(void)
mappings_dir = AllocateDir("pg_logical/mappings");
while ((mapping_de = ReadDir(mappings_dir, "pg_logical/mappings")) != NULL)
{
- struct stat statbuf;
Oid dboid;
Oid relid;
XLogRecPtr lsn;
@@ -1221,13 +1221,16 @@ CheckPointLogicalRewriteHeap(void)
TransactionId create_xid;
uint32 hi,
lo;
+ PGFileType de_type;
if (strcmp(mapping_de->d_name, ".") == 0 ||
strcmp(mapping_de->d_name, "..") == 0)
continue;
snprintf(path, sizeof(path), "pg_logical/mappings/%s", mapping_de->d_name);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ de_type = get_dirent_type(path, mapping_de, false, LOG);
+
+ if (de_type != PGFILETYPE_ERROR && de_type != PGFILETYPE_REG)
continue;
/* Skip over files that cannot be ours. */
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 537845cada..901169946d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -657,8 +657,9 @@ static void PreallocXlogFiles(XLogRecPtr endptr, TimeLineID tli);
static void RemoveTempXlogFiles(void);
static void RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr lastredoptr,
XLogRecPtr endptr, TimeLineID insertTLI);
-static void RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
- XLogSegNo *endlogSegNo, TimeLineID insertTLI);
+static void RemoveXlogFile(const char *segname, const struct dirent *xlde,
+ XLogSegNo recycleSegNo, XLogSegNo *endlogSegNo,
+ TimeLineID insertTLI);
static void UpdateLastRemovedPtr(char *filename);
static void ValidateXLOGDirectoryStructure(void);
static void CleanupBackupHistory(void);
@@ -3596,7 +3597,7 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr lastredoptr, XLogRecPtr endptr,
/* Update the last removed location in shared memory first */
UpdateLastRemovedPtr(xlde->d_name);
- RemoveXlogFile(xlde->d_name, recycleSegNo, &endlogSegNo,
+ RemoveXlogFile(xlde->d_name, xlde, recycleSegNo, &endlogSegNo,
insertTLI);
}
}
@@ -3669,7 +3670,7 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI)
* - but seems safer to let them be archived and removed later.
*/
if (!XLogArchiveIsReady(xlde->d_name))
- RemoveXlogFile(xlde->d_name, recycleSegNo, &endLogSegNo,
+ RemoveXlogFile(xlde->d_name, xlde, recycleSegNo, &endLogSegNo,
newTLI);
}
}
@@ -3691,14 +3692,14 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI)
* should be used for this timeline.
*/
static void
-RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
- XLogSegNo *endlogSegNo, TimeLineID insertTLI)
+RemoveXlogFile(const char *segname, const struct dirent *xlde,
+ XLogSegNo recycleSegNo, XLogSegNo *endlogSegNo,
+ TimeLineID insertTLI)
{
char path[MAXPGPATH];
#ifdef WIN32
char newpath[MAXPGPATH];
#endif
- struct stat statbuf;
snprintf(path, MAXPGPATH, XLOGDIR "/%s", segname);
@@ -3710,7 +3711,7 @@ RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
if (wal_recycle &&
*endlogSegNo <= recycleSegNo &&
XLogCtl->InstallXLogFileSegmentActive && /* callee rechecks this */
- lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) &&
+ get_dirent_type(path, xlde, false, LOG) == PGFILETYPE_REG &&
InstallXLogFileSegment(endlogSegNo, path,
true, recycleSegNo, insertTLI))
{
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 1ff2c12240..f90059bb4e 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -123,6 +123,7 @@
#include "access/heapam_xlog.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "replication/logical.h"
@@ -2049,15 +2050,16 @@ CheckPointSnapBuild(void)
uint32 hi;
uint32 lo;
XLogRecPtr lsn;
- struct stat statbuf;
+ PGFileType de_type;
if (strcmp(snap_de->d_name, ".") == 0 ||
strcmp(snap_de->d_name, "..") == 0)
continue;
snprintf(path, sizeof(path), "pg_logical/snapshots/%s", snap_de->d_name);
+ de_type = get_dirent_type(path, snap_de, false, LOG);
- if (lstat(path, &statbuf) == 0 && !S_ISREG(statbuf.st_mode))
+ if (de_type != PGFILETYPE_ERROR && de_type != PGFILETYPE_REG)
{
elog(DEBUG1, "only regular files expected: %s", path);
continue;
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 850b74936f..e9b926d7de 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -41,6 +41,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
+#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -1442,17 +1443,18 @@ StartupReplicationSlots(void)
replication_dir = AllocateDir("pg_replslot");
while ((replication_de = ReadDir(replication_dir, "pg_replslot")) != NULL)
{
- struct stat statbuf;
char path[MAXPGPATH + 12];
+ PGFileType de_type;
if (strcmp(replication_de->d_name, ".") == 0 ||
strcmp(replication_de->d_name, "..") == 0)
continue;
snprintf(path, sizeof(path), "pg_replslot/%s", replication_de->d_name);
+ de_type = get_dirent_type(path, replication_de, false, LOG);
/* we're only creating directories here, skip if it's not our's */
- if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
+ if (de_type != PGFILETYPE_ERROR && de_type != PGFILETYPE_DIR)
continue;
/* we crashed while a slot was being setup or deleted, clean up */
diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c
index 658fd95ba9..8a866191e1 100644
--- a/src/backend/storage/file/copydir.c
+++ b/src/backend/storage/file/copydir.c
@@ -22,6 +22,7 @@
#include <unistd.h>
#include <sys/stat.h>
+#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "storage/copydir.h"
@@ -50,7 +51,7 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, fromdir)) != NULL)
{
- struct stat fst;
+ PGFileType xlde_type;
/* If we got a cancel signal during the copy of the directory, quit */
CHECK_FOR_INTERRUPTS();
@@ -62,18 +63,15 @@ copydir(char *fromdir, char *todir, bool recurse)
snprintf(fromfile, sizeof(fromfile), "%s/%s", fromdir, xlde->d_name);
snprintf(tofile, sizeof(tofile), "%s/%s", todir, xlde->d_name);
- if (lstat(fromfile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", fromfile)));
+ xlde_type = get_dirent_type(fromfile, xlde, false, ERROR);
- if (S_ISDIR(fst.st_mode))
+ if (xlde_type == PGFILETYPE_DIR)
{
/* recurse to handle subdirectories */
if (recurse)
copydir(fromfile, tofile, true);
}
- else if (S_ISREG(fst.st_mode))
+ else if (xlde_type == PGFILETYPE_REG)
copy_file(fromfile, tofile);
}
FreeDir(xldir);
@@ -89,8 +87,6 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, todir)) != NULL)
{
- struct stat fst;
-
if (strcmp(xlde->d_name, ".") == 0 ||
strcmp(xlde->d_name, "..") == 0)
continue;
@@ -101,12 +97,7 @@ copydir(char *fromdir, char *todir, bool recurse)
* We don't need to sync subdirectories here since the recursive
* copydir will do it before it returns
*/
- if (lstat(tofile, &fst) < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", tofile)));
-
- if (S_ISREG(fst.st_mode))
+ if (get_dirent_type(tofile, xlde, false, ERROR) == PGFILETYPE_REG)
fsync_fname(tofile, false);
}
FreeDir(xldir);
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index e3b19ca1ed..bd6f1270f9 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -2706,6 +2706,10 @@ TryAgain:
*
* The pathname passed to AllocateDir must be passed to this routine too,
* but it is only used for error reporting.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDir(DIR *dir, const char *dirname)
@@ -2721,6 +2725,10 @@ ReadDir(DIR *dir, const char *dirname)
* If elevel < ERROR, returns NULL after any error. With the normal coding
* pattern, this will result in falling out of the loop immediately as
* though the directory contained no (more) entries.
+ *
+ * If you need to discover the file type of an entry, consider using
+ * get_dirent_type() (instead of stat()/lstat()) to avoid extra system calls on
+ * many platforms.
*/
struct dirent *
ReadDirExtended(DIR *dir, const char *dirname, int elevel)
@@ -3157,17 +3165,11 @@ RemovePgTempFilesInDir(const char *tmpdirname, bool missing_ok, bool unlink_all)
PG_TEMP_FILE_PREFIX,
strlen(PG_TEMP_FILE_PREFIX)) == 0)
{
- struct stat statbuf;
+ PGFileType type = get_dirent_type(rm_path, temp_de, false, LOG);
- if (lstat(rm_path, &statbuf) < 0)
- {
- ereport(LOG,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m", rm_path)));
+ if (type == PGFILETYPE_ERROR)
continue;
- }
-
- if (S_ISDIR(statbuf.st_mode))
+ else if (type == PGFILETYPE_DIR)
{
/* recursively remove contents, then directory itself */
RemovePgTempFilesInDir(rm_path, false, true);
diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l
index ce5633844c..88460422dd 100644
--- a/src/backend/utils/misc/guc-file.l
+++ b/src/backend/utils/misc/guc-file.l
@@ -14,6 +14,7 @@
#include <ctype.h>
#include <unistd.h>
+#include "common/file_utils.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "storage/fd.h"
@@ -1017,7 +1018,7 @@ ParseConfigDirectory(const char *includedir,
while ((de = ReadDir(d, directory)) != NULL)
{
- struct stat st;
+ PGFileType de_type;
char filename[MAXPGPATH];
/*
@@ -1034,32 +1035,9 @@ ParseConfigDirectory(const char *includedir,
join_path_components(filename, directory, de->d_name);
canonicalize_path(filename);
- if (stat(filename, &st) == 0)
+ de_type = get_dirent_type(filename, de, true, elevel);
+ if (de_type == PGFILETYPE_ERROR)
{
- if (!S_ISDIR(st.st_mode))
- {
- /* Add file to array, increasing its size in blocks of 32 */
- if (num_filenames >= size_filenames)
- {
- size_filenames += 32;
- filenames = (char **) repalloc(filenames,
- size_filenames * sizeof(char *));
- }
- filenames[num_filenames] = pstrdup(filename);
- num_filenames++;
- }
- }
- else
- {
- /*
- * stat does not care about permissions, so the most likely reason
- * a file can't be accessed now is if it was removed between the
- * directory listing and now.
- */
- ereport(elevel,
- (errcode_for_file_access(),
- errmsg("could not stat file \"%s\": %m",
- filename)));
record_config_file_error(psprintf("could not stat file \"%s\"",
filename),
calling_file, calling_lineno,
@@ -1067,6 +1045,18 @@ ParseConfigDirectory(const char *includedir,
status = false;
goto cleanup;
}
+ else if (de_type != PGFILETYPE_DIR)
+ {
+ /* Add file to array, increasing its size in blocks of 32 */
+ if (num_filenames >= size_filenames)
+ {
+ size_filenames += 32;
+ filenames = (char **) repalloc(filenames,
+ size_filenames * sizeof(char *));
+ }
+ filenames[num_filenames] = pstrdup(filename);
+ num_filenames++;
+ }
}
if (num_filenames > 0)
diff --git a/src/timezone/pgtz.c b/src/timezone/pgtz.c
index 3c278dd0f3..72f9e3d5e6 100644
--- a/src/timezone/pgtz.c
+++ b/src/timezone/pgtz.c
@@ -17,6 +17,7 @@
#include <sys/stat.h>
#include <time.h>
+#include "common/file_utils.h"
#include "datatype/timestamp.h"
#include "miscadmin.h"
#include "pgtz.h"
@@ -429,7 +430,6 @@ pg_tzenumerate_next(pg_tzenum *dir)
{
struct dirent *direntry;
char fullname[MAXPGPATH * 2];
- struct stat statbuf;
direntry = ReadDir(dir->dirdesc[dir->depth], dir->dirname[dir->depth]);
@@ -447,12 +447,8 @@ pg_tzenumerate_next(pg_tzenum *dir)
snprintf(fullname, sizeof(fullname), "%s/%s",
dir->dirname[dir->depth], direntry->d_name);
- if (stat(fullname, &statbuf) != 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not stat \"%s\": %m", fullname)));
- if (S_ISDIR(statbuf.st_mode))
+ if (get_dirent_type(fullname, direntry, true, ERROR) == PGFILETYPE_DIR)
{
/* Step into the subdirectory */
if (dir->depth >= MAX_TZDIR_DEPTH - 1)
--
2.34.1
On Sat, Aug 27, 2022 at 02:06:32PM +0530, Bharath Rupireddy wrote:
PSA v17 patch with reorderbuffer.c changes reverted.
LGTM
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Sun, Aug 28, 2022 at 02:52:43PM -0700, Nathan Bossart wrote:
On Sat, Aug 27, 2022 at 02:06:32PM +0530, Bharath Rupireddy wrote:
PSA v17 patch with reorderbuffer.c changes reverted.
LGTM
Yeah, what you have here looks pretty fine to me, so this is IMO
committable. Let's wait a bit to see if there are any objections from
the others, in case.
--
Michael
Hi,
On 2022-08-27 14:06:32 +0530, Bharath Rupireddy wrote:
@@ -50,7 +51,7 @@ copydir(char *fromdir, char *todir, bool recurse)
while ((xlde = ReadDir(xldir, fromdir)) != NULL) { - struct stat fst; + PGFileType xlde_type;/* If we got a cancel signal during the copy of the directory, quit */
CHECK_FOR_INTERRUPTS();
@@ -62,18 +63,15 @@ copydir(char *fromdir, char *todir, bool recurse)
snprintf(fromfile, sizeof(fromfile), "%s/%s", fromdir, xlde->d_name);
snprintf(tofile, sizeof(tofile), "%s/%s", todir, xlde->d_name);- if (lstat(fromfile, &fst) < 0) - ereport(ERROR, - (errcode_for_file_access(), - errmsg("could not stat file \"%s\": %m", fromfile))); + xlde_type = get_dirent_type(fromfile, xlde, false, ERROR);- if (S_ISDIR(fst.st_mode))
+ if (xlde_type == PGFILETYPE_DIR)
{
/* recurse to handle subdirectories */
if (recurse)
copydir(fromfile, tofile, true);
}
- else if (S_ISREG(fst.st_mode))
+ else if (xlde_type == PGFILETYPE_REG)
copy_file(fromfile, tofile);
}
FreeDir(xldir);
It continues to make no sense to me to add behaviour changes around
error-handling as part of a conversion to get_dirent_type(). I don't at all
understand why e.g. the above change to make copydir() silently skip over
files it can't stat is ok?
Greetings,
Andres Freund
On Wed, Aug 31, 2022 at 12:14:33PM -0700, Andres Freund wrote:
On 2022-08-27 14:06:32 +0530, Bharath Rupireddy wrote:
- if (lstat(fromfile, &fst) < 0) - ereport(ERROR, - (errcode_for_file_access(), - errmsg("could not stat file \"%s\": %m", fromfile))); + xlde_type = get_dirent_type(fromfile, xlde, false, ERROR);- if (S_ISDIR(fst.st_mode))
+ if (xlde_type == PGFILETYPE_DIR)
{
/* recurse to handle subdirectories */
if (recurse)
copydir(fromfile, tofile, true);
}
- else if (S_ISREG(fst.st_mode))
+ else if (xlde_type == PGFILETYPE_REG)
copy_file(fromfile, tofile);
}
FreeDir(xldir);It continues to make no sense to me to add behaviour changes around
error-handling as part of a conversion to get_dirent_type(). I don't at all
understand why e.g. the above change to make copydir() silently skip over
files it can't stat is ok?
In this example, the call to get_dirent_type() should ERROR if the call to
lstat() fails (the "elevel" argument is set to ERROR).
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Hi,
On 2022-08-31 12:24:28 -0700, Nathan Bossart wrote:
On Wed, Aug 31, 2022 at 12:14:33PM -0700, Andres Freund wrote:
On 2022-08-27 14:06:32 +0530, Bharath Rupireddy wrote:
It continues to make no sense to me to add behaviour changes around
error-handling as part of a conversion to get_dirent_type(). I don't at all
understand why e.g. the above change to make copydir() silently skip over
files it can't stat is ok?In this example, the call to get_dirent_type() should ERROR if the call to
lstat() fails (the "elevel" argument is set to ERROR).
Oh, oops. Skimmed code too quickly...
Greetings,
Andres Freund
On Wed, Aug 31, 2022 at 04:10:59PM +0900, Michael Paquier wrote:
Yeah, what you have here looks pretty fine to me, so this is IMO
committable. Let's wait a bit to see if there are any objections from
the others, in case.
I had a few hours and I've spent them looking at what you had here in
details, and there were a few things I have tweaked before applying
the patch. First, elevel was set to LOG for three calls of
get_dirent_type() in code paths where we want to skip entries. This
would have become very noisy, so I've switched two of them to DEBUG1
and the third one to DEBUG2 for consistency with the surroundings. A
second thing was RemoveXlogFile(), where we passed down a dirent entry
*and* its d_name. With this design, it would be easy to mess up
things and pass down a file name that does not match with its dirent
entry, so I have finished by replacing "segname" by the dirent
structure. A last thing was about the two extra comment blocks in
fd.c, and I could not convince myself that these were helpful hints so
I have removed both of them.
--
Michael
On Fri, Sep 02, 2022 at 05:09:14PM +0900, Michael Paquier wrote:
I had a few hours and I've spent them looking at what you had here in
details, and there were a few things I have tweaked before applying
the patch.
Thanks!
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Sat, Sep 3, 2022 at 3:09 AM Nathan Bossart <nathandbossart@gmail.com>
wrote:
On Fri, Sep 02, 2022 at 05:09:14PM +0900, Michael Paquier wrote:
I had a few hours and I've spent them looking at what you had here in
details, and there were a few things I have tweaked before applying
the patch.Thanks!
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.comHi,
Your patch require a rebase. Please provide the latest rebase patch.
=== applying patch
./v17-0001-Make-more-use-of-get_dirent_type-in-place-of-sta.patch
patching file src/backend/access/heap/rewriteheap.c
Hunk #1 FAILED at 113.
Hunk #2 FAILED at 1213.
Hunk #3 FAILED at 1221.
3 out of 3 hunks FAILED -- saving rejects to file
src/backend/access/heap/rewriteheap.c.rej
--
Ibrar Ahmed
On Sat, Sep 03, 2022 at 10:47:32AM +0500, Ibrar Ahmed wrote:
Your patch require a rebase. Please provide the latest rebase patch.
I was looking for a CF entry yesterday when I looked at this patch,
missing https://commitfest.postgresql.org/39/3496/. This has been
applied as of bfb9dfd.
--
Michael