stat() vs ERROR_DELETE_PENDING, round N + 1
Hi,
Back 2017, Michael and Magus apparently fixed a bug report[1]/messages/by-id/20160712083220.1426.58667@wrigleys.postgresql.org about
failing basebackups on Windows due to its concurrent file access
semantics:
commit 9951741bbeb3ec37ca50e9ce3df1808c931ff6a6
Author: Magnus Hagander <magnus@hagander.net>
Date: Wed Jan 4 10:48:30 2017 +0100
Attempt to handle pending-delete files on Windows
I think this has been re-broken by:
commit bed90759fcbcd72d4d06969eebab81e47326f9a2
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri Oct 9 16:20:12 2020 -0400
Fix our Windows stat() emulation to handle file sizes > 4GB.
There's code in there that appears to understand the
ERROR_PENDING_DELETE stuff, but it seems to be too late, as this bit
will fail with ERROR_ACCESS_DENIED first:
/* fast not-exists check */
if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES)
{
_dosmaperr(GetLastError());
return -1;
}
... and if you comment that out, then the CreateFile() call will fail
and we'll return before we get to the code that purports to grok
pending deletes. I don't really understand that code, but I can
report that it's not reached.
This came up because in our work on AIO, we have extra io worker
processes that might have file handles open even in a single session
scenario like 010_pg_basebackup.pl, so we make these types of problems
more likely to hit (hence also my CF entry to fix a related problem in
DROP TABLESPACE). But that's just chance: I assume basebackup could
fail for anyone in 14 for the same reason due to any other backend
that hasn't processed a sinval to close the file yet.
Perhaps we need some combination of the old way (that apparently knew
how to detect pending deletes) and the new way (that knows about large
files)?
[1]: /messages/by-id/20160712083220.1426.58667@wrigleys.postgresql.org
On Thu, Sep 2, 2021 at 10:10 AM Thomas Munro <thomas.munro@gmail.com> wrote:
Perhaps we need some combination of the old way (that apparently knew
how to detect pending deletes) and the new way (that knows about large
files)?
I tried that, but as far as I can tell, the old approach didn't really
work either :-(
A disruptive solution that works in my tests: we could reuse the
global barrier proposed in CF #2962. If you see EACCES, ask every
backend to close all vfds at their next CFI() and wait for them all to
finish, and then retry. If you get EACCES again it really means
EACCES, but you'll very probably get ENOENT.
The cheapest solution would be to assume EACCES really means ENOENT,
but that seems unacceptably incorrect.
I suspect it might be possible to use underdocumented/unstable NtXXX()
interfaces to get at the information, but I don't know much about
that.
Is there another way that is cheap, correct and documented?
Thomas Munro <thomas.munro@gmail.com> writes:
A disruptive solution that works in my tests: we could reuse the
global barrier proposed in CF #2962. If you see EACCES, ask every
backend to close all vfds at their next CFI() and wait for them all to
finish, and then retry. If you get EACCES again it really means
EACCES, but you'll very probably get ENOENT.
That seems quite horrid :-(. But if it works, doesn't that mean that
somewhere we are opening a problematic file without the correct
sharing flags?
regards, tom lane
On Thu, Sep 2, 2021 at 10:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Thomas Munro <thomas.munro@gmail.com> writes:
A disruptive solution that works in my tests: we could reuse the
global barrier proposed in CF #2962. If you see EACCES, ask every
backend to close all vfds at their next CFI() and wait for them all to
finish, and then retry. If you get EACCES again it really means
EACCES, but you'll very probably get ENOENT.That seems quite horrid :-(. But if it works, doesn't that mean that
somewhere we are opening a problematic file without the correct
sharing flags?
I'm no expert, but not AFAICS. We managed to delete the file while
some other backend had it open, which FILE_SHARE_DELETE allowed. We
just can't open it or create a new file with the same name until it's
really gone (all handles closed).
Thomas Munro <thomas.munro@gmail.com> writes:
On Thu, Sep 2, 2021 at 10:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
That seems quite horrid :-(. But if it works, doesn't that mean that
somewhere we are opening a problematic file without the correct
sharing flags?
I'm no expert, but not AFAICS. We managed to delete the file while
some other backend had it open, which FILE_SHARE_DELETE allowed. We
just can't open it or create a new file with the same name until it's
really gone (all handles closed).
Right, but we shouldn't ever need to access such a file --- we couldn't do
so on Unix, certainly. So for the open() case, it's sufficient to return
ENOENT, and the problem is only to make sure that that's what we return if
the underlying error is ERROR_DELETE_PENDING.
It's harder if the desire is to create a new file of the same name.
I'm inclined to think that the best answer might be "if it hurts,
don't do that". We should not have such a case for ordinary relation
files or WAL files, but maybe there are individual other cases where
some redesign is indicated?
regards, tom lane
On Thu, Sep 2, 2021 at 11:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Thomas Munro <thomas.munro@gmail.com> writes:
I'm no expert, but not AFAICS. We managed to delete the file while
some other backend had it open, which FILE_SHARE_DELETE allowed. We
just can't open it or create a new file with the same name until it's
really gone (all handles closed).Right, but we shouldn't ever need to access such a file --- we couldn't do
so on Unix, certainly. So for the open() case, it's sufficient to return
ENOENT, and the problem is only to make sure that that's what we return if
the underlying error is ERROR_DELETE_PENDING.
Yeah. The problem is that it still shows up in directory listings
AFAIK, so something like basebackup.c sees it, and even if it didn't,
it reads the directory, and then stats the files, and then opens the
files at different times. The non-public API way to ask for the real
reason after such a failure would apparently be to call
NtFileCreate(), which can return STATUS_DELETE_PENDING. I don't know
what complications that might involve, but I see now that we have code
that digs such non-public APIs out of ntdll.dll already (for long dead
OS versions only). Hmm.
(Another thing you can't do is delete the directory that contains such
a file, which is a problem for DROP TABLESPACE and the reason I
developed the global barrier thing.)
It's harder if the desire is to create a new file of the same name.
I'm inclined to think that the best answer might be "if it hurts,
don't do that". We should not have such a case for ordinary relation
files or WAL files, but maybe there are individual other cases where
some redesign is indicated?
I guess GetNewRelFileNode()’s dilemma branch is an example; it'd
probably be nicer to users to treat a pending-deleted file as a
collision.
if (access(rpath, F_OK) == 0)
{
/* definite collision */
collides = true;
}
else
{
/*
* Here we have a little bit of a dilemma: if errno is something
* other than ENOENT, should we declare a collision and loop? In
* practice it seems best to go ahead regardless of the errno. If
* there is a colliding file we will get an smgr failure when we
* attempt to create the new relation file.
*/
collides = false;
}
On Fri, Sep 3, 2021 at 12:44 AM Thomas Munro <thomas.munro@gmail.com> wrote:
NtFileCreate()
Erm, that's spelled NtCreateFile. I see Michael mentioned this
before[1]/messages/by-id/a9c76882-27c7-9c92-7843-21d5521b70a9@postgrespro.ru; I don't think it's only available in kernel mode though,
the docs[2]https://docs.microsoft.com/en-us/windows/win32/api/winternl/nf-winternl-ntcreatefile say "This function is the user-mode equivalent to the
ZwCreateFile function", and other open source user space stuff is
using it. It's explicitly internal and subject to change though,
hence my desire to avoid it.
[1]: /messages/by-id/a9c76882-27c7-9c92-7843-21d5521b70a9@postgrespro.ru
[2]: https://docs.microsoft.com/en-us/windows/win32/api/winternl/nf-winternl-ntcreatefile
At Fri, 3 Sep 2021 01:01:43 +1200, Thomas Munro <thomas.munro@gmail.com> wrote in
On Fri, Sep 3, 2021 at 12:44 AM Thomas Munro <thomas.munro@gmail.com> wrote:
NtFileCreate()
Erm, that's spelled NtCreateFile. I see Michael mentioned this
before[1]; I don't think it's only available in kernel mode though,
the docs[2] say "This function is the user-mode equivalent to the
ZwCreateFile function", and other open source user space stuff is
using it. It's explicitly internal and subject to change though,
hence my desire to avoid it.[1] /messages/by-id/a9c76882-27c7-9c92-7843-21d5521b70a9@postgrespro.ru
[2] https://docs.microsoft.com/en-us/windows/win32/api/winternl/nf-winternl-ntcreatefile
Might be stupid, if a delete-pending'ed file can obstruct something,
couldn't we change unlink on Windows to rename to a temporary random
name then remove it? We do something like it explicitly while WAL
file removal. (It may cause degradation on bulk file deletion, and we
may need further fix so that such being-deleted files are excluded
while running a directory scan, though..)
However, looking [1]/messages/by-id/002101d79fc2$c96dff60$5c49fe20$@gmail.com, with that strategy there may be a case where
such "deleted" files may be left alone forever, though.
[1]: /messages/by-id/002101d79fc2$c96dff60$5c49fe20$@gmail.com
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Fri, Sep 3, 2021 at 2:01 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
Might be stupid, if a delete-pending'ed file can obstruct something,
couldn't we change unlink on Windows to rename to a temporary random
name then remove it? We do something like it explicitly while WAL
file removal. (It may cause degradation on bulk file deletion, and we
may need further fix so that such being-deleted files are excluded
while running a directory scan, though..)However, looking [1], with that strategy there may be a case where
such "deleted" files may be left alone forever, though.
It's a good idea. I tested it and it certainly does fix the
basebackup problem I've seen (experimental patch attached). But,
yeah, I'm also a bit worried that that path could be fragile and need
special handling in lots of places.
I also tried writing a new open() wrapper using the lower level
NtCreateFile() interface, and then an updated stat() wrapper built on
top of that. As a non-Windows person, getting that to (mostly) work
involved a fair amount of suffering. I can share that if someone is
interested, but while learning about that family of interfaces, I
realised we could keep the existing Win32-based code, but also
retrieve the NT status, leading to a very small change (experimental
patch attached).
The best idea is probably to set FILE_DISPOSITION_DELETE |
FILE_DISPOSITION_POSIX_SEMANTICS before unlinking. This appears to be
a silver bullet, but isn't available on ancient Windows releases that
we support, or file systems other than local NTFS. So maybe we need a
combination of that + STATUS_DELETE_PENDING as shown in the attached.
I'll look into that next.
Attachments:
0001-Fix-Windows-basebackup-by-renaming-before-unlinking.patchtext/x-patch; charset=US-ASCII; name=0001-Fix-Windows-basebackup-by-renaming-before-unlinking.patchDownload
From 5990d14bb4f9a0ca83fd1b6c7c6e0a6a23786a0d Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sun, 5 Sep 2021 17:07:44 +1200
Subject: [PATCH] Fix Windows basebackup by renaming before unlinking.
Rename files before unlinking on Windows. This solves a problem where
basebackup could fail because other backends have a file handle to a
deleted file, so stat() and open() fail with ERROR_ACCESS_DENIED.
Suggested by Kyotaro Horiguchi <horikyota.ntt@gmail.com>.
XXX This is only a rough sketch to try the idea out!
Discussion: https://postgr.es/m/CA%2BhUKGJz_pZTF9mckn6XgSv69%2BjGwdgLkxZ6b3NWGLBCVjqUZA%40mail.gmail.com
---
src/backend/replication/basebackup.c | 7 +++++
src/backend/storage/file/fd.c | 38 +++++++++++++++++++++++++---
src/backend/storage/smgr/md.c | 6 ++---
src/include/storage/fd.h | 1 +
4 files changed, 46 insertions(+), 6 deletions(-)
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index e71d7406ed..7961a7041b 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -19,6 +19,7 @@
#include "access/xlog_internal.h" /* for pg_start/stop_backup */
#include "catalog/pg_type.h"
#include "common/file_perm.h"
+#include "common/string.h"
#include "commands/progress.h"
#include "lib/stringinfo.h"
#include "libpq/libpq.h"
@@ -1268,6 +1269,12 @@ sendDir(const char *path, int basepathlen, bool sizeonly, List *tablespaces,
strlen(PG_TEMP_FILE_PREFIX)) == 0)
continue;
+#ifdef WIN32
+ /* Skip unlinked files */
+ if (pg_str_endswith(de->d_name, ".unlinked"))
+ continue;
+#endif
+
/*
* Check if the postmaster has signaled us to exit, and abort with an
* error in that case. The error handler further up will call
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 501652654e..43f24d4ceb 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -72,6 +72,7 @@
#include "postgres.h"
+#include <common/string.h>
#include <dirent.h>
#include <sys/file.h>
#include <sys/param.h>
@@ -683,6 +684,35 @@ pg_truncate(const char *path, off_t length)
#endif
}
+/*
+ * Unlink a file. On Windows, rename to a temporary filename before unlinking
+ * so that "path" is available for new files immediately even if other backends
+ * hold descriptors to the one we unlink.
+ */
+int
+pg_unlink(const char *path)
+{
+#ifdef WIN32
+ for (;;)
+ {
+ char tmp_path[MAXPGPATH];
+
+ snprintf(tmp_path, sizeof(tmp_path), "%s.%ld.unlinked", path, random());
+ if (!MoveFileEx(path, tmp_path, 0))
+ {
+ if (GetLastError() == ERROR_FILE_EXISTS)
+ continue; /* try a new random number */
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+
+ return unlink(tmp_path);
+ }
+#else
+ return unlink(path);
+#endif
+}
+
/*
* fsync_fname -- fsync a file or directory, handling errors properly
*
@@ -3269,8 +3299,9 @@ CleanupTempFiles(bool isCommit, bool isProcExit)
* postmaster session
*
* This should be called during postmaster startup. It will forcibly
- * remove any leftover files created by OpenTemporaryFile and any leftover
- * temporary relation files created by mdcreate.
+ * remove any leftover files created by OpenTemporaryFile, any leftover
+ * temporary relation files created by mdcreate, and anything left
+ * behind by pg_unlink() on crash.
*
* During post-backend-crash restart cycle, this routine is called when
* remove_temp_files_after_crash GUC is enabled. Multiple crashes while
@@ -3448,7 +3479,8 @@ RemovePgTempRelationFilesInDbspace(const char *dbspacedirname)
while ((de = ReadDirExtended(dbspace_dir, dbspacedirname, LOG)) != NULL)
{
- if (!looks_like_temp_rel_name(de->d_name))
+ if (!looks_like_temp_rel_name(de->d_name) &&
+ !pg_str_endswith(de->d_name, ".unlinked"))
continue;
snprintf(rm_path, sizeof(rm_path), "%s/%s",
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 8592d47e95..d519855d59 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -374,7 +374,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
/* Next unlink the file, unless it was already found to be missing */
if (ret == 0 || errno != ENOENT)
{
- ret = unlink(path);
+ ret = pg_unlink(path);
if (ret < 0 && errno != ENOENT)
ereport(WARNING,
(errcode_for_file_access(),
@@ -422,7 +422,7 @@ mdunlinkfork(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
register_forget_request(rnode, forkNum, segno);
}
- if (unlink(segpath) < 0)
+ if (pg_unlink(segpath) < 0)
{
/* ENOENT is expected after the last segment... */
if (errno != ENOENT)
@@ -1753,7 +1753,7 @@ mdunlinkfiletag(const FileTag *ftag, char *path)
pfree(p);
/* Try to unlink the file. */
- return unlink(path);
+ return pg_unlink(path);
}
/*
diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h
index 329c7eba9a..be38f6a979 100644
--- a/src/include/storage/fd.h
+++ b/src/include/storage/fd.h
@@ -189,6 +189,7 @@ extern ssize_t pg_pwritev_with_retry(int fd,
int iovcnt,
off_t offset);
extern int pg_truncate(const char *path, off_t length);
+extern int pg_unlink(const char *path);
extern void fsync_fname(const char *fname, bool isdir);
extern int fsync_fname_ext(const char *fname, bool isdir, bool ignore_perm, int elevel);
extern int durable_rename(const char *oldfile, const char *newfile, int loglevel);
--
2.30.2
0001-Handle-STATUS_DELETE_PENDING-on-Windows.patchtext/x-patch; charset=US-ASCII; name=0001-Handle-STATUS_DELETE_PENDING-on-Windows.patchDownload
From 10768af742ccf3c7dcdc8373a7b7f693edff6b81 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sun, 5 Sep 2021 23:49:23 +1200
Subject: [PATCH] Handle STATUS_DELETE_PENDING on Windows.
1. Update our open() wrapper to check for STATUS_DELETE_PENDING and
translate it to appropriate errors.
2. Remove non-working code that intended to do the same thing in our
stat() wrapper, and build on top of open() instead.
XXX Work in progress, see TODO notes in the code
Discussion: https://postgr.es/m/CA%2BhUKGJz_pZTF9mckn6XgSv69%2BjGwdgLkxZ6b3NWGLBCVjqUZA%40mail.gmail.com
---
src/include/port.h | 1 +
src/port/open.c | 72 +++++++++++------
src/port/win32stat.c | 180 ++++---------------------------------------
3 files changed, 65 insertions(+), 188 deletions(-)
diff --git a/src/include/port.h b/src/include/port.h
index 82f63de325..88b2711121 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -290,6 +290,7 @@ extern bool rmtree(const char *path, bool rmtopdir);
* passing of other special options.
*/
#define O_DIRECT 0x80000000
+#define O_X_STAT 0x40000000
extern int pgwin32_open(const char *, int,...);
extern FILE *pgwin32_fopen(const char *, const char *);
#define open(a,b,c) pgwin32_open(a,b,c)
diff --git a/src/port/open.c b/src/port/open.c
index 14c6debba9..de976fc7e3 100644
--- a/src/port/open.c
+++ b/src/port/open.c
@@ -23,6 +23,34 @@
#include <assert.h>
#include <sys/stat.h>
+#include <ntstatus.h>
+
+typedef NTSTATUS (__stdcall *RtlGetLastNtStatus_t)(void);
+
+static RtlGetLastNtStatus_t pg_RtlGetLastNtStatus;
+
+static int
+initialize_ntdll(void)
+{
+ HMODULE module;
+
+ if (pg_RtlGetLastNtStatus)
+ return 0;
+ module = LoadLibraryEx("ntdll.dll", NULL, 0);
+ if (!module)
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+ pg_RtlGetLastNtStatus = (RtlGetLastNtStatus_t) (pg_funcptr_t)
+ GetProcAddress(module, "RtlGetLastNtStatus");
+ if (!pg_RtlGetLastNtStatus)
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+ return 0;
+}
static int
openFlagsToCreateFileFlags(int openFlags)
@@ -66,13 +94,18 @@ pgwin32_open(const char *fileName, int fileFlags,...)
SECURITY_ATTRIBUTES sa;
int loops = 0;
+ if (initialize_ntdll() < 0)
+ return -1;
+
/* Check that we can handle the request */
assert((fileFlags & ((O_RDONLY | O_WRONLY | O_RDWR) | O_APPEND |
(O_RANDOM | O_SEQUENTIAL | O_TEMPORARY) |
_O_SHORT_LIVED | O_DSYNC | O_DIRECT |
+ O_X_STAT |
(O_CREAT | O_TRUNC | O_EXCL) | (O_TEXT | O_BINARY))) == fileFlags);
#ifndef FRONTEND
- Assert(pgwin32_signal_event != NULL); /* small chance of pg_usleep() */
+ /* XXX When called by stat very early on, this fails! */
+ //Assert(pgwin32_signal_event != NULL); /* small chance of pg_usleep() */
#endif
#ifdef FRONTEND
@@ -102,6 +135,7 @@ pgwin32_open(const char *fileName, int fileFlags,...)
&sa,
openFlagsToCreateFileFlags(fileFlags),
FILE_ATTRIBUTE_NORMAL |
+ ((fileFlags & O_X_STAT) ? FILE_FLAG_BACKUP_SEMANTICS : 0) |
((fileFlags & O_RANDOM) ? FILE_FLAG_RANDOM_ACCESS : 0) |
((fileFlags & O_SEQUENTIAL) ? FILE_FLAG_SEQUENTIAL_SCAN : 0) |
((fileFlags & _O_SHORT_LIVED) ? FILE_ATTRIBUTE_TEMPORARY : 0) |
@@ -140,32 +174,20 @@ pgwin32_open(const char *fileName, int fileFlags,...)
/*
* ERROR_ACCESS_DENIED is returned if the file is deleted but not yet
* gone (Windows NT status code is STATUS_DELETE_PENDING). In that
- * case we want to wait a bit and try again, giving up after 1 second
- * (since this condition should never persist very long). However,
- * there are other commonly-hit cases that return ERROR_ACCESS_DENIED,
- * so care is needed. In particular that happens if we try to open a
- * directory, or of course if there's an actual file-permissions
- * problem. To distinguish these cases, try a stat(). In the
- * delete-pending case, it will either also get STATUS_DELETE_PENDING,
- * or it will see the file as gone and fail with ENOENT. In other
- * cases it will usually succeed. The only somewhat-likely case where
- * this coding will uselessly wait is if there's a permissions problem
- * with a containing directory, which we hope will never happen in any
- * performance-critical code paths.
+ * case, we'd better ask for the NT status too so we can translate it
+ * to a more Unix-like error.
+ *
+ * If there's no O_CREAT flag, then we'll pretend the file is
+ * invisible. With O_CREAT, we have no choice but to report that
+ * there's a file in the way (which wouldn't happen on Unix).
*/
- if (err == ERROR_ACCESS_DENIED)
+ if (err == ERROR_ACCESS_DENIED &&
+ pg_RtlGetLastNtStatus() == STATUS_DELETE_PENDING)
{
- if (loops < 10)
- {
- struct stat st;
-
- if (stat(fileName, &st) != 0)
- {
- pg_usleep(100000);
- loops++;
- continue;
- }
- }
+ if (fileFlags & O_CREAT)
+ err = ERROR_FILE_EXISTS;
+ else
+ err = ERROR_FILE_NOT_FOUND;
}
_dosmaperr(err);
diff --git a/src/port/win32stat.c b/src/port/win32stat.c
index 2ad8ee1359..efe5c3221c 100644
--- a/src/port/win32stat.c
+++ b/src/port/win32stat.c
@@ -18,53 +18,6 @@
#include "c.h"
#include <windows.h>
-/*
- * In order to support MinGW and MSVC2013 we use NtQueryInformationFile as an
- * alternative for GetFileInformationByHandleEx. It is loaded from the ntdll
- * library.
- */
-#if _WIN32_WINNT < 0x0600
-#include <winternl.h>
-
-#if !defined(__MINGW32__) && !defined(__MINGW64__)
-/* MinGW includes this in <winternl.h>, but it is missing in MSVC */
-typedef struct _FILE_STANDARD_INFORMATION
-{
- LARGE_INTEGER AllocationSize;
- LARGE_INTEGER EndOfFile;
- ULONG NumberOfLinks;
- BOOLEAN DeletePending;
- BOOLEAN Directory;
-} FILE_STANDARD_INFORMATION;
-#define FileStandardInformation 5
-#endif /* !defined(__MINGW32__) &&
- * !defined(__MINGW64__) */
-
-typedef NTSTATUS (NTAPI * PFN_NTQUERYINFORMATIONFILE)
- (IN HANDLE FileHandle,
- OUT PIO_STATUS_BLOCK IoStatusBlock,
- OUT PVOID FileInformation,
- IN ULONG Length,
- IN FILE_INFORMATION_CLASS FileInformationClass);
-
-static PFN_NTQUERYINFORMATIONFILE _NtQueryInformationFile = NULL;
-
-static HMODULE ntdll = NULL;
-
-/*
- * Load DLL file just once regardless of how many functions we load/call in it.
- */
-static void
-LoadNtdll(void)
-{
- if (ntdll != NULL)
- return;
- ntdll = LoadLibraryEx("ntdll.dll", NULL, 0);
-}
-
-#endif /* _WIN32_WINNT < 0x0600 */
-
-
/*
* Convert a FILETIME struct into a 64 bit time_t.
*/
@@ -161,125 +114,31 @@ fileinfo_to_stat(HANDLE hFile, struct stat *buf)
int
_pgstat64(const char *name, struct stat *buf)
{
+ int fd;
+ int rc;
+ int save_errno;
+
/*
- * We must use a handle so lstat() returns the information of the target
- * file. To have a reliable test for ERROR_DELETE_PENDING, we use
- * NtQueryInformationFile from Windows 2000 or
- * GetFileInformationByHandleEx from Server 2008 / Vista.
+ * Our open wrapper will report STATUS_DELETE_PENDING as ENOENT. We pass
+ * in a special private flag to say that it's _pgstat64() calling, to
+ * activate a mode that allows directories to be opened for limited
+ * purposes.
+ *
+ * XXX Think about fd pressure, since we're opening an fd?
*/
- SECURITY_ATTRIBUTES sa;
- HANDLE hFile;
- int ret;
-#if _WIN32_WINNT < 0x0600
- IO_STATUS_BLOCK ioStatus;
- FILE_STANDARD_INFORMATION standardInfo;
-#else
- FILE_STANDARD_INFO standardInfo;
-#endif
-
- if (name == NULL || buf == NULL)
- {
- errno = EINVAL;
- return -1;
- }
+ fd = pgwin32_open(name, O_RDONLY | O_X_STAT, 0);
- /* fast not-exists check */
- if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES)
- {
- _dosmaperr(GetLastError());
+ if (fd < 0)
return -1;
- }
- /* get a file handle as lightweight as we can */
- sa.nLength = sizeof(SECURITY_ATTRIBUTES);
- sa.bInheritHandle = TRUE;
- sa.lpSecurityDescriptor = NULL;
- hFile = CreateFile(name,
- GENERIC_READ,
- (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
- &sa,
- OPEN_EXISTING,
- (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
- FILE_FLAG_OVERLAPPED),
- NULL);
- if (hFile == INVALID_HANDLE_VALUE)
- {
- DWORD err = GetLastError();
+ rc = _pgfstat64(fd, buf);
+ save_errno = errno;
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-
- memset(&standardInfo, 0, sizeof(standardInfo));
-
-#if _WIN32_WINNT < 0x0600
- if (_NtQueryInformationFile == NULL)
- {
- /* First time through: load ntdll.dll and find NtQueryInformationFile */
- LoadNtdll();
- if (ntdll == NULL)
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-
- _NtQueryInformationFile = (PFN_NTQUERYINFORMATIONFILE) (pg_funcptr_t)
- GetProcAddress(ntdll, "NtQueryInformationFile");
- if (_NtQueryInformationFile == NULL)
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
- }
+ close(fd);
- if (!NT_SUCCESS(_NtQueryInformationFile(hFile, &ioStatus, &standardInfo,
- sizeof(standardInfo),
- FileStandardInformation)))
- {
- DWORD err = GetLastError();
+ errno = save_errno;
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-#else
- if (!GetFileInformationByHandleEx(hFile, FileStandardInfo, &standardInfo,
- sizeof(standardInfo)))
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-#endif /* _WIN32_WINNT < 0x0600 */
-
- if (standardInfo.DeletePending)
- {
- /*
- * File has been deleted, but is not gone from the filesystem yet.
- * This can happen when some process with FILE_SHARE_DELETE has it
- * open, and it will be fully removed once that handle is closed.
- * Meanwhile, we can't open it, so indicate that the file just doesn't
- * exist.
- */
- CloseHandle(hFile);
- errno = ENOENT;
- return -1;
- }
-
- /* At last we can invoke fileinfo_to_stat */
- ret = fileinfo_to_stat(hFile, buf);
-
- CloseHandle(hFile);
- return ret;
+ return rc;
}
/*
@@ -296,11 +155,6 @@ _pgfstat64(int fileno, struct stat *buf)
return -1;
}
- /*
- * Since we already have a file handle there is no need to check for
- * ERROR_DELETE_PENDING.
- */
-
return fileinfo_to_stat(hFile, buf);
}
--
2.30.2
Hi,
On 2021-09-06 01:32:55 +1200, Thomas Munro wrote:
It's a good idea. I tested it and it certainly does fix the
basebackup problem I've seen (experimental patch attached). But,
yeah, I'm also a bit worried that that path could be fragile and need
special handling in lots of places.
It's also expensive-ish.
I also tried writing a new open() wrapper using the lower level
NtCreateFile() interface, and then an updated stat() wrapper built on
top of that. As a non-Windows person, getting that to (mostly) work
involved a fair amount of suffering. I can share that if someone is
interested, but while learning about that family of interfaces, I
realised we could keep the existing Win32-based code, but also
retrieve the NT status, leading to a very small change (experimental
patch attached).
Is it guaranteed, or at least reliable, that the status we fetch with
RtlGetLastNtStatus is actually from the underlying filesystem operation,
rather than some other work that happens during the win32->nt translation?
E.g. a memory allocation or such? Presumably most of such work happens before
the actual nt "syscall", but ...
The best idea is probably to set FILE_DISPOSITION_DELETE |
FILE_DISPOSITION_POSIX_SEMANTICS before unlinking. This appears to be
a silver bullet, but isn't available on ancient Windows releases that
we support, or file systems other than local NTFS. So maybe we need a
combination of that + STATUS_DELETE_PENDING as shown in the attached.
I'll look into that next.
When was that introduced?
I'd be ok to only fix these bugs on e.g. Win10, Win Server 2019, Win Server
2016 or such. I don't think we need to support OSs that the vendor doesn't
support - and I wouldn't count "only security fixes" as support in this
context.
main extended
Windows 10 Oct 14, 2025
Windows Server 2019 Jan 9, 2024 Jan 9, 2029
Windows Server 2016 Jan 11, 2022 Jan 12, 2027
Windows 7 Jan 13, 2015 Jan 14, 2020
Windows Vista Apr 10, 2012 Apr 11, 2017
One absurd detail here is that the deault behaviour changed sometime in
Windows 10's lifetime:
https://stackoverflow.com/questions/60424732/did-the-behaviour-of-deleted-files-open-with-fileshare-delete-change-on-windows
"The behavior changed in recent releases of Windows 10 -- without notice
AFAIK. DeleteFileW now tries to use POSIX semantics if the filesystem supports
it. NTFS does."
#ifndef FRONTEND - Assert(pgwin32_signal_event != NULL); /* small chance of pg_usleep() */ + /* XXX When called by stat very early on, this fails! */ + //Assert(pgwin32_signal_event != NULL); /* small chance of pg_usleep() */ #endif
Perhaps we should move the win32 signal initialization into StartupHacks()?
There's some tension around it using ereport(), and MemoryContextInit() only
being called a tad later, but that seems resolvable.
+ * Our open wrapper will report STATUS_DELETE_PENDING as ENOENT. We pass + * in a special private flag to say that it's _pgstat64() calling, to + * activate a mode that allows directories to be opened for limited + * purposes. + * + * XXX Think about fd pressure, since we're opening an fd? */
If I understand https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/getmaxstdio?view=msvc-160
etc correctly, it looks like there is. But only at the point we do
_open_osfhandle(). So perhaps we should a pgwin32_open() version returning a
handle and make pgwin32_open() a wrapper around that?
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2021-09-06 01:32:55 +1200, Thomas Munro wrote:
The best idea is probably to set FILE_DISPOSITION_DELETE |
FILE_DISPOSITION_POSIX_SEMANTICS before unlinking. This appears to be
a silver bullet, but isn't available on ancient Windows releases that
we support, or file systems other than local NTFS. So maybe we need a
combination of that + STATUS_DELETE_PENDING as shown in the attached.
I'll look into that next.
When was that introduced?
Googling says that it was introduced in Win10, although in RS2 (version
1703, general availability in 4/2017) not the initial release.
I'd be ok to only fix these bugs on e.g. Win10, Win Server 2019, Win Server
2016 or such.
Yeah, particularly if the fix is trivial on the newer systems and
incredibly complicated otherwise. Between the effort needed and
the risk of introducing new bugs, I'm really not excited about
an invasive fix for this.
regards, tom lane
On Mon, Sep 6, 2021 at 9:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
I'd be ok to only fix these bugs on e.g. Win10, Win Server 2019, Win Server
2016 or such.Yeah, particularly if the fix is trivial on the newer systems and
incredibly complicated otherwise. Between the effort needed and
the risk of introducing new bugs, I'm really not excited about
an invasive fix for this.
If DeleteFile() is automatically using
FILE_DISPOSITION_POSIX_SEMANTICS by default when possible on recent
releases as per the SO link that Andres posted above ("18363.657
definitely has the new behavior"), then that's great news and maybe we
shouldn't even bother to try to request that mode ourselves explicitly
(eg in some kind of unlink wrapper). Then we'd need just one
accomodation for older systems and non-NTFS systems, not two, and I
currently think that should be the short and sweet approach shown in
0001-Handle-STATUS_DELETE_PENDING-on-Windows.patch, with some tidying
and adjustments per feedback.
If DeleteFile() is automatically using
FILE_DISPOSITION_POSIX_SEMANTICS by default when possible on recent
releases as per the SO link that Andres posted above ("18363.657
definitely has the new behavior"), then that's great news and maybe we
shouldn't even bother to try to request that mode ourselves explicitly
(eg in some kind of unlink wrapper). Then we'd need just one
accomodation for older systems and non-NTFS systems, not two, and I
currently think that should be the short and sweet approach shown in
0001-Handle-STATUS_DELETE_PENDING-on-Windows.patch, with some tidying
and adjustments per feedback.
Having a non-invasive fix for this long-standing issue would be really
great, even if that means reducing the scope of systems where this can
be fixed.
The last time I poked at the bear (54fb8c7d), there was a test posted
by Alexander Lakhin that was really useful in making sure that
concurrency is correctly handled when a file is unlinked:
/messages/by-id/c3427edf-d7c0-ff57-90f6-b5de3bb62709@gmail.com
It worked with VS but not on MinGW. How does your patch react to this
test?
--
Michael
On Mon, Sep 6, 2021 at 9:44 AM Andres Freund <andres@anarazel.de> wrote:
Is it guaranteed, or at least reliable, that the status we fetch with
RtlGetLastNtStatus is actually from the underlying filesystem operation,
rather than some other work that happens during the win32->nt translation?
E.g. a memory allocation or such? Presumably most of such work happens before
the actual nt "syscall", but ...
I don't know. I know at least that it's thread-local, so that's
something. I guess it's plausible that CreateFile() might want to
free a temporary buffer that it used for conversion to NT pathname
format, and whatever code it uses to do that might clobber the NT
status. Nothing like that seems to happen in common cases though, and
I guess it would also be clobbered on success. Frustrating.
Alright then, here also is the version that bypasses CreateFile() and
goes straight to NtCreateFile(). This way, the status can't possibly
be clobbered before we see it, but maybe there are other risks due to
using a much wider set of unstable ntdll interfaces...
Both versions pass all tests on CI, including the basebackup one in a
scenario where an unlinked file has an open descriptor, but still need
a bit more tidying.
"The behavior changed in recent releases of Windows 10 -- without notice
AFAIK. DeleteFileW now tries to use POSIX semantics if the filesystem supports
it. NTFS does."
Nice find. I wonder if this applies also to rename()...
#ifndef FRONTEND - Assert(pgwin32_signal_event != NULL); /* small chance of pg_usleep() */ + /* XXX When called by stat very early on, this fails! */ + //Assert(pgwin32_signal_event != NULL); /* small chance of pg_usleep() */ #endifPerhaps we should move the win32 signal initialization into StartupHacks()?
There's some tension around it using ereport(), and MemoryContextInit() only
being called a tad later, but that seems resolvable.
The dependencies among open(), pg_usleep(),
pgwin32_signal_initialize() and read_backend_variables() are not very
nice. I don't have a fix for that yet.
+ * XXX Think about fd pressure, since we're opening an fd?
If I understand https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/getmaxstdio?view=msvc-160
etc correctly, it looks like there is. But only at the point we do
_open_osfhandle(). So perhaps we should a pgwin32_open() version returning a
handle and make pgwin32_open() a wrapper around that?
Yeah. Done, in both variants.
I haven't tried it, but I suspect the difference between stat() and
lstat() could be handled with FILE_OPEN_REPARSE_POINT (as
NtCreateFile() calls it) or FILE_FLAG_OPEN_REPARSE_POINT (as
CreateFile() calls it).
Attachments:
v2-0001-Check-for-STATUS_DELETE_PENDING-on-Windows.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Check-for-STATUS_DELETE_PENDING-on-Windows.patchDownload
From 123bac7edb71ef27748187132d0912e617da5b44 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sun, 5 Sep 2021 23:49:23 +1200
Subject: [PATCH v2] Check for STATUS_DELETE_PENDING on Windows.
1. Update our open() wrapper to check for NT's STATUS_DELETE_PENDING
and translate it to appropriate errors.
2. Remove non-working code that was trying to do something similar for
stat(), and just reuse the open() code.
XXX TODO: resolve order-of-operations problem: pgwin32_open_handle() is
caled by _pgstat64() before the Windows signal emulation is started up,
but it might call pg_usleep().
Discussion: https://postgr.es/m/CA%2BhUKGJz_pZTF9mckn6XgSv69%2BjGwdgLkxZ6b3NWGLBCVjqUZA%40mail.gmail.com
---
configure | 6 ++
configure.ac | 1 +
src/include/port.h | 1 +
src/include/port/win32ntdll.h | 22 +++++
src/port/open.c | 101 +++++++++++----------
src/port/win32ntdll.c | 63 +++++++++++++
src/port/win32stat.c | 164 ++--------------------------------
src/tools/msvc/Mkvcbuild.pm | 3 +-
8 files changed, 156 insertions(+), 205 deletions(-)
create mode 100644 src/include/port/win32ntdll.h
create mode 100644 src/port/win32ntdll.c
diff --git a/configure b/configure
index c550cacd5a..adfe03f3f2 100755
--- a/configure
+++ b/configure
@@ -16818,6 +16818,12 @@ esac
;;
esac
+ case " $LIBOBJS " in
+ *" win32ntdll.$ac_objext "* ) ;;
+ *) LIBOBJS="$LIBOBJS win32ntdll.$ac_objext"
+ ;;
+esac
+
case " $LIBOBJS " in
*" win32security.$ac_objext "* ) ;;
*) LIBOBJS="$LIBOBJS win32security.$ac_objext"
diff --git a/configure.ac b/configure.ac
index 2ee710102f..2819b91a8c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1922,6 +1922,7 @@ if test "$PORTNAME" = "win32"; then
AC_LIBOBJ(system)
AC_LIBOBJ(win32env)
AC_LIBOBJ(win32error)
+ AC_LIBOBJ(win32ntdll)
AC_LIBOBJ(win32security)
AC_LIBOBJ(win32setlocale)
AC_LIBOBJ(win32stat)
diff --git a/src/include/port.h b/src/include/port.h
index 82f63de325..ec64be429c 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -290,6 +290,7 @@ extern bool rmtree(const char *path, bool rmtopdir);
* passing of other special options.
*/
#define O_DIRECT 0x80000000
+extern HANDLE pgwin32_open_handle(const char *, int, bool);
extern int pgwin32_open(const char *, int,...);
extern FILE *pgwin32_fopen(const char *, const char *);
#define open(a,b,c) pgwin32_open(a,b,c)
diff --git a/src/include/port/win32ntdll.h b/src/include/port/win32ntdll.h
new file mode 100644
index 0000000000..76b2becf8b
--- /dev/null
+++ b/src/include/port/win32ntdll.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * win32_ntdll.h
+ * Dynamically loaded Windows NT functions.
+ *
+ * Portions Copyright (c) 2021, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/port/win32ntdll.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "c.h"
+
+#include <ntstatus.h>
+
+typedef NTSTATUS (__stdcall *RtlGetLastNtStatus_t)(void);
+
+extern RtlGetLastNtStatus_t pg_RtlGetLastNtStatus;
+
+extern int initialize_ntdll(void);
diff --git a/src/port/open.c b/src/port/open.c
index 14c6debba9..8fddd45c57 100644
--- a/src/port/open.c
+++ b/src/port/open.c
@@ -19,6 +19,8 @@
#include "postgres_fe.h"
#endif
+#include "port/win32ntdll.h"
+
#include <fcntl.h>
#include <assert.h>
#include <sys/stat.h>
@@ -56,37 +58,28 @@ openFlagsToCreateFileFlags(int openFlags)
}
/*
- * - file attribute setting, based on fileMode?
+ * Internal function used by pgwin32_open() and _pgstat64(). When
+ * backup_semantics is true, directories may be opened (for limited uses). On
+ * failure, INVALID_HANDLE_VALUE is returned and errno is set.
*/
-int
-pgwin32_open(const char *fileName, int fileFlags,...)
+HANDLE
+pgwin32_open_handle(const char *fileName, int fileFlags, bool backup_semantics)
{
- int fd;
- HANDLE h = INVALID_HANDLE_VALUE;
+ HANDLE h;
SECURITY_ATTRIBUTES sa;
int loops = 0;
+ if (initialize_ntdll() < 0)
+ return INVALID_HANDLE_VALUE;
+
/* Check that we can handle the request */
assert((fileFlags & ((O_RDONLY | O_WRONLY | O_RDWR) | O_APPEND |
(O_RANDOM | O_SEQUENTIAL | O_TEMPORARY) |
_O_SHORT_LIVED | O_DSYNC | O_DIRECT |
(O_CREAT | O_TRUNC | O_EXCL) | (O_TEXT | O_BINARY))) == fileFlags);
#ifndef FRONTEND
- Assert(pgwin32_signal_event != NULL); /* small chance of pg_usleep() */
-#endif
-
-#ifdef FRONTEND
-
- /*
- * Since PostgreSQL 12, those concurrent-safe versions of open() and
- * fopen() can be used by frontends, having as side-effect to switch the
- * file-translation mode from O_TEXT to O_BINARY if none is specified.
- * Caller may want to enforce the binary or text mode, but if nothing is
- * defined make sure that the default mode maps with what versions older
- * than 12 have been doing.
- */
- if ((fileFlags & O_BINARY) == 0)
- fileFlags |= O_TEXT;
+ /* XXX When called by stat very early on, this fails! */
+ //Assert(pgwin32_signal_event != NULL); /* small chance of pg_usleep() */
#endif
sa.nLength = sizeof(sa);
@@ -102,6 +95,7 @@ pgwin32_open(const char *fileName, int fileFlags,...)
&sa,
openFlagsToCreateFileFlags(fileFlags),
FILE_ATTRIBUTE_NORMAL |
+ (backup_semantics ? FILE_FLAG_BACKUP_SEMANTICS : 0) |
((fileFlags & O_RANDOM) ? FILE_FLAG_RANDOM_ACCESS : 0) |
((fileFlags & O_SEQUENTIAL) ? FILE_FLAG_SEQUENTIAL_SCAN : 0) |
((fileFlags & _O_SHORT_LIVED) ? FILE_ATTRIBUTE_TEMPORARY : 0) |
@@ -140,38 +134,55 @@ pgwin32_open(const char *fileName, int fileFlags,...)
/*
* ERROR_ACCESS_DENIED is returned if the file is deleted but not yet
* gone (Windows NT status code is STATUS_DELETE_PENDING). In that
- * case we want to wait a bit and try again, giving up after 1 second
- * (since this condition should never persist very long). However,
- * there are other commonly-hit cases that return ERROR_ACCESS_DENIED,
- * so care is needed. In particular that happens if we try to open a
- * directory, or of course if there's an actual file-permissions
- * problem. To distinguish these cases, try a stat(). In the
- * delete-pending case, it will either also get STATUS_DELETE_PENDING,
- * or it will see the file as gone and fail with ENOENT. In other
- * cases it will usually succeed. The only somewhat-likely case where
- * this coding will uselessly wait is if there's a permissions problem
- * with a containing directory, which we hope will never happen in any
- * performance-critical code paths.
+ * case, we'd better ask for the NT status too so we can translate it
+ * to a more Unix-like error. We hope that nothing clobbers the NT
+ * status in between the internal NtCreateFile() call and CreateFile()
+ * returning.
+ *
+ * If there's no O_CREAT flag, then we'll pretend the file is
+ * invisible. With O_CREAT, we have no choice but to report that
+ * there's a file in the way (which wouldn't happen on Unix).
*/
- if (err == ERROR_ACCESS_DENIED)
+ if (err == ERROR_ACCESS_DENIED &&
+ pg_RtlGetLastNtStatus() == STATUS_DELETE_PENDING)
{
- if (loops < 10)
- {
- struct stat st;
-
- if (stat(fileName, &st) != 0)
- {
- pg_usleep(100000);
- loops++;
- continue;
- }
- }
+ if (fileFlags & O_CREAT)
+ err = ERROR_FILE_EXISTS;
+ else
+ err = ERROR_FILE_NOT_FOUND;
}
_dosmaperr(err);
- return -1;
+ return INVALID_HANDLE_VALUE;
}
+ return h;
+}
+
+int
+pgwin32_open(const char *fileName, int fileFlags,...)
+{
+ HANDLE h;
+ int fd;
+
+ h = pgwin32_open_handle(fileName, fileFlags, false);
+ if (h == INVALID_HANDLE_VALUE)
+ return -1;
+
+#ifdef FRONTEND
+
+ /*
+ * Since PostgreSQL 12, those concurrent-safe versions of open() and
+ * fopen() can be used by frontends, having as side-effect to switch the
+ * file-translation mode from O_TEXT to O_BINARY if none is specified.
+ * Caller may want to enforce the binary or text mode, but if nothing is
+ * defined make sure that the default mode maps with what versions older
+ * than 12 have been doing.
+ */
+ if ((fileFlags & O_BINARY) == 0)
+ fileFlags |= O_TEXT;
+#endif
+
/* _open_osfhandle will, on error, set errno accordingly */
if ((fd = _open_osfhandle((intptr_t) h, fileFlags & O_APPEND)) < 0)
CloseHandle(h); /* will not affect errno */
diff --git a/src/port/win32ntdll.c b/src/port/win32ntdll.c
new file mode 100644
index 0000000000..eaadafbe9a
--- /dev/null
+++ b/src/port/win32ntdll.c
@@ -0,0 +1,63 @@
+/*-------------------------------------------------------------------------
+ *
+ * win32ntdll.c
+ * Dynamically loaded Windows NT functions.
+ *
+ * Portions Copyright (c) 2021, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/port/win32ntdll.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "c.h"
+
+#include "port/win32ntdll.h"
+
+RtlGetLastNtStatus_t pg_RtlGetLastNtStatus;
+
+int
+initialize_ntdll(void)
+{
+ static bool initialized;
+ HMODULE module;
+
+ static const struct {
+ const char *name;
+ pg_funcptr_t *address;
+ } routines[] = {
+ {"RtlGetLastNtStatus", (pg_funcptr_t *) &pg_RtlGetLastNtStatus}
+ };
+
+ if (initialized)
+ return 0;
+
+ if (!(module = LoadLibraryEx("ntdll.dll", NULL, 0)))
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+
+ for (int i = 0; i < lengthof(routines); ++i)
+ {
+ pg_funcptr_t address;
+
+ address = (pg_funcptr_t) GetProcAddress(module, routines[i].name);
+ if (!address)
+ {
+ _dosmaperr(GetLastError());
+ FreeLibrary(module);
+
+ return -1;
+ }
+
+ *(pg_funcptr_t *) routines[i].address = address;
+ }
+
+ initialized = true;
+
+ return 0;
+}
diff --git a/src/port/win32stat.c b/src/port/win32stat.c
index 2ad8ee1359..c851400dc8 100644
--- a/src/port/win32stat.c
+++ b/src/port/win32stat.c
@@ -18,53 +18,6 @@
#include "c.h"
#include <windows.h>
-/*
- * In order to support MinGW and MSVC2013 we use NtQueryInformationFile as an
- * alternative for GetFileInformationByHandleEx. It is loaded from the ntdll
- * library.
- */
-#if _WIN32_WINNT < 0x0600
-#include <winternl.h>
-
-#if !defined(__MINGW32__) && !defined(__MINGW64__)
-/* MinGW includes this in <winternl.h>, but it is missing in MSVC */
-typedef struct _FILE_STANDARD_INFORMATION
-{
- LARGE_INTEGER AllocationSize;
- LARGE_INTEGER EndOfFile;
- ULONG NumberOfLinks;
- BOOLEAN DeletePending;
- BOOLEAN Directory;
-} FILE_STANDARD_INFORMATION;
-#define FileStandardInformation 5
-#endif /* !defined(__MINGW32__) &&
- * !defined(__MINGW64__) */
-
-typedef NTSTATUS (NTAPI * PFN_NTQUERYINFORMATIONFILE)
- (IN HANDLE FileHandle,
- OUT PIO_STATUS_BLOCK IoStatusBlock,
- OUT PVOID FileInformation,
- IN ULONG Length,
- IN FILE_INFORMATION_CLASS FileInformationClass);
-
-static PFN_NTQUERYINFORMATIONFILE _NtQueryInformationFile = NULL;
-
-static HMODULE ntdll = NULL;
-
-/*
- * Load DLL file just once regardless of how many functions we load/call in it.
- */
-static void
-LoadNtdll(void)
-{
- if (ntdll != NULL)
- return;
- ntdll = LoadLibraryEx("ntdll.dll", NULL, 0);
-}
-
-#endif /* _WIN32_WINNT < 0x0600 */
-
-
/*
* Convert a FILETIME struct into a 64 bit time_t.
*/
@@ -162,120 +115,18 @@ int
_pgstat64(const char *name, struct stat *buf)
{
/*
- * We must use a handle so lstat() returns the information of the target
- * file. To have a reliable test for ERROR_DELETE_PENDING, we use
- * NtQueryInformationFile from Windows 2000 or
- * GetFileInformationByHandleEx from Server 2008 / Vista.
+ * Our open wrapper will report STATUS_DELETE_PENDING as ENOENT. We
+ * request FILE_FLAG_BACKUP_SEMANTICS so that we can open directories too,
+ * for limited purposes. We use the private handle-based version, so we
+ * don't risk running out of fds.
*/
- SECURITY_ATTRIBUTES sa;
HANDLE hFile;
int ret;
-#if _WIN32_WINNT < 0x0600
- IO_STATUS_BLOCK ioStatus;
- FILE_STANDARD_INFORMATION standardInfo;
-#else
- FILE_STANDARD_INFO standardInfo;
-#endif
-
- if (name == NULL || buf == NULL)
- {
- errno = EINVAL;
- return -1;
- }
- /* fast not-exists check */
- if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES)
- {
- _dosmaperr(GetLastError());
- return -1;
- }
-
- /* get a file handle as lightweight as we can */
- sa.nLength = sizeof(SECURITY_ATTRIBUTES);
- sa.bInheritHandle = TRUE;
- sa.lpSecurityDescriptor = NULL;
- hFile = CreateFile(name,
- GENERIC_READ,
- (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
- &sa,
- OPEN_EXISTING,
- (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
- FILE_FLAG_OVERLAPPED),
- NULL);
+ hFile = pgwin32_open_handle(name, O_RDONLY, true);
if (hFile == INVALID_HANDLE_VALUE)
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
return -1;
- }
-
- memset(&standardInfo, 0, sizeof(standardInfo));
-
-#if _WIN32_WINNT < 0x0600
- if (_NtQueryInformationFile == NULL)
- {
- /* First time through: load ntdll.dll and find NtQueryInformationFile */
- LoadNtdll();
- if (ntdll == NULL)
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-
- _NtQueryInformationFile = (PFN_NTQUERYINFORMATIONFILE) (pg_funcptr_t)
- GetProcAddress(ntdll, "NtQueryInformationFile");
- if (_NtQueryInformationFile == NULL)
- {
- DWORD err = GetLastError();
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
- }
-
- if (!NT_SUCCESS(_NtQueryInformationFile(hFile, &ioStatus, &standardInfo,
- sizeof(standardInfo),
- FileStandardInformation)))
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-#else
- if (!GetFileInformationByHandleEx(hFile, FileStandardInfo, &standardInfo,
- sizeof(standardInfo)))
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-#endif /* _WIN32_WINNT < 0x0600 */
-
- if (standardInfo.DeletePending)
- {
- /*
- * File has been deleted, but is not gone from the filesystem yet.
- * This can happen when some process with FILE_SHARE_DELETE has it
- * open, and it will be fully removed once that handle is closed.
- * Meanwhile, we can't open it, so indicate that the file just doesn't
- * exist.
- */
- CloseHandle(hFile);
- errno = ENOENT;
- return -1;
- }
-
- /* At last we can invoke fileinfo_to_stat */
ret = fileinfo_to_stat(hFile, buf);
CloseHandle(hFile);
@@ -296,11 +147,6 @@ _pgfstat64(int fileno, struct stat *buf)
return -1;
}
- /*
- * Since we already have a file handle there is no need to check for
- * ERROR_DELETE_PENDING.
- */
-
return fileinfo_to_stat(hFile, buf);
}
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 84f15f7e85..bebb0578dc 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -107,7 +107,8 @@ sub mkvcbuild
pg_strong_random.c pgcheckdir.c pgmkdirp.c pgsleep.c pgstrcasecmp.c
pqsignal.c mkdtemp.c qsort.c qsort_arg.c bsearch_arg.c quotes.c system.c
strerror.c tar.c thread.c
- win32env.c win32error.c win32security.c win32setlocale.c win32stat.c);
+ win32env.c win32error.c win32ntdll.c
+ win32security.c win32setlocale.c win32stat.c);
push(@pgportfiles, 'strtof.c') if ($vsVersion < '14.00');
--
2.30.2
0001-Use-NtCreateFile-to-open-files-on-Windows.patchtext/x-patch; charset=US-ASCII; name=0001-Use-NtCreateFile-to-open-files-on-Windows.patchDownload
From 4959e4dc6d96d7634ec5ac0c8f55f11ec37b6adb Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sun, 5 Sep 2021 23:49:23 +1200
Subject: [PATCH] Use NtCreateFile() to open files on Windows.
1. Update our open() wrapper to bypass Win32 and use NT interfaces to
open files. This allows us to check for NT's STATUS_DELETE_PENDING and
translate it to appropriate errors.
2. Remove non-working code from our stat() wrapper that tried to handle
the same problem, and reuse the open() wrapper.
XXX TODO: resolve order-of-operations problem: pgwin32_open_handle() is
caled by _pgstat64() before the Windows signal emulation is started up,
but it might call pg_usleep().
XXX TODO: there are some macro redefinition warnings from MSVC on CI;
apparently we need some well placed #define and #undef of
WIN32_NO_STATUS.
Discussion: https://postgr.es/m/CA%2BhUKGJz_pZTF9mckn6XgSv69%2BjGwdgLkxZ6b3NWGLBCVjqUZA%40mail.gmail.com
---
configure | 6 +
configure.ac | 1 +
src/include/port.h | 1 +
src/include/port/win32ntdll.h | 42 ++++++
src/port/open.c | 255 ++++++++++++++++++++--------------
src/port/win32ntdll.c | 68 +++++++++
src/port/win32stat.c | 164 +---------------------
src/tools/msvc/Mkvcbuild.pm | 3 +-
8 files changed, 276 insertions(+), 264 deletions(-)
create mode 100644 src/include/port/win32ntdll.h
create mode 100644 src/port/win32ntdll.c
diff --git a/configure b/configure
index c550cacd5a..adfe03f3f2 100755
--- a/configure
+++ b/configure
@@ -16818,6 +16818,12 @@ esac
;;
esac
+ case " $LIBOBJS " in
+ *" win32ntdll.$ac_objext "* ) ;;
+ *) LIBOBJS="$LIBOBJS win32ntdll.$ac_objext"
+ ;;
+esac
+
case " $LIBOBJS " in
*" win32security.$ac_objext "* ) ;;
*) LIBOBJS="$LIBOBJS win32security.$ac_objext"
diff --git a/configure.ac b/configure.ac
index 2ee710102f..2819b91a8c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1922,6 +1922,7 @@ if test "$PORTNAME" = "win32"; then
AC_LIBOBJ(system)
AC_LIBOBJ(win32env)
AC_LIBOBJ(win32error)
+ AC_LIBOBJ(win32ntdll)
AC_LIBOBJ(win32security)
AC_LIBOBJ(win32setlocale)
AC_LIBOBJ(win32stat)
diff --git a/src/include/port.h b/src/include/port.h
index 82f63de325..ec64be429c 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -290,6 +290,7 @@ extern bool rmtree(const char *path, bool rmtopdir);
* passing of other special options.
*/
#define O_DIRECT 0x80000000
+extern HANDLE pgwin32_open_handle(const char *, int, bool);
extern int pgwin32_open(const char *, int,...);
extern FILE *pgwin32_fopen(const char *, const char *);
#define open(a,b,c) pgwin32_open(a,b,c)
diff --git a/src/include/port/win32ntdll.h b/src/include/port/win32ntdll.h
new file mode 100644
index 0000000000..14e70c6f42
--- /dev/null
+++ b/src/include/port/win32ntdll.h
@@ -0,0 +1,42 @@
+/*-------------------------------------------------------------------------
+ *
+ * win32_ntdll.h
+ * Dynamically loaded Windows NT functions.
+ *
+ * Portions Copyright (c) 2021, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/port/win32ntdll.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "c.h"
+
+#include <ntstatus.h>
+#include <winternl.h>
+
+typedef NTSTATUS (__stdcall *NtCreateFile_t)(PHANDLE FileHandle,
+ ACCESS_MASK DesiredAccess,
+ POBJECT_ATTRIBUTES ObjectAttributes,
+ PIO_STATUS_BLOCK IoStatusBlock,
+ PLARGE_INTEGER AllocationSize,
+ ULONG FileAttributes,
+ ULONG ShareAccess,
+ ULONG CreateDisposition,
+ ULONG CreateOptions,
+ PVOID EaBuffer,
+ ULONG EaLength);
+
+typedef BOOLEAN (__stdcall *RtlDosPathNameToNtPathName_U_t)(PCWSTR DosName,
+ PUNICODE_STRING NtName,
+ PCWSTR *PartName,
+ void *RelativeName);
+
+typedef ULONG (__stdcall *RtlNtStatusToDosError_t)(NTSTATUS Status);
+
+extern NtCreateFile_t pg_NtCreateFile;
+extern RtlDosPathNameToNtPathName_U_t pg_RtlDosPathNameToNtPathName_U;
+extern RtlNtStatusToDosError_t pg_RtlNtStatusToDosError;
+
+extern int initialize_ntdll(void);
diff --git a/src/port/open.c b/src/port/open.c
index 14c6debba9..122e1b4ab7 100644
--- a/src/port/open.c
+++ b/src/port/open.c
@@ -19,52 +19,38 @@
#include "postgres_fe.h"
#endif
+#include "port/win32ntdll.h"
+
#include <fcntl.h>
#include <assert.h>
#include <sys/stat.h>
-
-static int
-openFlagsToCreateFileFlags(int openFlags)
-{
- switch (openFlags & (O_CREAT | O_TRUNC | O_EXCL))
- {
- /* O_EXCL is meaningless without O_CREAT */
- case 0:
- case O_EXCL:
- return OPEN_EXISTING;
-
- case O_CREAT:
- return OPEN_ALWAYS;
-
- /* O_EXCL is meaningless without O_CREAT */
- case O_TRUNC:
- case O_TRUNC | O_EXCL:
- return TRUNCATE_EXISTING;
-
- case O_CREAT | O_TRUNC:
- return CREATE_ALWAYS;
-
- /* O_TRUNC is meaningless with O_CREAT */
- case O_CREAT | O_EXCL:
- case O_CREAT | O_TRUNC | O_EXCL:
- return CREATE_NEW;
- }
-
- /* will never get here */
- return 0;
-}
-
/*
- * - file attribute setting, based on fileMode?
+ * Internal function used by pgwin32_open() and _pgstat64(). When
+ * backup_semantics is true, directories may be opened (for limited uses). On
+ * failure, INVALID_HANDLE_VALUE is returned and errno is set.
*/
-int
-pgwin32_open(const char *fileName, int fileFlags,...)
+HANDLE
+pgwin32_open_handle(const char *fileName, int fileFlags, bool backup_semantics)
{
- int fd;
HANDLE h = INVALID_HANDLE_VALUE;
- SECURITY_ATTRIBUTES sa;
int loops = 0;
+ wchar_t wideFileName[MAX_PATH];
+ size_t wideFileNameSize;
+ wchar_t buffer[MAX_PATH];
+ UNICODE_STRING ntFileName;
+ ACCESS_MASK desiredAccess;
+ OBJECT_ATTRIBUTES objectAttributes;
+ IO_STATUS_BLOCK ioStatusBlock;
+ ULONG fileAttributes;
+ ULONG shareAccess;
+ ULONG createDisposition;
+ ULONG createOptions;
+ NTSTATUS status;
+ DWORD err;
+
+ if (initialize_ntdll() < 0)
+ return INVALID_HANDLE_VALUE;
/* Check that we can handle the request */
assert((fileFlags & ((O_RDONLY | O_WRONLY | O_RDWR) | O_APPEND |
@@ -72,51 +58,116 @@ pgwin32_open(const char *fileName, int fileFlags,...)
_O_SHORT_LIVED | O_DSYNC | O_DIRECT |
(O_CREAT | O_TRUNC | O_EXCL) | (O_TEXT | O_BINARY))) == fileFlags);
#ifndef FRONTEND
- Assert(pgwin32_signal_event != NULL); /* small chance of pg_usleep() */
+ /* XXX When called by stat very early on, this fails! */
+ //Assert(pgwin32_signal_event != NULL); /* small chance of pg_usleep() */
#endif
-#ifdef FRONTEND
+ /* Convert char string to wchar_t string. */
+ wideFileNameSize = MultiByteToWideChar(CP_ACP, 0, fileName, -1,
+ wideFileName,
+ lengthof(wideFileName));
+ if (wideFileNameSize == 0)
+ {
+ _dosmaperr(GetLastError());
+ return INVALID_HANDLE_VALUE;
+ }
- /*
- * Since PostgreSQL 12, those concurrent-safe versions of open() and
- * fopen() can be used by frontends, having as side-effect to switch the
- * file-translation mode from O_TEXT to O_BINARY if none is specified.
- * Caller may want to enforce the binary or text mode, but if nothing is
- * defined make sure that the default mode maps with what versions older
- * than 12 have been doing.
- */
- if ((fileFlags & O_BINARY) == 0)
- fileFlags |= O_TEXT;
-#endif
+ /* Convert DOS/Win32 path to fully qualified NT path. */
+ ntFileName.Length = 0;
+ ntFileName.MaximumLength = sizeof(buffer);
+ ntFileName.Buffer = buffer;
+ if (!pg_RtlDosPathNameToNtPathName_U(wideFileName,
+ &ntFileName,
+ NULL,
+ NULL))
+ {
+ /* XXX does GetLastError() work for this function? need the
+ * _WithStatus version maybe? */
+ _dosmaperr(GetLastError());
+ return INVALID_HANDLE_VALUE;
+ }
- sa.nLength = sizeof(sa);
- sa.bInheritHandle = TRUE;
- sa.lpSecurityDescriptor = NULL;
-
- while ((h = CreateFile(fileName,
- /* cannot use O_RDONLY, as it == 0 */
- (fileFlags & O_RDWR) ? (GENERIC_WRITE | GENERIC_READ) :
- ((fileFlags & O_WRONLY) ? GENERIC_WRITE : GENERIC_READ),
- /* These flags allow concurrent rename/unlink */
- (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
- &sa,
- openFlagsToCreateFileFlags(fileFlags),
- FILE_ATTRIBUTE_NORMAL |
- ((fileFlags & O_RANDOM) ? FILE_FLAG_RANDOM_ACCESS : 0) |
- ((fileFlags & O_SEQUENTIAL) ? FILE_FLAG_SEQUENTIAL_SCAN : 0) |
- ((fileFlags & _O_SHORT_LIVED) ? FILE_ATTRIBUTE_TEMPORARY : 0) |
- ((fileFlags & O_TEMPORARY) ? FILE_FLAG_DELETE_ON_CLOSE : 0) |
- ((fileFlags & O_DIRECT) ? FILE_FLAG_NO_BUFFERING : 0) |
- ((fileFlags & O_DSYNC) ? FILE_FLAG_WRITE_THROUGH : 0),
- NULL)) == INVALID_HANDLE_VALUE)
+ /* Convert other parameters. */
+ desiredAccess =
+ FILE_READ_ATTRIBUTES | /* allow fstat() even if O_WRONLY */
+ ((fileFlags & O_TEMPORARY) ? DELETE : 0) |
+ ((fileFlags & O_RDWR) ? FILE_GENERIC_WRITE | FILE_GENERIC_READ :
+ (fileFlags & O_WRONLY) ? FILE_GENERIC_WRITE : FILE_GENERIC_READ);
+ shareAccess = FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE;
+ InitializeObjectAttributes(&objectAttributes, &ntFileName,
+ OBJ_CASE_INSENSITIVE | OBJ_INHERIT, NULL, NULL);
+ fileAttributes =
+ ((fileFlags & _O_SHORT_LIVED) ? FILE_ATTRIBUTE_TEMPORARY : 0);
+ if (fileAttributes == 0)
+ fileAttributes = FILE_ATTRIBUTE_NORMAL;
+ if (fileFlags & O_CREAT)
{
+ if (fileFlags & O_EXCL)
+ createDisposition = FILE_CREATE;
+ else if (fileFlags & O_TRUNC)
+ createDisposition = FILE_OVERWRITE_IF;
+ else
+ createDisposition = FILE_OPEN_IF;
+ }
+ else if (fileFlags & O_TRUNC)
+ createDisposition = FILE_OVERWRITE;
+ else
+ createDisposition = FILE_OPEN;
+ createOptions =
+ FILE_SYNCHRONOUS_IO_NONALERT |
+ (backup_semantics ? FILE_OPEN_FOR_BACKUP_INTENT : 0) |
+ ((fileFlags & O_RANDOM) ? FILE_RANDOM_ACCESS : 0) |
+ ((fileFlags & O_SEQUENTIAL) ? FILE_SEQUENTIAL_ONLY : 0) |
+ ((fileFlags & O_DIRECT) ? FILE_NO_INTERMEDIATE_BUFFERING : 0) |
+ ((fileFlags & O_DSYNC) ? FILE_WRITE_THROUGH : 0) |
+ ((fileFlags & O_TEMPORARY) ? FILE_DELETE_ON_CLOSE : 0);
+
+ for (;;)
+ {
+ status = pg_NtCreateFile(&h,
+ desiredAccess,
+ &objectAttributes,
+ &ioStatusBlock,
+ NULL,
+ fileAttributes,
+ shareAccess,
+ createDisposition,
+ createOptions,
+ NULL,
+ 0);
+
+ if (NT_SUCCESS(status))
+ break;
+
+ /*
+ * Translate STATUS_DELETE_PENDING to a more Unix-like error.
+ *
+ * If there's no O_CREAT flag, then we'll pretend the file is
+ * invisible. With O_CREAT, we have no choice but to report that
+ * there's a file in the way (which wouldn't happen on Unix).
+ */
+ if (status == STATUS_DELETE_PENDING)
+ {
+ if (fileFlags & O_CREAT)
+ err = ERROR_FILE_EXISTS;
+ else
+ err = ERROR_FILE_NOT_FOUND;
+ _dosmaperr(err);
+ return INVALID_HANDLE_VALUE;
+ }
+
+ /*
+ * For everything else, convert the NT error into a DOS error. This
+ * is where STATUS_DELETE_PENDING would normally be mapped to
+ * ERROR_ACCESS_DENIED, and lost to us.
+ */
+ err = pg_RtlNtStatusToDosError(status);
+
/*
* Sharing violation or locking error can indicate antivirus, backup
* or similar software that's locking the file. Wait a bit and try
* again, giving up after 30 seconds.
*/
- DWORD err = GetLastError();
-
if (err == ERROR_SHARING_VIOLATION ||
err == ERROR_LOCK_VIOLATION)
{
@@ -137,41 +188,37 @@ pgwin32_open(const char *fileName, int fileFlags,...)
}
}
- /*
- * ERROR_ACCESS_DENIED is returned if the file is deleted but not yet
- * gone (Windows NT status code is STATUS_DELETE_PENDING). In that
- * case we want to wait a bit and try again, giving up after 1 second
- * (since this condition should never persist very long). However,
- * there are other commonly-hit cases that return ERROR_ACCESS_DENIED,
- * so care is needed. In particular that happens if we try to open a
- * directory, or of course if there's an actual file-permissions
- * problem. To distinguish these cases, try a stat(). In the
- * delete-pending case, it will either also get STATUS_DELETE_PENDING,
- * or it will see the file as gone and fail with ENOENT. In other
- * cases it will usually succeed. The only somewhat-likely case where
- * this coding will uselessly wait is if there's a permissions problem
- * with a containing directory, which we hope will never happen in any
- * performance-critical code paths.
- */
- if (err == ERROR_ACCESS_DENIED)
- {
- if (loops < 10)
- {
- struct stat st;
-
- if (stat(fileName, &st) != 0)
- {
- pg_usleep(100000);
- loops++;
- continue;
- }
- }
- }
-
_dosmaperr(err);
- return -1;
+ return INVALID_HANDLE_VALUE;
}
+ return h;
+}
+
+int
+pgwin32_open(const char *fileName, int fileFlags,...)
+{
+ HANDLE h;
+ int fd;
+
+ h = pgwin32_open_handle(fileName, fileFlags, false);
+ if (h == INVALID_HANDLE_VALUE)
+ return -1;
+
+#ifdef FRONTEND
+
+ /*
+ * Since PostgreSQL 12, those concurrent-safe versions of open() and
+ * fopen() can be used by frontends, having as side-effect to switch the
+ * file-translation mode from O_TEXT to O_BINARY if none is specified.
+ * Caller may want to enforce the binary or text mode, but if nothing is
+ * defined make sure that the default mode maps with what versions older
+ * than 12 have been doing.
+ */
+ if ((fileFlags & O_BINARY) == 0)
+ fileFlags |= O_TEXT;
+#endif
+
/* _open_osfhandle will, on error, set errno accordingly */
if ((fd = _open_osfhandle((intptr_t) h, fileFlags & O_APPEND)) < 0)
CloseHandle(h); /* will not affect errno */
diff --git a/src/port/win32ntdll.c b/src/port/win32ntdll.c
new file mode 100644
index 0000000000..fc9d1b84bb
--- /dev/null
+++ b/src/port/win32ntdll.c
@@ -0,0 +1,68 @@
+/*-------------------------------------------------------------------------
+ *
+ * win32ntdll.c
+ * Dynamically loaded Windows NT functions.
+ *
+ * Portions Copyright (c) 2021, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/port/win32ntdll.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "c.h"
+
+#include "port/win32ntdll.h"
+
+NtCreateFile_t pg_NtCreateFile;
+RtlDosPathNameToNtPathName_U_t pg_RtlDosPathNameToNtPathName_U;
+RtlNtStatusToDosError_t pg_RtlNtStatusToDosError;
+
+int
+initialize_ntdll(void)
+{
+ static bool initialized;
+ HMODULE module;
+
+ static const struct {
+ const char *name;
+ pg_funcptr_t *address;
+ } routines[] = {
+ {"NtCreateFile", (pg_funcptr_t *) &pg_NtCreateFile},
+ {"RtlDosPathNameToNtPathName_U",
+ (pg_funcptr_t *) &pg_RtlDosPathNameToNtPathName_U},
+ {"RtlNtStatusToDosError", (pg_funcptr_t *) &pg_RtlNtStatusToDosError}
+ };
+
+ if (initialized)
+ return 0;
+
+ if (!(module = LoadLibraryEx("ntdll.dll", NULL, 0)))
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+
+ for (int i = 0; i < lengthof(routines); ++i)
+ {
+ pg_funcptr_t address;
+
+ address = (pg_funcptr_t) GetProcAddress(module, routines[i].name);
+ if (!address)
+ {
+ _dosmaperr(GetLastError());
+ FreeLibrary(module);
+
+ return -1;
+ }
+
+ *(pg_funcptr_t *) routines[i].address = address;
+ }
+
+ initialized = true;
+
+ return 0;
+}
diff --git a/src/port/win32stat.c b/src/port/win32stat.c
index 2ad8ee1359..c851400dc8 100644
--- a/src/port/win32stat.c
+++ b/src/port/win32stat.c
@@ -18,53 +18,6 @@
#include "c.h"
#include <windows.h>
-/*
- * In order to support MinGW and MSVC2013 we use NtQueryInformationFile as an
- * alternative for GetFileInformationByHandleEx. It is loaded from the ntdll
- * library.
- */
-#if _WIN32_WINNT < 0x0600
-#include <winternl.h>
-
-#if !defined(__MINGW32__) && !defined(__MINGW64__)
-/* MinGW includes this in <winternl.h>, but it is missing in MSVC */
-typedef struct _FILE_STANDARD_INFORMATION
-{
- LARGE_INTEGER AllocationSize;
- LARGE_INTEGER EndOfFile;
- ULONG NumberOfLinks;
- BOOLEAN DeletePending;
- BOOLEAN Directory;
-} FILE_STANDARD_INFORMATION;
-#define FileStandardInformation 5
-#endif /* !defined(__MINGW32__) &&
- * !defined(__MINGW64__) */
-
-typedef NTSTATUS (NTAPI * PFN_NTQUERYINFORMATIONFILE)
- (IN HANDLE FileHandle,
- OUT PIO_STATUS_BLOCK IoStatusBlock,
- OUT PVOID FileInformation,
- IN ULONG Length,
- IN FILE_INFORMATION_CLASS FileInformationClass);
-
-static PFN_NTQUERYINFORMATIONFILE _NtQueryInformationFile = NULL;
-
-static HMODULE ntdll = NULL;
-
-/*
- * Load DLL file just once regardless of how many functions we load/call in it.
- */
-static void
-LoadNtdll(void)
-{
- if (ntdll != NULL)
- return;
- ntdll = LoadLibraryEx("ntdll.dll", NULL, 0);
-}
-
-#endif /* _WIN32_WINNT < 0x0600 */
-
-
/*
* Convert a FILETIME struct into a 64 bit time_t.
*/
@@ -162,120 +115,18 @@ int
_pgstat64(const char *name, struct stat *buf)
{
/*
- * We must use a handle so lstat() returns the information of the target
- * file. To have a reliable test for ERROR_DELETE_PENDING, we use
- * NtQueryInformationFile from Windows 2000 or
- * GetFileInformationByHandleEx from Server 2008 / Vista.
+ * Our open wrapper will report STATUS_DELETE_PENDING as ENOENT. We
+ * request FILE_FLAG_BACKUP_SEMANTICS so that we can open directories too,
+ * for limited purposes. We use the private handle-based version, so we
+ * don't risk running out of fds.
*/
- SECURITY_ATTRIBUTES sa;
HANDLE hFile;
int ret;
-#if _WIN32_WINNT < 0x0600
- IO_STATUS_BLOCK ioStatus;
- FILE_STANDARD_INFORMATION standardInfo;
-#else
- FILE_STANDARD_INFO standardInfo;
-#endif
-
- if (name == NULL || buf == NULL)
- {
- errno = EINVAL;
- return -1;
- }
- /* fast not-exists check */
- if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES)
- {
- _dosmaperr(GetLastError());
- return -1;
- }
-
- /* get a file handle as lightweight as we can */
- sa.nLength = sizeof(SECURITY_ATTRIBUTES);
- sa.bInheritHandle = TRUE;
- sa.lpSecurityDescriptor = NULL;
- hFile = CreateFile(name,
- GENERIC_READ,
- (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
- &sa,
- OPEN_EXISTING,
- (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
- FILE_FLAG_OVERLAPPED),
- NULL);
+ hFile = pgwin32_open_handle(name, O_RDONLY, true);
if (hFile == INVALID_HANDLE_VALUE)
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
return -1;
- }
-
- memset(&standardInfo, 0, sizeof(standardInfo));
-
-#if _WIN32_WINNT < 0x0600
- if (_NtQueryInformationFile == NULL)
- {
- /* First time through: load ntdll.dll and find NtQueryInformationFile */
- LoadNtdll();
- if (ntdll == NULL)
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-
- _NtQueryInformationFile = (PFN_NTQUERYINFORMATIONFILE) (pg_funcptr_t)
- GetProcAddress(ntdll, "NtQueryInformationFile");
- if (_NtQueryInformationFile == NULL)
- {
- DWORD err = GetLastError();
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
- }
-
- if (!NT_SUCCESS(_NtQueryInformationFile(hFile, &ioStatus, &standardInfo,
- sizeof(standardInfo),
- FileStandardInformation)))
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-#else
- if (!GetFileInformationByHandleEx(hFile, FileStandardInfo, &standardInfo,
- sizeof(standardInfo)))
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-#endif /* _WIN32_WINNT < 0x0600 */
-
- if (standardInfo.DeletePending)
- {
- /*
- * File has been deleted, but is not gone from the filesystem yet.
- * This can happen when some process with FILE_SHARE_DELETE has it
- * open, and it will be fully removed once that handle is closed.
- * Meanwhile, we can't open it, so indicate that the file just doesn't
- * exist.
- */
- CloseHandle(hFile);
- errno = ENOENT;
- return -1;
- }
-
- /* At last we can invoke fileinfo_to_stat */
ret = fileinfo_to_stat(hFile, buf);
CloseHandle(hFile);
@@ -296,11 +147,6 @@ _pgfstat64(int fileno, struct stat *buf)
return -1;
}
- /*
- * Since we already have a file handle there is no need to check for
- * ERROR_DELETE_PENDING.
- */
-
return fileinfo_to_stat(hFile, buf);
}
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 84f15f7e85..bebb0578dc 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -107,7 +107,8 @@ sub mkvcbuild
pg_strong_random.c pgcheckdir.c pgmkdirp.c pgsleep.c pgstrcasecmp.c
pqsignal.c mkdtemp.c qsort.c qsort_arg.c bsearch_arg.c quotes.c system.c
strerror.c tar.c thread.c
- win32env.c win32error.c win32security.c win32setlocale.c win32stat.c);
+ win32env.c win32error.c win32ntdll.c
+ win32security.c win32setlocale.c win32stat.c);
push(@pgportfiles, 'strtof.c') if ($vsVersion < '14.00');
--
2.30.2
On Mon, Sep 6, 2021 at 6:36 PM Michael Paquier <michael@paquier.xyz> wrote:
Having a non-invasive fix for this long-standing issue would be really
great, even if that means reducing the scope of systems where this can
be fixed.
I hope those patches fix at least the basebackup problem on all
relevant OS versions, until the better POSIX thing is everywhere (they
can't fix all related problems though, since zombie files still stop
you creating new ones with the same name or deleting the containing
directory). I didn't try to find out how far back those APIs go, but
they look ancient/fundamental and widely used by other software...
But do they qualify as non-invasive?
The last time I poked at the bear (54fb8c7d), there was a test posted
by Alexander Lakhin that was really useful in making sure that
concurrency is correctly handled when a file is unlinked:
/messages/by-id/c3427edf-d7c0-ff57-90f6-b5de3bb62709@gmail.com
Thanks. It's a confusing topic with many inconclusive threads.
It worked with VS but not on MinGW. How does your patch react to this
test?
Thanks. Adding Alexander in CC in case he has ideas/feedback.
06.09.2021 16:04, Thomas Munro wrote:
On Mon, Sep 6, 2021 at 6:36 PM Michael Paquier <michael@paquier.xyz> wrote:
The last time I poked at the bear (54fb8c7d), there was a test posted
by Alexander Lakhin that was really useful in making sure that
concurrency is correctly handled when a file is unlinked:
/messages/by-id/c3427edf-d7c0-ff57-90f6-b5de3bb62709@gmail.com
The new approach looks very promising. Knowing that the file is really
in the DELETE_PENDING state simplifies a lot.
I've tested the patch v2_0001_Check... with my demo tests [1]/messages/by-id/e5179494-715e-f8a3-266b-0cf52adac8f4@gmail.com and [2]/messages/by-id/c3427edf-d7c0-ff57-90f6-b5de3bb62709@gmail.com,
and it definitely works.
[1]: /messages/by-id/e5179494-715e-f8a3-266b-0cf52adac8f4@gmail.com
/messages/by-id/e5179494-715e-f8a3-266b-0cf52adac8f4@gmail.com
[2]: /messages/by-id/c3427edf-d7c0-ff57-90f6-b5de3bb62709@gmail.com
/messages/by-id/c3427edf-d7c0-ff57-90f6-b5de3bb62709@gmail.com
Best regards,
Alexander
On Tue, Sep 07, 2021 at 09:00:01AM +0300, Alexander Lakhin wrote:
The new approach looks very promising. Knowing that the file is really
in the DELETE_PENDING state simplifies a lot.
I've tested the patch v2_0001_Check... with my demo tests [1] and [2],
and it definitely works.
Oho, nice. Just to be sure. You are referring to
v2-0001-Check*.patch posted here, right?
/messages/by-id/CA+hUKGKj3p+2AciBGacCf_cXE0JLCYevWHexvOpK6uL1+V-zag@mail.gmail.com
--
Michael
Hello Michael,
07.09.2021 09:05, Michael Paquier wrote:
On Tue, Sep 07, 2021 at 09:00:01AM +0300, Alexander Lakhin wrote:
The new approach looks very promising. Knowing that the file is really
in the DELETE_PENDING state simplifies a lot.
I've tested the patch v2_0001_Check... with my demo tests [1] and [2],
and it definitely works.Oho, nice. Just to be sure. You are referring to
v2-0001-Check*.patch posted here, right?
/messages/by-id/CA+hUKGKj3p+2AciBGacCf_cXE0JLCYevWHexvOpK6uL1+V-zag@mail.gmail.com
Yes, i've tested that one, on the master branch (my tests needed a minor
modification due to PostgresNode changes).
Best regards,
Alexander
On Tue, Sep 7, 2021 at 7:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
07.09.2021 09:05, Michael Paquier wrote:
On Tue, Sep 07, 2021 at 09:00:01AM +0300, Alexander Lakhin wrote:
The new approach looks very promising. Knowing that the file is really
in the DELETE_PENDING state simplifies a lot.
I've tested the patch v2_0001_Check... with my demo tests [1] and [2],
and it definitely works.
Oho, nice. Just to be sure. You are referring to
v2-0001-Check*.patch posted here, right?
/messages/by-id/CA+hUKGKj3p+2AciBGacCf_cXE0JLCYevWHexvOpK6uL1+V-zag@mail.gmail.com
Yes, i've tested that one, on the master branch (my tests needed a minor
modification due to PostgresNode changes).
Thanks very much!
Time to tidy up some loose ends. There are a couple of judgement
calls involved. Here's what Andres and I came up with in an off-list
chat. Any different suggestions?
1. I abandoned the "direct NtCreateFile()" version for now. I guess
using more and wider unstable interfaces might expose us to greater
risk of silent API/behavior changes or have subtle bugs. If we ever
have a concrete reason to believe that RtlGetLastNtStatus() is not
reliable here, we could reconsider.
2. I dropped the assertion that the signal event has been created
before the first call to the open() wrapper. Instead, I taught
pg_usleep() to fall back to plain old SleepEx() if the signal stuff
isn't up yet. Other solutions are possible of course, but it struck
me as a bad idea to place initialisation ordering constraints on very
basic facilities like open() and stat().
I should point out explicitly that with this patch, stat() benefits
from open()'s tolerance for sharing violations, as a side effect.
That is, it'll retry for a short time in the hope that whoever opened
our file without allowing sharing will soon go away. I don't know how
useful that bandaid loop really is in practice, but I don't see why
we'd want that for open() and not stat(), so this change seems good to
me on consistency grounds at the very least.
3. We fixed the warnings about macro redefinition with #define
UMDF_USING_NTSTATUS and #include <ntstatus.h> in win32_port.h. (Other
ideas considered: (1) Andres reported that it also works to move the
#include to ~12 files that need things from it, ie things that were
suppressed from windows.h by that macro and must now be had from
ntstatus.h, but the files you have to change are probably different in
back branches if we decide to do that, (2) I tried defining that macro
locally in files that need it, *before* including c.h/postgres.h, and
then locally include ntstatus.h afterwards, but that seems to violate
project style and generally seems weird.)
Another thing to point out explicitly is that I added a new file
src/port/win32ntdll.c, which is responsible for fishing out the NT
function pointers. It was useful to be able to do that in the
abandoned NtCreateFile() variant because it needed three of them and I
could reduce boiler-plate noise with a static array of function names
to loop over. In this version the array has just one element, but I'd
still rather centralise this stuff in one place and make it easy to
add any more of these that we eventually find a need for.
BTW, I also plan to help Victor get his "POSIX semantics" patch[1]/messages/by-id/a529b660-da15-5b62-21a0-9936768210fd@postgrespro.ru
into the tree (and extend it to cover more ops). That should make
these problems go away in a more complete way IIUC, but doesn't work
everywhere (not sure if we have any build farm animals where it
doesn't work, if so it might be nice to change that), so it's
complementary to this patch. (My earlier idea that that stuff would
magically start happening for free on all relevant systems some time
soon has faded.)
[1]: /messages/by-id/a529b660-da15-5b62-21a0-9936768210fd@postgrespro.ru
Attachments:
v3-0001-Check-for-STATUS_DELETE_PENDING-on-Windows.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Check-for-STATUS_DELETE_PENDING-on-Windows.patchDownload
From 6ab7d5e6b5cc6bf62513a5264641e13fc007ebc7 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sun, 5 Sep 2021 23:49:23 +1200
Subject: [PATCH v3] Check for STATUS_DELETE_PENDING on Windows.
1. Update our open() wrapper to check for NT's STATUS_DELETE_PENDING
and translate it to appropriate errors. This is done with
RtlGetLastNtStatus(), which is dynamically loaded from ntdll. A new
file win32ntdll.c centralizes lookup of NT functions, in case we decide
to add more in the future.
2. Remove non-working code that was trying to do something similar for
stat(), and just reuse the open() wrapper code. As a side effect,
stat() also gains resilience against "sharing violation" errors.
3. Since stat() is used very early in process startup, remove the
requirement that the Win32 signal event has been created before
pgwin32_open_handle() is reached. Instead, teach pg_usleep() to fall
back to a non-interruptible sleep if reached before the signal event is
available.
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGJz_pZTF9mckn6XgSv69%2BjGwdgLkxZ6b3NWGLBCVjqUZA%40mail.gmail.com
---
configure | 6 ++
configure.ac | 1 +
src/backend/port/win32/signal.c | 12 ++-
src/include/port.h | 1 +
src/include/port/win32_port.h | 7 ++
src/include/port/win32ntdll.h | 18 ++++
src/port/open.c | 102 ++++++++++----------
src/port/win32ntdll.c | 64 +++++++++++++
src/port/win32stat.c | 164 +-------------------------------
src/tools/msvc/Mkvcbuild.pm | 3 +-
10 files changed, 169 insertions(+), 209 deletions(-)
create mode 100644 src/include/port/win32ntdll.h
create mode 100644 src/port/win32ntdll.c
diff --git a/configure b/configure
index c550cacd5a..adfe03f3f2 100755
--- a/configure
+++ b/configure
@@ -16818,6 +16818,12 @@ esac
;;
esac
+ case " $LIBOBJS " in
+ *" win32ntdll.$ac_objext "* ) ;;
+ *) LIBOBJS="$LIBOBJS win32ntdll.$ac_objext"
+ ;;
+esac
+
case " $LIBOBJS " in
*" win32security.$ac_objext "* ) ;;
*) LIBOBJS="$LIBOBJS win32security.$ac_objext"
diff --git a/configure.ac b/configure.ac
index 2ee710102f..2819b91a8c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1922,6 +1922,7 @@ if test "$PORTNAME" = "win32"; then
AC_LIBOBJ(system)
AC_LIBOBJ(win32env)
AC_LIBOBJ(win32error)
+ AC_LIBOBJ(win32ntdll)
AC_LIBOBJ(win32security)
AC_LIBOBJ(win32setlocale)
AC_LIBOBJ(win32stat)
diff --git a/src/backend/port/win32/signal.c b/src/backend/port/win32/signal.c
index 580a517f3f..61f06a29f6 100644
--- a/src/backend/port/win32/signal.c
+++ b/src/backend/port/win32/signal.c
@@ -52,7 +52,17 @@ static BOOL WINAPI pg_console_handler(DWORD dwCtrlType);
void
pg_usleep(long microsec)
{
- Assert(pgwin32_signal_event != NULL);
+ if (unlikely(pgwin32_signal_event == NULL))
+ {
+ /*
+ * If we're reached by pgwin32_open_handle() early in startup before
+ * the signal event is set up, just fall back to a regular
+ * non-interruptible sleep.
+ */
+ SleepEx((microsec < 500 ? 1 : (microsec + 500) / 1000), FALSE);
+ return;
+ }
+
if (WaitForSingleObject(pgwin32_signal_event,
(microsec < 500 ? 1 : (microsec + 500) / 1000))
== WAIT_OBJECT_0)
diff --git a/src/include/port.h b/src/include/port.h
index 82f63de325..ec64be429c 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -290,6 +290,7 @@ extern bool rmtree(const char *path, bool rmtopdir);
* passing of other special options.
*/
#define O_DIRECT 0x80000000
+extern HANDLE pgwin32_open_handle(const char *, int, bool);
extern int pgwin32_open(const char *, int,...);
extern FILE *pgwin32_fopen(const char *, const char *);
#define open(a,b,c) pgwin32_open(a,b,c)
diff --git a/src/include/port/win32_port.h b/src/include/port/win32_port.h
index 05c5a53442..f11ac5e47a 100644
--- a/src/include/port/win32_port.h
+++ b/src/include/port/win32_port.h
@@ -43,9 +43,16 @@
#define _WINSOCKAPI_
#endif
+/*
+ * Tell windows.h that we're going to include ntstatus.h, to avoid
+ * double-definition of some macros.
+ */
+ #define UMDF_USING_NTSTATUS
+
#include <winsock2.h>
#include <ws2tcpip.h>
#include <windows.h>
+#include <ntstatus.h>
#undef small
#include <process.h>
#include <signal.h>
diff --git a/src/include/port/win32ntdll.h b/src/include/port/win32ntdll.h
new file mode 100644
index 0000000000..c4d6a9c00e
--- /dev/null
+++ b/src/include/port/win32ntdll.h
@@ -0,0 +1,18 @@
+/*-------------------------------------------------------------------------
+ *
+ * win32ntdll.h
+ * Dynamically loaded Windows NT functions.
+ *
+ * Portions Copyright (c) 2021, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/port/win32ntdll.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+typedef NTSTATUS (__stdcall *RtlGetLastNtStatus_t)(void);
+
+extern RtlGetLastNtStatus_t pg_RtlGetLastNtStatus;
+
+extern int initialize_ntdll(void);
diff --git a/src/port/open.c b/src/port/open.c
index 14c6debba9..6c7a70c367 100644
--- a/src/port/open.c
+++ b/src/port/open.c
@@ -19,11 +19,12 @@
#include "postgres_fe.h"
#endif
+#include "port/win32ntdll.h"
+
#include <fcntl.h>
#include <assert.h>
#include <sys/stat.h>
-
static int
openFlagsToCreateFileFlags(int openFlags)
{
@@ -56,38 +57,25 @@ openFlagsToCreateFileFlags(int openFlags)
}
/*
- * - file attribute setting, based on fileMode?
+ * Internal function used by pgwin32_open() and _pgstat64(). When
+ * backup_semantics is true, directories may be opened (for limited uses). On
+ * failure, INVALID_HANDLE_VALUE is returned and errno is set.
*/
-int
-pgwin32_open(const char *fileName, int fileFlags,...)
+HANDLE
+pgwin32_open_handle(const char *fileName, int fileFlags, bool backup_semantics)
{
- int fd;
- HANDLE h = INVALID_HANDLE_VALUE;
+ HANDLE h;
SECURITY_ATTRIBUTES sa;
int loops = 0;
+ if (initialize_ntdll() < 0)
+ return INVALID_HANDLE_VALUE;
+
/* Check that we can handle the request */
assert((fileFlags & ((O_RDONLY | O_WRONLY | O_RDWR) | O_APPEND |
(O_RANDOM | O_SEQUENTIAL | O_TEMPORARY) |
_O_SHORT_LIVED | O_DSYNC | O_DIRECT |
(O_CREAT | O_TRUNC | O_EXCL) | (O_TEXT | O_BINARY))) == fileFlags);
-#ifndef FRONTEND
- Assert(pgwin32_signal_event != NULL); /* small chance of pg_usleep() */
-#endif
-
-#ifdef FRONTEND
-
- /*
- * Since PostgreSQL 12, those concurrent-safe versions of open() and
- * fopen() can be used by frontends, having as side-effect to switch the
- * file-translation mode from O_TEXT to O_BINARY if none is specified.
- * Caller may want to enforce the binary or text mode, but if nothing is
- * defined make sure that the default mode maps with what versions older
- * than 12 have been doing.
- */
- if ((fileFlags & O_BINARY) == 0)
- fileFlags |= O_TEXT;
-#endif
sa.nLength = sizeof(sa);
sa.bInheritHandle = TRUE;
@@ -102,6 +90,7 @@ pgwin32_open(const char *fileName, int fileFlags,...)
&sa,
openFlagsToCreateFileFlags(fileFlags),
FILE_ATTRIBUTE_NORMAL |
+ (backup_semantics ? FILE_FLAG_BACKUP_SEMANTICS : 0) |
((fileFlags & O_RANDOM) ? FILE_FLAG_RANDOM_ACCESS : 0) |
((fileFlags & O_SEQUENTIAL) ? FILE_FLAG_SEQUENTIAL_SCAN : 0) |
((fileFlags & _O_SHORT_LIVED) ? FILE_ATTRIBUTE_TEMPORARY : 0) |
@@ -140,38 +129,55 @@ pgwin32_open(const char *fileName, int fileFlags,...)
/*
* ERROR_ACCESS_DENIED is returned if the file is deleted but not yet
* gone (Windows NT status code is STATUS_DELETE_PENDING). In that
- * case we want to wait a bit and try again, giving up after 1 second
- * (since this condition should never persist very long). However,
- * there are other commonly-hit cases that return ERROR_ACCESS_DENIED,
- * so care is needed. In particular that happens if we try to open a
- * directory, or of course if there's an actual file-permissions
- * problem. To distinguish these cases, try a stat(). In the
- * delete-pending case, it will either also get STATUS_DELETE_PENDING,
- * or it will see the file as gone and fail with ENOENT. In other
- * cases it will usually succeed. The only somewhat-likely case where
- * this coding will uselessly wait is if there's a permissions problem
- * with a containing directory, which we hope will never happen in any
- * performance-critical code paths.
+ * case, we'd better ask for the NT status too so we can translate it
+ * to a more Unix-like error. We hope that nothing clobbers the NT
+ * status in between the internal NtCreateFile() call and CreateFile()
+ * returning.
+ *
+ * If there's no O_CREAT flag, then we'll pretend the file is
+ * invisible. With O_CREAT, we have no choice but to report that
+ * there's a file in the way (which wouldn't happen on Unix).
*/
- if (err == ERROR_ACCESS_DENIED)
+ if (err == ERROR_ACCESS_DENIED &&
+ pg_RtlGetLastNtStatus() == STATUS_DELETE_PENDING)
{
- if (loops < 10)
- {
- struct stat st;
-
- if (stat(fileName, &st) != 0)
- {
- pg_usleep(100000);
- loops++;
- continue;
- }
- }
+ if (fileFlags & O_CREAT)
+ err = ERROR_FILE_EXISTS;
+ else
+ err = ERROR_FILE_NOT_FOUND;
}
_dosmaperr(err);
- return -1;
+ return INVALID_HANDLE_VALUE;
}
+ return h;
+}
+
+int
+pgwin32_open(const char *fileName, int fileFlags,...)
+{
+ HANDLE h;
+ int fd;
+
+ h = pgwin32_open_handle(fileName, fileFlags, false);
+ if (h == INVALID_HANDLE_VALUE)
+ return -1;
+
+#ifdef FRONTEND
+
+ /*
+ * Since PostgreSQL 12, those concurrent-safe versions of open() and
+ * fopen() can be used by frontends, having as side-effect to switch the
+ * file-translation mode from O_TEXT to O_BINARY if none is specified.
+ * Caller may want to enforce the binary or text mode, but if nothing is
+ * defined make sure that the default mode maps with what versions older
+ * than 12 have been doing.
+ */
+ if ((fileFlags & O_BINARY) == 0)
+ fileFlags |= O_TEXT;
+#endif
+
/* _open_osfhandle will, on error, set errno accordingly */
if ((fd = _open_osfhandle((intptr_t) h, fileFlags & O_APPEND)) < 0)
CloseHandle(h); /* will not affect errno */
diff --git a/src/port/win32ntdll.c b/src/port/win32ntdll.c
new file mode 100644
index 0000000000..b010becadf
--- /dev/null
+++ b/src/port/win32ntdll.c
@@ -0,0 +1,64 @@
+/*-------------------------------------------------------------------------
+ *
+ * win32ntdll.c
+ * Dynamically loaded Windows NT functions.
+ *
+ * Portions Copyright (c) 2021, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/port/win32ntdll.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "c.h"
+
+#include "port/win32ntdll.h"
+
+RtlGetLastNtStatus_t pg_RtlGetLastNtStatus;
+
+int
+initialize_ntdll(void)
+{
+ static bool initialized;
+ HMODULE module;
+
+ static const struct
+ {
+ const char *name;
+ pg_funcptr_t *address;
+ } routines[] = {
+ {"RtlGetLastNtStatus", (pg_funcptr_t *) &pg_RtlGetLastNtStatus}
+ };
+
+ if (initialized)
+ return 0;
+
+ if (!(module = LoadLibraryEx("ntdll.dll", NULL, 0)))
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+
+ for (int i = 0; i < lengthof(routines); ++i)
+ {
+ pg_funcptr_t address;
+
+ address = (pg_funcptr_t) GetProcAddress(module, routines[i].name);
+ if (!address)
+ {
+ _dosmaperr(GetLastError());
+ FreeLibrary(module);
+
+ return -1;
+ }
+
+ *(pg_funcptr_t *) routines[i].address = address;
+ }
+
+ initialized = true;
+
+ return 0;
+}
diff --git a/src/port/win32stat.c b/src/port/win32stat.c
index 2ad8ee1359..c851400dc8 100644
--- a/src/port/win32stat.c
+++ b/src/port/win32stat.c
@@ -18,53 +18,6 @@
#include "c.h"
#include <windows.h>
-/*
- * In order to support MinGW and MSVC2013 we use NtQueryInformationFile as an
- * alternative for GetFileInformationByHandleEx. It is loaded from the ntdll
- * library.
- */
-#if _WIN32_WINNT < 0x0600
-#include <winternl.h>
-
-#if !defined(__MINGW32__) && !defined(__MINGW64__)
-/* MinGW includes this in <winternl.h>, but it is missing in MSVC */
-typedef struct _FILE_STANDARD_INFORMATION
-{
- LARGE_INTEGER AllocationSize;
- LARGE_INTEGER EndOfFile;
- ULONG NumberOfLinks;
- BOOLEAN DeletePending;
- BOOLEAN Directory;
-} FILE_STANDARD_INFORMATION;
-#define FileStandardInformation 5
-#endif /* !defined(__MINGW32__) &&
- * !defined(__MINGW64__) */
-
-typedef NTSTATUS (NTAPI * PFN_NTQUERYINFORMATIONFILE)
- (IN HANDLE FileHandle,
- OUT PIO_STATUS_BLOCK IoStatusBlock,
- OUT PVOID FileInformation,
- IN ULONG Length,
- IN FILE_INFORMATION_CLASS FileInformationClass);
-
-static PFN_NTQUERYINFORMATIONFILE _NtQueryInformationFile = NULL;
-
-static HMODULE ntdll = NULL;
-
-/*
- * Load DLL file just once regardless of how many functions we load/call in it.
- */
-static void
-LoadNtdll(void)
-{
- if (ntdll != NULL)
- return;
- ntdll = LoadLibraryEx("ntdll.dll", NULL, 0);
-}
-
-#endif /* _WIN32_WINNT < 0x0600 */
-
-
/*
* Convert a FILETIME struct into a 64 bit time_t.
*/
@@ -162,120 +115,18 @@ int
_pgstat64(const char *name, struct stat *buf)
{
/*
- * We must use a handle so lstat() returns the information of the target
- * file. To have a reliable test for ERROR_DELETE_PENDING, we use
- * NtQueryInformationFile from Windows 2000 or
- * GetFileInformationByHandleEx from Server 2008 / Vista.
+ * Our open wrapper will report STATUS_DELETE_PENDING as ENOENT. We
+ * request FILE_FLAG_BACKUP_SEMANTICS so that we can open directories too,
+ * for limited purposes. We use the private handle-based version, so we
+ * don't risk running out of fds.
*/
- SECURITY_ATTRIBUTES sa;
HANDLE hFile;
int ret;
-#if _WIN32_WINNT < 0x0600
- IO_STATUS_BLOCK ioStatus;
- FILE_STANDARD_INFORMATION standardInfo;
-#else
- FILE_STANDARD_INFO standardInfo;
-#endif
-
- if (name == NULL || buf == NULL)
- {
- errno = EINVAL;
- return -1;
- }
- /* fast not-exists check */
- if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES)
- {
- _dosmaperr(GetLastError());
- return -1;
- }
-
- /* get a file handle as lightweight as we can */
- sa.nLength = sizeof(SECURITY_ATTRIBUTES);
- sa.bInheritHandle = TRUE;
- sa.lpSecurityDescriptor = NULL;
- hFile = CreateFile(name,
- GENERIC_READ,
- (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
- &sa,
- OPEN_EXISTING,
- (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
- FILE_FLAG_OVERLAPPED),
- NULL);
+ hFile = pgwin32_open_handle(name, O_RDONLY, true);
if (hFile == INVALID_HANDLE_VALUE)
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
return -1;
- }
-
- memset(&standardInfo, 0, sizeof(standardInfo));
-
-#if _WIN32_WINNT < 0x0600
- if (_NtQueryInformationFile == NULL)
- {
- /* First time through: load ntdll.dll and find NtQueryInformationFile */
- LoadNtdll();
- if (ntdll == NULL)
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-
- _NtQueryInformationFile = (PFN_NTQUERYINFORMATIONFILE) (pg_funcptr_t)
- GetProcAddress(ntdll, "NtQueryInformationFile");
- if (_NtQueryInformationFile == NULL)
- {
- DWORD err = GetLastError();
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
- }
-
- if (!NT_SUCCESS(_NtQueryInformationFile(hFile, &ioStatus, &standardInfo,
- sizeof(standardInfo),
- FileStandardInformation)))
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-#else
- if (!GetFileInformationByHandleEx(hFile, FileStandardInfo, &standardInfo,
- sizeof(standardInfo)))
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-#endif /* _WIN32_WINNT < 0x0600 */
-
- if (standardInfo.DeletePending)
- {
- /*
- * File has been deleted, but is not gone from the filesystem yet.
- * This can happen when some process with FILE_SHARE_DELETE has it
- * open, and it will be fully removed once that handle is closed.
- * Meanwhile, we can't open it, so indicate that the file just doesn't
- * exist.
- */
- CloseHandle(hFile);
- errno = ENOENT;
- return -1;
- }
-
- /* At last we can invoke fileinfo_to_stat */
ret = fileinfo_to_stat(hFile, buf);
CloseHandle(hFile);
@@ -296,11 +147,6 @@ _pgfstat64(int fileno, struct stat *buf)
return -1;
}
- /*
- * Since we already have a file handle there is no need to check for
- * ERROR_DELETE_PENDING.
- */
-
return fileinfo_to_stat(hFile, buf);
}
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 84f15f7e85..bebb0578dc 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -107,7 +107,8 @@ sub mkvcbuild
pg_strong_random.c pgcheckdir.c pgmkdirp.c pgsleep.c pgstrcasecmp.c
pqsignal.c mkdtemp.c qsort.c qsort_arg.c bsearch_arg.c quotes.c system.c
strerror.c tar.c thread.c
- win32env.c win32error.c win32security.c win32setlocale.c win32stat.c);
+ win32env.c win32error.c win32ntdll.c
+ win32security.c win32setlocale.c win32stat.c);
push(@pgportfiles, 'strtof.c') if ($vsVersion < '14.00');
--
2.30.2
On Fri, Sep 10, 2021 at 5:04 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Sep 7, 2021 at 7:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
07.09.2021 09:05, Michael Paquier wrote:
On Tue, Sep 07, 2021 at 09:00:01AM +0300, Alexander Lakhin wrote:
The new approach looks very promising. Knowing that the file is really
in the DELETE_PENDING state simplifies a lot.
I've tested the patch v2_0001_Check... with my demo tests [1] and [2],
and it definitely works.
Since our handling of that stuff never really worked the way we wanted
(or if it did, then Windows' behaviour changed, possibly well over a
decade ago, from what I could dig up), this isn't an open item
candidate for 14 after all, it's a pre-existing condition. So I
propose to push this fix to master only soon, and then let it stew
there for a little while to see how the buildfarm Windows variants and
the Windows hacker community testing on master react. If it looks
good, we can back-patch it a bit later, perhaps some more convenient
time WRT the release.
I added a CF entry to see if anyone else wants to review it and get CI.
One small detail I'd like to draw attention to is this bit, which
differs slightly from the [non-working] previous attempts by mapping
to two different errors:
+ * If there's no O_CREAT flag, then we'll pretend the file is
+ * invisible. With O_CREAT, we have no choice but to report that
+ * there's a file in the way (which wouldn't happen on Unix).
...
+ if (fileFlags & O_CREAT)
+ err = ERROR_FILE_EXISTS;
+ else
+ err = ERROR_FILE_NOT_FOUND;
On Thu, Sep 23, 2021 at 4:58 AM Thomas Munro <thomas.munro@gmail.com> wrote:
One small detail I'd like to draw attention to is this bit, which
differs slightly from the [non-working] previous attempts by mapping
to two different errors:+ * If there's no O_CREAT flag, then we'll pretend the file is + * invisible. With O_CREAT, we have no choice but to report that + * there's a file in the way (which wouldn't happen on Unix)....
+ if (fileFlags & O_CREAT) + err = ERROR_FILE_EXISTS; + else + err = ERROR_FILE_NOT_FOUND;
When GetTempFileName() finds a duplicated file name and the file is pending
for deletion, it fails with an "ERROR_ACCESS_DENIED" error code. That may
describe the situation better than "ERROR_FILE_EXISTS".
Regards,
Juan José Santamaría Flecha
On Thu, Sep 23, 2021 at 9:13 PM Juan José Santamaría Flecha
<juanjo.santamaria@gmail.com> wrote:
On Thu, Sep 23, 2021 at 4:58 AM Thomas Munro <thomas.munro@gmail.com> wrote:
One small detail I'd like to draw attention to is this bit, which
differs slightly from the [non-working] previous attempts by mapping
to two different errors:+ * If there's no O_CREAT flag, then we'll pretend the file is + * invisible. With O_CREAT, we have no choice but to report that + * there's a file in the way (which wouldn't happen on Unix)....
+ if (fileFlags & O_CREAT) + err = ERROR_FILE_EXISTS; + else + err = ERROR_FILE_NOT_FOUND;When GetTempFileName() finds a duplicated file name and the file is pending for deletion, it fails with an "ERROR_ACCESS_DENIED" error code. That may describe the situation better than "ERROR_FILE_EXISTS".
Thanks for looking. Why do you think that's better? I assume that's
just the usual NT->Win32 error conversion at work.
The only case I can think of so far in our tree where you'd notice
this change of errno for the O_CREAT case is relfilenode creation[1]/messages/by-id/CA+hUKGJz_pZTF9mckn6XgSv69+jGwdgLkxZ6b3NWGLBCVjqUZA@mail.gmail.com,
and there it's just a case of printing a different message. Trying to
create a relfilenode that exists already in delete-pending state will
fail, but with this change we'll log the %m string for EEXIST instead
of EACCES (what you see today) or ENOENT (which seems nonsensical, "I
can't create your file because it doesn't exist", and what you'd get
with this patch if I didn't have the special case for O_CREAT). So I
think that's pretty arguably an improvement.
As for how likely you are to reach that case... hmm, I don't know what
access() returns for a file in delete-pending state. The docs say
"The function returns -1 if the named file does not exist or does not
have the given mode", so perhaps it returns 0 for such a case, in
which case we'll consider it a collision and keep searching for
another free relfilenode. If that's the case, it's probably really
really unlikely you'll reach the case described in the previous
paragraph, so it probably doesn't matter much.
Do we have any other code paths where this finer point could cause
problems? Looking around at code that handles EEXIST, most of it is
directory creation (unaffected by this patch), and then
src/port/mkdtemp.c for which this change is appropriate (it implements
POSIX mkdtemp(), which shouldn't report EACCES to its caller if
something it tries bumps into a delete-pending file, it should see
EEXIST and try a new name, which I think it will do with this patch,
through its call to open(O_CREAT | O_EXCL)).
[1]: /messages/by-id/CA+hUKGJz_pZTF9mckn6XgSv69+jGwdgLkxZ6b3NWGLBCVjqUZA@mail.gmail.com
On Tue, Sep 28, 2021 at 2:50 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Thu, Sep 23, 2021 at 9:13 PM Juan José Santamaría Flecha
<juanjo.santamaria@gmail.com> wrote:When GetTempFileName() finds a duplicated file name and the file is
pending for deletion, it fails with an "ERROR_ACCESS_DENIED" error code.
That may describe the situation better than "ERROR_FILE_EXISTS".Thanks for looking. Why do you think that's better? I assume that's
just the usual NT->Win32 error conversion at work.When a function returns an error caused by accessing a file
in DELETE_PENDING you should expect an EACCES. Nonetheless, if we can
emulate a POSIX behaviour by mapping it to EEXIST, that works for me. I
also consider that having the logic for DELETE_PENDING in a single function
is an improvement.
Regards,
Juan José Santamaría Flecha
This patch doesn't compile on Windows according to Appveyor, seemingly because
of a syntax error in the new win32ntdll.h file, but the MSVC logs are hard on
the eye so it might be unrelated.
--
Daniel Gustafsson https://vmware.com/
On 3 Nov 2021, at 12:02, Daniel Gustafsson <daniel@yesql.se> wrote:
This patch doesn't compile on Windows according to Appveyor, seemingly because
of a syntax error in the new win32ntdll.h file, but the MSVC logs are hard on
the eye so it might be unrelated.
As the thread has stalled with a patch that doesn't apply, I'm marking this
patch Returned with Feedback. Please feel free to resubmit when a new patch is
ready.
--
Daniel Gustafsson https://vmware.com/
On Thu, Dec 2, 2021 at 3:11 AM Daniel Gustafsson <daniel@yesql.se> wrote:
On 3 Nov 2021, at 12:02, Daniel Gustafsson <daniel@yesql.se> wrote:
This patch doesn't compile on Windows according to Appveyor, seemingly because
of a syntax error in the new win32ntdll.h file, but the MSVC logs are hard on
the eye so it might be unrelated.As the thread has stalled with a patch that doesn't apply, I'm marking this
patch Returned with Feedback. Please feel free to resubmit when a new patch is
ready.
I think this was broken by WIN32_LEAN_AND_MEAN (and since gained a
merge conflict, but that's easy to fix). I'll try to figure out the
right system header hacks to unbreak it...
On Sat, Dec 4, 2021 at 6:18 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Thu, Dec 2, 2021 at 3:11 AM Daniel Gustafsson <daniel@yesql.se> wrote:
This patch doesn't compile on Windows according to Appveyor, seemingly because
of a syntax error in the new win32ntdll.h file, but the MSVC logs are hard on
the eye so it might be unrelated.
I think this was broken by WIN32_LEAN_AND_MEAN (and since gained a
merge conflict, but that's easy to fix). I'll try to figure out the
right system header hacks to unbreak it...
Short version: It needed <winternl.h>.
Long version: Where Unix shares headers between user space and kernel
with #ifdef _KERNEL, today I learned that Windows seems to have two
universes of headers, with some stuff defined in both places. You
can't cross the streams. I had already defined UMDF_USING_NTSTATUS,
which tells <windows.h> that you're planning to include <ntstatus.h>,
to avoid a bunch of double-definitions (the other approach I'd found
on the 'net was to #define and #undef WIN32_NO_STATUS in the right
places), but when WIN32_LEAN_AND_MEAN was added, that combination lost
the definition of NTSTATUS, which is needed by various macros like
WAIT_OBJECT_0 (it's used in casts). It's supposed to come from
<ntdef.h>, but if you include that directly you get more double
definitions of other random stuff. Eventually I learned that
<winternl.h> fixes that. No doubt this is eroding the gains made by
WIN32_LEAN_AND_MEAN, but I don't see how to avoid it until we do the
work to stop including <windows.h> in win32_port.h. Well, I do know
one way... I noticed that <bcrypt.h> just defines NTSTATUS itself if
it sees that <ntdef.h> hasn't been included (by testing its include
guard). I tried that and it worked, but it seems pretty ugly and not
something that we should be doing.
Attachments:
v4-0001-Check-for-STATUS_DELETE_PENDING-on-Windows.patchtext/x-patch; charset=US-ASCII; name=v4-0001-Check-for-STATUS_DELETE_PENDING-on-Windows.patchDownload
From e5e70c03d29f11c3939791ae2d7ff3a94958be38 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sun, 5 Sep 2021 23:49:23 +1200
Subject: [PATCH v4 1/3] Check for STATUS_DELETE_PENDING on Windows.
1. Update our open() wrapper to check for NT's STATUS_DELETE_PENDING
and translate it to appropriate errors. This is done with
RtlGetLastNtStatus(), which is dynamically loaded from ntdll. A new
file win32ntdll.c centralizes lookup of NT functions, in case we decide
to add more in the future.
2. Remove non-working code that was trying to do something similar for
stat(), and just reuse the open() wrapper code. As a side effect,
stat() also gains resilience against "sharing violation" errors.
3. Since stat() is used very early in process startup, remove the
requirement that the Win32 signal event has been created before
pgwin32_open_handle() is reached. Instead, teach pg_usleep() to fall
back to a non-interruptible sleep if reached before the signal event is
available.
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGJz_pZTF9mckn6XgSv69%2BjGwdgLkxZ6b3NWGLBCVjqUZA%40mail.gmail.com
---
configure | 6 ++
configure.ac | 1 +
src/backend/port/win32/signal.c | 12 ++-
src/include/port.h | 1 +
src/include/port/win32_port.h | 8 ++
src/include/port/win32ntdll.h | 18 ++++
src/port/open.c | 102 ++++++++++----------
src/port/win32ntdll.c | 64 +++++++++++++
src/port/win32stat.c | 164 +-------------------------------
src/tools/msvc/Mkvcbuild.pm | 3 +-
10 files changed, 170 insertions(+), 209 deletions(-)
create mode 100644 src/include/port/win32ntdll.h
create mode 100644 src/port/win32ntdll.c
diff --git a/configure b/configure
index 5f842a86b2..3b19105328 100755
--- a/configure
+++ b/configure
@@ -16738,6 +16738,12 @@ esac
;;
esac
+ case " $LIBOBJS " in
+ *" win32ntdll.$ac_objext "* ) ;;
+ *) LIBOBJS="$LIBOBJS win32ntdll.$ac_objext"
+ ;;
+esac
+
case " $LIBOBJS " in
*" win32security.$ac_objext "* ) ;;
*) LIBOBJS="$LIBOBJS win32security.$ac_objext"
diff --git a/configure.ac b/configure.ac
index 566a6010dd..e77d4dcf2d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1932,6 +1932,7 @@ if test "$PORTNAME" = "win32"; then
AC_LIBOBJ(system)
AC_LIBOBJ(win32env)
AC_LIBOBJ(win32error)
+ AC_LIBOBJ(win32ntdll)
AC_LIBOBJ(win32security)
AC_LIBOBJ(win32setlocale)
AC_LIBOBJ(win32stat)
diff --git a/src/backend/port/win32/signal.c b/src/backend/port/win32/signal.c
index 580a517f3f..61f06a29f6 100644
--- a/src/backend/port/win32/signal.c
+++ b/src/backend/port/win32/signal.c
@@ -52,7 +52,17 @@ static BOOL WINAPI pg_console_handler(DWORD dwCtrlType);
void
pg_usleep(long microsec)
{
- Assert(pgwin32_signal_event != NULL);
+ if (unlikely(pgwin32_signal_event == NULL))
+ {
+ /*
+ * If we're reached by pgwin32_open_handle() early in startup before
+ * the signal event is set up, just fall back to a regular
+ * non-interruptible sleep.
+ */
+ SleepEx((microsec < 500 ? 1 : (microsec + 500) / 1000), FALSE);
+ return;
+ }
+
if (WaitForSingleObject(pgwin32_signal_event,
(microsec < 500 ? 1 : (microsec + 500) / 1000))
== WAIT_OBJECT_0)
diff --git a/src/include/port.h b/src/include/port.h
index 806fb795ed..fd9c9d6f94 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -296,6 +296,7 @@ extern bool rmtree(const char *path, bool rmtopdir);
* passing of other special options.
*/
#define O_DIRECT 0x80000000
+extern HANDLE pgwin32_open_handle(const char *, int, bool);
extern int pgwin32_open(const char *, int,...);
extern FILE *pgwin32_fopen(const char *, const char *);
#define open(a,b,c) pgwin32_open(a,b,c)
diff --git a/src/include/port/win32_port.h b/src/include/port/win32_port.h
index c1c4831595..cadc34636c 100644
--- a/src/include/port/win32_port.h
+++ b/src/include/port/win32_port.h
@@ -51,9 +51,17 @@
*/
#define WIN32_LEAN_AND_MEAN
+/*
+ * Tell windows.h that we're going to include ntstatus.h, to avoid double
+ * definition errors.
+ */
+ #define UMDF_USING_NTSTATUS
+
#include <winsock2.h>
#include <ws2tcpip.h>
#include <windows.h>
+#include <ntstatus.h>
+#include <winternl.h>
#undef small
#include <process.h>
#include <signal.h>
diff --git a/src/include/port/win32ntdll.h b/src/include/port/win32ntdll.h
new file mode 100644
index 0000000000..c4d6a9c00e
--- /dev/null
+++ b/src/include/port/win32ntdll.h
@@ -0,0 +1,18 @@
+/*-------------------------------------------------------------------------
+ *
+ * win32ntdll.h
+ * Dynamically loaded Windows NT functions.
+ *
+ * Portions Copyright (c) 2021, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/port/win32ntdll.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+typedef NTSTATUS (__stdcall *RtlGetLastNtStatus_t)(void);
+
+extern RtlGetLastNtStatus_t pg_RtlGetLastNtStatus;
+
+extern int initialize_ntdll(void);
diff --git a/src/port/open.c b/src/port/open.c
index 14c6debba9..6c7a70c367 100644
--- a/src/port/open.c
+++ b/src/port/open.c
@@ -19,11 +19,12 @@
#include "postgres_fe.h"
#endif
+#include "port/win32ntdll.h"
+
#include <fcntl.h>
#include <assert.h>
#include <sys/stat.h>
-
static int
openFlagsToCreateFileFlags(int openFlags)
{
@@ -56,38 +57,25 @@ openFlagsToCreateFileFlags(int openFlags)
}
/*
- * - file attribute setting, based on fileMode?
+ * Internal function used by pgwin32_open() and _pgstat64(). When
+ * backup_semantics is true, directories may be opened (for limited uses). On
+ * failure, INVALID_HANDLE_VALUE is returned and errno is set.
*/
-int
-pgwin32_open(const char *fileName, int fileFlags,...)
+HANDLE
+pgwin32_open_handle(const char *fileName, int fileFlags, bool backup_semantics)
{
- int fd;
- HANDLE h = INVALID_HANDLE_VALUE;
+ HANDLE h;
SECURITY_ATTRIBUTES sa;
int loops = 0;
+ if (initialize_ntdll() < 0)
+ return INVALID_HANDLE_VALUE;
+
/* Check that we can handle the request */
assert((fileFlags & ((O_RDONLY | O_WRONLY | O_RDWR) | O_APPEND |
(O_RANDOM | O_SEQUENTIAL | O_TEMPORARY) |
_O_SHORT_LIVED | O_DSYNC | O_DIRECT |
(O_CREAT | O_TRUNC | O_EXCL) | (O_TEXT | O_BINARY))) == fileFlags);
-#ifndef FRONTEND
- Assert(pgwin32_signal_event != NULL); /* small chance of pg_usleep() */
-#endif
-
-#ifdef FRONTEND
-
- /*
- * Since PostgreSQL 12, those concurrent-safe versions of open() and
- * fopen() can be used by frontends, having as side-effect to switch the
- * file-translation mode from O_TEXT to O_BINARY if none is specified.
- * Caller may want to enforce the binary or text mode, but if nothing is
- * defined make sure that the default mode maps with what versions older
- * than 12 have been doing.
- */
- if ((fileFlags & O_BINARY) == 0)
- fileFlags |= O_TEXT;
-#endif
sa.nLength = sizeof(sa);
sa.bInheritHandle = TRUE;
@@ -102,6 +90,7 @@ pgwin32_open(const char *fileName, int fileFlags,...)
&sa,
openFlagsToCreateFileFlags(fileFlags),
FILE_ATTRIBUTE_NORMAL |
+ (backup_semantics ? FILE_FLAG_BACKUP_SEMANTICS : 0) |
((fileFlags & O_RANDOM) ? FILE_FLAG_RANDOM_ACCESS : 0) |
((fileFlags & O_SEQUENTIAL) ? FILE_FLAG_SEQUENTIAL_SCAN : 0) |
((fileFlags & _O_SHORT_LIVED) ? FILE_ATTRIBUTE_TEMPORARY : 0) |
@@ -140,38 +129,55 @@ pgwin32_open(const char *fileName, int fileFlags,...)
/*
* ERROR_ACCESS_DENIED is returned if the file is deleted but not yet
* gone (Windows NT status code is STATUS_DELETE_PENDING). In that
- * case we want to wait a bit and try again, giving up after 1 second
- * (since this condition should never persist very long). However,
- * there are other commonly-hit cases that return ERROR_ACCESS_DENIED,
- * so care is needed. In particular that happens if we try to open a
- * directory, or of course if there's an actual file-permissions
- * problem. To distinguish these cases, try a stat(). In the
- * delete-pending case, it will either also get STATUS_DELETE_PENDING,
- * or it will see the file as gone and fail with ENOENT. In other
- * cases it will usually succeed. The only somewhat-likely case where
- * this coding will uselessly wait is if there's a permissions problem
- * with a containing directory, which we hope will never happen in any
- * performance-critical code paths.
+ * case, we'd better ask for the NT status too so we can translate it
+ * to a more Unix-like error. We hope that nothing clobbers the NT
+ * status in between the internal NtCreateFile() call and CreateFile()
+ * returning.
+ *
+ * If there's no O_CREAT flag, then we'll pretend the file is
+ * invisible. With O_CREAT, we have no choice but to report that
+ * there's a file in the way (which wouldn't happen on Unix).
*/
- if (err == ERROR_ACCESS_DENIED)
+ if (err == ERROR_ACCESS_DENIED &&
+ pg_RtlGetLastNtStatus() == STATUS_DELETE_PENDING)
{
- if (loops < 10)
- {
- struct stat st;
-
- if (stat(fileName, &st) != 0)
- {
- pg_usleep(100000);
- loops++;
- continue;
- }
- }
+ if (fileFlags & O_CREAT)
+ err = ERROR_FILE_EXISTS;
+ else
+ err = ERROR_FILE_NOT_FOUND;
}
_dosmaperr(err);
- return -1;
+ return INVALID_HANDLE_VALUE;
}
+ return h;
+}
+
+int
+pgwin32_open(const char *fileName, int fileFlags,...)
+{
+ HANDLE h;
+ int fd;
+
+ h = pgwin32_open_handle(fileName, fileFlags, false);
+ if (h == INVALID_HANDLE_VALUE)
+ return -1;
+
+#ifdef FRONTEND
+
+ /*
+ * Since PostgreSQL 12, those concurrent-safe versions of open() and
+ * fopen() can be used by frontends, having as side-effect to switch the
+ * file-translation mode from O_TEXT to O_BINARY if none is specified.
+ * Caller may want to enforce the binary or text mode, but if nothing is
+ * defined make sure that the default mode maps with what versions older
+ * than 12 have been doing.
+ */
+ if ((fileFlags & O_BINARY) == 0)
+ fileFlags |= O_TEXT;
+#endif
+
/* _open_osfhandle will, on error, set errno accordingly */
if ((fd = _open_osfhandle((intptr_t) h, fileFlags & O_APPEND)) < 0)
CloseHandle(h); /* will not affect errno */
diff --git a/src/port/win32ntdll.c b/src/port/win32ntdll.c
new file mode 100644
index 0000000000..b010becadf
--- /dev/null
+++ b/src/port/win32ntdll.c
@@ -0,0 +1,64 @@
+/*-------------------------------------------------------------------------
+ *
+ * win32ntdll.c
+ * Dynamically loaded Windows NT functions.
+ *
+ * Portions Copyright (c) 2021, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/port/win32ntdll.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "c.h"
+
+#include "port/win32ntdll.h"
+
+RtlGetLastNtStatus_t pg_RtlGetLastNtStatus;
+
+int
+initialize_ntdll(void)
+{
+ static bool initialized;
+ HMODULE module;
+
+ static const struct
+ {
+ const char *name;
+ pg_funcptr_t *address;
+ } routines[] = {
+ {"RtlGetLastNtStatus", (pg_funcptr_t *) &pg_RtlGetLastNtStatus}
+ };
+
+ if (initialized)
+ return 0;
+
+ if (!(module = LoadLibraryEx("ntdll.dll", NULL, 0)))
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+
+ for (int i = 0; i < lengthof(routines); ++i)
+ {
+ pg_funcptr_t address;
+
+ address = (pg_funcptr_t) GetProcAddress(module, routines[i].name);
+ if (!address)
+ {
+ _dosmaperr(GetLastError());
+ FreeLibrary(module);
+
+ return -1;
+ }
+
+ *(pg_funcptr_t *) routines[i].address = address;
+ }
+
+ initialized = true;
+
+ return 0;
+}
diff --git a/src/port/win32stat.c b/src/port/win32stat.c
index 426e01f0ef..29e13409e6 100644
--- a/src/port/win32stat.c
+++ b/src/port/win32stat.c
@@ -18,53 +18,6 @@
#include "c.h"
#include <windows.h>
-/*
- * In order to support MinGW and MSVC2013 we use NtQueryInformationFile as an
- * alternative for GetFileInformationByHandleEx. It is loaded from the ntdll
- * library.
- */
-#if _WIN32_WINNT < 0x0600
-#include <winternl.h>
-
-#if !defined(__MINGW32__) && !defined(__MINGW64__)
-/* MinGW includes this in <winternl.h>, but it is missing in MSVC */
-typedef struct _FILE_STANDARD_INFORMATION
-{
- LARGE_INTEGER AllocationSize;
- LARGE_INTEGER EndOfFile;
- ULONG NumberOfLinks;
- BOOLEAN DeletePending;
- BOOLEAN Directory;
-} FILE_STANDARD_INFORMATION;
-#define FileStandardInformation 5
-#endif /* !defined(__MINGW32__) &&
- * !defined(__MINGW64__) */
-
-typedef NTSTATUS (NTAPI * PFN_NTQUERYINFORMATIONFILE)
- (IN HANDLE FileHandle,
- OUT PIO_STATUS_BLOCK IoStatusBlock,
- OUT PVOID FileInformation,
- IN ULONG Length,
- IN FILE_INFORMATION_CLASS FileInformationClass);
-
-static PFN_NTQUERYINFORMATIONFILE _NtQueryInformationFile = NULL;
-
-static HMODULE ntdll = NULL;
-
-/*
- * Load DLL file just once regardless of how many functions we load/call in it.
- */
-static void
-LoadNtdll(void)
-{
- if (ntdll != NULL)
- return;
- ntdll = LoadLibraryEx("ntdll.dll", NULL, 0);
-}
-
-#endif /* _WIN32_WINNT < 0x0600 */
-
-
/*
* Convert a FILETIME struct into a 64 bit time_t.
*/
@@ -162,120 +115,18 @@ int
_pgstat64(const char *name, struct stat *buf)
{
/*
- * We must use a handle so lstat() returns the information of the target
- * file. To have a reliable test for ERROR_DELETE_PENDING, we use
- * NtQueryInformationFile from Windows 2000 or
- * GetFileInformationByHandleEx from Server 2008 / Vista.
+ * Our open wrapper will report STATUS_DELETE_PENDING as ENOENT. We
+ * request FILE_FLAG_BACKUP_SEMANTICS so that we can open directories too,
+ * for limited purposes. We use the private handle-based version, so we
+ * don't risk running out of fds.
*/
- SECURITY_ATTRIBUTES sa;
HANDLE hFile;
int ret;
-#if _WIN32_WINNT < 0x0600
- IO_STATUS_BLOCK ioStatus;
- FILE_STANDARD_INFORMATION standardInfo;
-#else
- FILE_STANDARD_INFO standardInfo;
-#endif
-
- if (name == NULL || buf == NULL)
- {
- errno = EINVAL;
- return -1;
- }
- /* fast not-exists check */
- if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES)
- {
- _dosmaperr(GetLastError());
- return -1;
- }
-
- /* get a file handle as lightweight as we can */
- sa.nLength = sizeof(SECURITY_ATTRIBUTES);
- sa.bInheritHandle = TRUE;
- sa.lpSecurityDescriptor = NULL;
- hFile = CreateFile(name,
- GENERIC_READ,
- (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
- &sa,
- OPEN_EXISTING,
- (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
- FILE_FLAG_OVERLAPPED),
- NULL);
+ hFile = pgwin32_open_handle(name, O_RDONLY, true);
if (hFile == INVALID_HANDLE_VALUE)
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
return -1;
- }
-
- memset(&standardInfo, 0, sizeof(standardInfo));
-
-#if _WIN32_WINNT < 0x0600
- if (_NtQueryInformationFile == NULL)
- {
- /* First time through: load ntdll.dll and find NtQueryInformationFile */
- LoadNtdll();
- if (ntdll == NULL)
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-
- _NtQueryInformationFile = (PFN_NTQUERYINFORMATIONFILE) (pg_funcptr_t)
- GetProcAddress(ntdll, "NtQueryInformationFile");
- if (_NtQueryInformationFile == NULL)
- {
- DWORD err = GetLastError();
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
- }
-
- if (!NT_SUCCESS(_NtQueryInformationFile(hFile, &ioStatus, &standardInfo,
- sizeof(standardInfo),
- FileStandardInformation)))
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-#else
- if (!GetFileInformationByHandleEx(hFile, FileStandardInfo, &standardInfo,
- sizeof(standardInfo)))
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-#endif /* _WIN32_WINNT < 0x0600 */
-
- if (standardInfo.DeletePending)
- {
- /*
- * File has been deleted, but is not gone from the filesystem yet.
- * This can happen when some process with FILE_SHARE_DELETE has it
- * open, and it will be fully removed once that handle is closed.
- * Meanwhile, we can't open it, so indicate that the file just doesn't
- * exist.
- */
- CloseHandle(hFile);
- errno = ENOENT;
- return -1;
- }
-
- /* At last we can invoke fileinfo_to_stat */
ret = fileinfo_to_stat(hFile, buf);
CloseHandle(hFile);
@@ -316,11 +167,6 @@ _pgfstat64(int fileno, struct stat *buf)
return 0;
}
- /*
- * Since we already have a file handle there is no need to check for
- * ERROR_DELETE_PENDING.
- */
-
return fileinfo_to_stat(hFile, buf);
}
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 5a374a4727..404c45a6f3 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -107,7 +107,8 @@ sub mkvcbuild
pg_strong_random.c pgcheckdir.c pgmkdirp.c pgsleep.c pgstrcasecmp.c
pqsignal.c mkdtemp.c qsort.c qsort_arg.c bsearch_arg.c quotes.c system.c
strerror.c tar.c thread.c
- win32env.c win32error.c win32security.c win32setlocale.c win32stat.c);
+ win32env.c win32error.c win32ntdll.c
+ win32security.c win32setlocale.c win32stat.c);
push(@pgportfiles, 'strtof.c') if ($vsVersion < '14.00');
--
2.33.1
On Mon, Dec 6, 2021 at 9:17 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Sat, Dec 4, 2021 at 6:18 PM Thomas Munro <thomas.munro@gmail.com> wrote:
I think this was broken by WIN32_LEAN_AND_MEAN (and since gained a
merge conflict, but that's easy to fix). I'll try to figure out the
right system header hacks to unbreak it...Short version: It needed <winternl.h>.
Slightly improvement: now I include <winternl.h> only from
src/port/open.c and src/port/win32ntdll.c, so I avoid the extra
include for the other ~1500 translation units. That requires a small
extra step to work, see comment in win32ntdll.h. I checked that this
still cross-compiles OK under mingw on Linux. This is the version
that I'm planning to push to master only tomorrow if there are no
objections.
Attachments:
v5-0001-Check-for-STATUS_DELETE_PENDING-on-Windows.patchtext/x-patch; charset=UTF-8; name=v5-0001-Check-for-STATUS_DELETE_PENDING-on-Windows.patchDownload
From d04cee422b37b299443ea6aeec651336a5f2c85a Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sun, 5 Sep 2021 23:49:23 +1200
Subject: [PATCH v5 1/3] Check for STATUS_DELETE_PENDING on Windows.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
1. Update our open() wrapper to check for NT's STATUS_DELETE_PENDING
and translate it to appropriate errors. This is done with
RtlGetLastNtStatus(), which is dynamically loaded from ntdll. A new
file win32ntdll.c centralizes lookup of NT functions, in case we decide
to add more in the future.
2. Remove non-working code that was trying to do something similar for
stat(), and just reuse the open() wrapper code. As a side effect,
stat() also gains resilience against "sharing violation" errors.
3. Since stat() is used very early in process startup, remove the
requirement that the Win32 signal event has been created before
pgwin32_open_handle() is reached. Instead, teach pg_usleep() to fall
back to a non-interruptible sleep if reached before the signal event is
available.
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGJz_pZTF9mckn6XgSv69%2BjGwdgLkxZ6b3NWGLBCVjqUZA%40mail.gmail.com
---
configure | 6 ++
configure.ac | 1 +
src/backend/port/win32/signal.c | 12 ++-
src/include/port.h | 1 +
src/include/port/win32ntdll.h | 27 ++++++
src/port/open.c | 104 ++++++++++----------
src/port/win32ntdll.c | 66 +++++++++++++
src/port/win32stat.c | 164 +-------------------------------
src/tools/msvc/Mkvcbuild.pm | 3 +-
9 files changed, 175 insertions(+), 209 deletions(-)
create mode 100644 src/include/port/win32ntdll.h
create mode 100644 src/port/win32ntdll.c
diff --git a/configure b/configure
index 5f842a86b2..3b19105328 100755
--- a/configure
+++ b/configure
@@ -16738,6 +16738,12 @@ esac
;;
esac
+ case " $LIBOBJS " in
+ *" win32ntdll.$ac_objext "* ) ;;
+ *) LIBOBJS="$LIBOBJS win32ntdll.$ac_objext"
+ ;;
+esac
+
case " $LIBOBJS " in
*" win32security.$ac_objext "* ) ;;
*) LIBOBJS="$LIBOBJS win32security.$ac_objext"
diff --git a/configure.ac b/configure.ac
index 566a6010dd..e77d4dcf2d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1932,6 +1932,7 @@ if test "$PORTNAME" = "win32"; then
AC_LIBOBJ(system)
AC_LIBOBJ(win32env)
AC_LIBOBJ(win32error)
+ AC_LIBOBJ(win32ntdll)
AC_LIBOBJ(win32security)
AC_LIBOBJ(win32setlocale)
AC_LIBOBJ(win32stat)
diff --git a/src/backend/port/win32/signal.c b/src/backend/port/win32/signal.c
index 580a517f3f..61f06a29f6 100644
--- a/src/backend/port/win32/signal.c
+++ b/src/backend/port/win32/signal.c
@@ -52,7 +52,17 @@ static BOOL WINAPI pg_console_handler(DWORD dwCtrlType);
void
pg_usleep(long microsec)
{
- Assert(pgwin32_signal_event != NULL);
+ if (unlikely(pgwin32_signal_event == NULL))
+ {
+ /*
+ * If we're reached by pgwin32_open_handle() early in startup before
+ * the signal event is set up, just fall back to a regular
+ * non-interruptible sleep.
+ */
+ SleepEx((microsec < 500 ? 1 : (microsec + 500) / 1000), FALSE);
+ return;
+ }
+
if (WaitForSingleObject(pgwin32_signal_event,
(microsec < 500 ? 1 : (microsec + 500) / 1000))
== WAIT_OBJECT_0)
diff --git a/src/include/port.h b/src/include/port.h
index 806fb795ed..fd9c9d6f94 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -296,6 +296,7 @@ extern bool rmtree(const char *path, bool rmtopdir);
* passing of other special options.
*/
#define O_DIRECT 0x80000000
+extern HANDLE pgwin32_open_handle(const char *, int, bool);
extern int pgwin32_open(const char *, int,...);
extern FILE *pgwin32_fopen(const char *, const char *);
#define open(a,b,c) pgwin32_open(a,b,c)
diff --git a/src/include/port/win32ntdll.h b/src/include/port/win32ntdll.h
new file mode 100644
index 0000000000..4d8808b3aa
--- /dev/null
+++ b/src/include/port/win32ntdll.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * win32ntdll.h
+ * Dynamically loaded Windows NT functions.
+ *
+ * Portions Copyright (c) 2021, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/port/win32ntdll.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+/*
+ * Because this includes NT headers that normally conflict with Win32 headers,
+ * any translation unit that includes it should #define UMDF_USING_NTSTATUS
+ * before including <windows.h>.
+ */
+
+#include <ntstatus.h>
+#include <winternl.h>
+
+typedef NTSTATUS (__stdcall *RtlGetLastNtStatus_t) (void);
+
+extern RtlGetLastNtStatus_t pg_RtlGetLastNtStatus;
+
+extern int initialize_ntdll(void);
diff --git a/src/port/open.c b/src/port/open.c
index 14c6debba9..7b52bd83c2 100644
--- a/src/port/open.c
+++ b/src/port/open.c
@@ -13,17 +13,20 @@
#ifdef WIN32
+#define UMDF_USING_NTSTATUS
+
#ifndef FRONTEND
#include "postgres.h"
#else
#include "postgres_fe.h"
#endif
+#include "port/win32ntdll.h"
+
#include <fcntl.h>
#include <assert.h>
#include <sys/stat.h>
-
static int
openFlagsToCreateFileFlags(int openFlags)
{
@@ -56,38 +59,25 @@ openFlagsToCreateFileFlags(int openFlags)
}
/*
- * - file attribute setting, based on fileMode?
+ * Internal function used by pgwin32_open() and _pgstat64(). When
+ * backup_semantics is true, directories may be opened (for limited uses). On
+ * failure, INVALID_HANDLE_VALUE is returned and errno is set.
*/
-int
-pgwin32_open(const char *fileName, int fileFlags,...)
+HANDLE
+pgwin32_open_handle(const char *fileName, int fileFlags, bool backup_semantics)
{
- int fd;
- HANDLE h = INVALID_HANDLE_VALUE;
+ HANDLE h;
SECURITY_ATTRIBUTES sa;
int loops = 0;
+ if (initialize_ntdll() < 0)
+ return INVALID_HANDLE_VALUE;
+
/* Check that we can handle the request */
assert((fileFlags & ((O_RDONLY | O_WRONLY | O_RDWR) | O_APPEND |
(O_RANDOM | O_SEQUENTIAL | O_TEMPORARY) |
_O_SHORT_LIVED | O_DSYNC | O_DIRECT |
(O_CREAT | O_TRUNC | O_EXCL) | (O_TEXT | O_BINARY))) == fileFlags);
-#ifndef FRONTEND
- Assert(pgwin32_signal_event != NULL); /* small chance of pg_usleep() */
-#endif
-
-#ifdef FRONTEND
-
- /*
- * Since PostgreSQL 12, those concurrent-safe versions of open() and
- * fopen() can be used by frontends, having as side-effect to switch the
- * file-translation mode from O_TEXT to O_BINARY if none is specified.
- * Caller may want to enforce the binary or text mode, but if nothing is
- * defined make sure that the default mode maps with what versions older
- * than 12 have been doing.
- */
- if ((fileFlags & O_BINARY) == 0)
- fileFlags |= O_TEXT;
-#endif
sa.nLength = sizeof(sa);
sa.bInheritHandle = TRUE;
@@ -102,6 +92,7 @@ pgwin32_open(const char *fileName, int fileFlags,...)
&sa,
openFlagsToCreateFileFlags(fileFlags),
FILE_ATTRIBUTE_NORMAL |
+ (backup_semantics ? FILE_FLAG_BACKUP_SEMANTICS : 0) |
((fileFlags & O_RANDOM) ? FILE_FLAG_RANDOM_ACCESS : 0) |
((fileFlags & O_SEQUENTIAL) ? FILE_FLAG_SEQUENTIAL_SCAN : 0) |
((fileFlags & _O_SHORT_LIVED) ? FILE_ATTRIBUTE_TEMPORARY : 0) |
@@ -140,38 +131,55 @@ pgwin32_open(const char *fileName, int fileFlags,...)
/*
* ERROR_ACCESS_DENIED is returned if the file is deleted but not yet
* gone (Windows NT status code is STATUS_DELETE_PENDING). In that
- * case we want to wait a bit and try again, giving up after 1 second
- * (since this condition should never persist very long). However,
- * there are other commonly-hit cases that return ERROR_ACCESS_DENIED,
- * so care is needed. In particular that happens if we try to open a
- * directory, or of course if there's an actual file-permissions
- * problem. To distinguish these cases, try a stat(). In the
- * delete-pending case, it will either also get STATUS_DELETE_PENDING,
- * or it will see the file as gone and fail with ENOENT. In other
- * cases it will usually succeed. The only somewhat-likely case where
- * this coding will uselessly wait is if there's a permissions problem
- * with a containing directory, which we hope will never happen in any
- * performance-critical code paths.
+ * case, we'd better ask for the NT status too so we can translate it
+ * to a more Unix-like error. We hope that nothing clobbers the NT
+ * status in between the internal NtCreateFile() call and CreateFile()
+ * returning.
+ *
+ * If there's no O_CREAT flag, then we'll pretend the file is
+ * invisible. With O_CREAT, we have no choice but to report that
+ * there's a file in the way (which wouldn't happen on Unix).
*/
- if (err == ERROR_ACCESS_DENIED)
+ if (err == ERROR_ACCESS_DENIED &&
+ pg_RtlGetLastNtStatus() == STATUS_DELETE_PENDING)
{
- if (loops < 10)
- {
- struct stat st;
-
- if (stat(fileName, &st) != 0)
- {
- pg_usleep(100000);
- loops++;
- continue;
- }
- }
+ if (fileFlags & O_CREAT)
+ err = ERROR_FILE_EXISTS;
+ else
+ err = ERROR_FILE_NOT_FOUND;
}
_dosmaperr(err);
- return -1;
+ return INVALID_HANDLE_VALUE;
}
+ return h;
+}
+
+int
+pgwin32_open(const char *fileName, int fileFlags,...)
+{
+ HANDLE h;
+ int fd;
+
+ h = pgwin32_open_handle(fileName, fileFlags, false);
+ if (h == INVALID_HANDLE_VALUE)
+ return -1;
+
+#ifdef FRONTEND
+
+ /*
+ * Since PostgreSQL 12, those concurrent-safe versions of open() and
+ * fopen() can be used by frontends, having as side-effect to switch the
+ * file-translation mode from O_TEXT to O_BINARY if none is specified.
+ * Caller may want to enforce the binary or text mode, but if nothing is
+ * defined make sure that the default mode maps with what versions older
+ * than 12 have been doing.
+ */
+ if ((fileFlags & O_BINARY) == 0)
+ fileFlags |= O_TEXT;
+#endif
+
/* _open_osfhandle will, on error, set errno accordingly */
if ((fd = _open_osfhandle((intptr_t) h, fileFlags & O_APPEND)) < 0)
CloseHandle(h); /* will not affect errno */
diff --git a/src/port/win32ntdll.c b/src/port/win32ntdll.c
new file mode 100644
index 0000000000..4d60d00f7d
--- /dev/null
+++ b/src/port/win32ntdll.c
@@ -0,0 +1,66 @@
+/*-------------------------------------------------------------------------
+ *
+ * win32ntdll.c
+ * Dynamically loaded Windows NT functions.
+ *
+ * Portions Copyright (c) 2021, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/port/win32ntdll.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#define UMDF_USING_NTSTATUS
+
+#include "c.h"
+
+#include "port/win32ntdll.h"
+
+RtlGetLastNtStatus_t pg_RtlGetLastNtStatus;
+
+int
+initialize_ntdll(void)
+{
+ static bool initialized;
+ HMODULE module;
+
+ static const struct
+ {
+ const char *name;
+ pg_funcptr_t *address;
+ } routines[] = {
+ {"RtlGetLastNtStatus", (pg_funcptr_t *) &pg_RtlGetLastNtStatus}
+ };
+
+ if (initialized)
+ return 0;
+
+ if (!(module = LoadLibraryEx("ntdll.dll", NULL, 0)))
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+
+ for (int i = 0; i < lengthof(routines); ++i)
+ {
+ pg_funcptr_t address;
+
+ address = (pg_funcptr_t) GetProcAddress(module, routines[i].name);
+ if (!address)
+ {
+ _dosmaperr(GetLastError());
+ FreeLibrary(module);
+
+ return -1;
+ }
+
+ *(pg_funcptr_t *) routines[i].address = address;
+ }
+
+ initialized = true;
+
+ return 0;
+}
diff --git a/src/port/win32stat.c b/src/port/win32stat.c
index 426e01f0ef..29e13409e6 100644
--- a/src/port/win32stat.c
+++ b/src/port/win32stat.c
@@ -18,53 +18,6 @@
#include "c.h"
#include <windows.h>
-/*
- * In order to support MinGW and MSVC2013 we use NtQueryInformationFile as an
- * alternative for GetFileInformationByHandleEx. It is loaded from the ntdll
- * library.
- */
-#if _WIN32_WINNT < 0x0600
-#include <winternl.h>
-
-#if !defined(__MINGW32__) && !defined(__MINGW64__)
-/* MinGW includes this in <winternl.h>, but it is missing in MSVC */
-typedef struct _FILE_STANDARD_INFORMATION
-{
- LARGE_INTEGER AllocationSize;
- LARGE_INTEGER EndOfFile;
- ULONG NumberOfLinks;
- BOOLEAN DeletePending;
- BOOLEAN Directory;
-} FILE_STANDARD_INFORMATION;
-#define FileStandardInformation 5
-#endif /* !defined(__MINGW32__) &&
- * !defined(__MINGW64__) */
-
-typedef NTSTATUS (NTAPI * PFN_NTQUERYINFORMATIONFILE)
- (IN HANDLE FileHandle,
- OUT PIO_STATUS_BLOCK IoStatusBlock,
- OUT PVOID FileInformation,
- IN ULONG Length,
- IN FILE_INFORMATION_CLASS FileInformationClass);
-
-static PFN_NTQUERYINFORMATIONFILE _NtQueryInformationFile = NULL;
-
-static HMODULE ntdll = NULL;
-
-/*
- * Load DLL file just once regardless of how many functions we load/call in it.
- */
-static void
-LoadNtdll(void)
-{
- if (ntdll != NULL)
- return;
- ntdll = LoadLibraryEx("ntdll.dll", NULL, 0);
-}
-
-#endif /* _WIN32_WINNT < 0x0600 */
-
-
/*
* Convert a FILETIME struct into a 64 bit time_t.
*/
@@ -162,120 +115,18 @@ int
_pgstat64(const char *name, struct stat *buf)
{
/*
- * We must use a handle so lstat() returns the information of the target
- * file. To have a reliable test for ERROR_DELETE_PENDING, we use
- * NtQueryInformationFile from Windows 2000 or
- * GetFileInformationByHandleEx from Server 2008 / Vista.
+ * Our open wrapper will report STATUS_DELETE_PENDING as ENOENT. We
+ * request FILE_FLAG_BACKUP_SEMANTICS so that we can open directories too,
+ * for limited purposes. We use the private handle-based version, so we
+ * don't risk running out of fds.
*/
- SECURITY_ATTRIBUTES sa;
HANDLE hFile;
int ret;
-#if _WIN32_WINNT < 0x0600
- IO_STATUS_BLOCK ioStatus;
- FILE_STANDARD_INFORMATION standardInfo;
-#else
- FILE_STANDARD_INFO standardInfo;
-#endif
-
- if (name == NULL || buf == NULL)
- {
- errno = EINVAL;
- return -1;
- }
- /* fast not-exists check */
- if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES)
- {
- _dosmaperr(GetLastError());
- return -1;
- }
-
- /* get a file handle as lightweight as we can */
- sa.nLength = sizeof(SECURITY_ATTRIBUTES);
- sa.bInheritHandle = TRUE;
- sa.lpSecurityDescriptor = NULL;
- hFile = CreateFile(name,
- GENERIC_READ,
- (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
- &sa,
- OPEN_EXISTING,
- (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
- FILE_FLAG_OVERLAPPED),
- NULL);
+ hFile = pgwin32_open_handle(name, O_RDONLY, true);
if (hFile == INVALID_HANDLE_VALUE)
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
return -1;
- }
-
- memset(&standardInfo, 0, sizeof(standardInfo));
-
-#if _WIN32_WINNT < 0x0600
- if (_NtQueryInformationFile == NULL)
- {
- /* First time through: load ntdll.dll and find NtQueryInformationFile */
- LoadNtdll();
- if (ntdll == NULL)
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-
- _NtQueryInformationFile = (PFN_NTQUERYINFORMATIONFILE) (pg_funcptr_t)
- GetProcAddress(ntdll, "NtQueryInformationFile");
- if (_NtQueryInformationFile == NULL)
- {
- DWORD err = GetLastError();
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
- }
-
- if (!NT_SUCCESS(_NtQueryInformationFile(hFile, &ioStatus, &standardInfo,
- sizeof(standardInfo),
- FileStandardInformation)))
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-#else
- if (!GetFileInformationByHandleEx(hFile, FileStandardInfo, &standardInfo,
- sizeof(standardInfo)))
- {
- DWORD err = GetLastError();
-
- CloseHandle(hFile);
- _dosmaperr(err);
- return -1;
- }
-#endif /* _WIN32_WINNT < 0x0600 */
-
- if (standardInfo.DeletePending)
- {
- /*
- * File has been deleted, but is not gone from the filesystem yet.
- * This can happen when some process with FILE_SHARE_DELETE has it
- * open, and it will be fully removed once that handle is closed.
- * Meanwhile, we can't open it, so indicate that the file just doesn't
- * exist.
- */
- CloseHandle(hFile);
- errno = ENOENT;
- return -1;
- }
-
- /* At last we can invoke fileinfo_to_stat */
ret = fileinfo_to_stat(hFile, buf);
CloseHandle(hFile);
@@ -316,11 +167,6 @@ _pgfstat64(int fileno, struct stat *buf)
return 0;
}
- /*
- * Since we already have a file handle there is no need to check for
- * ERROR_DELETE_PENDING.
- */
-
return fileinfo_to_stat(hFile, buf);
}
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 5a374a4727..404c45a6f3 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -107,7 +107,8 @@ sub mkvcbuild
pg_strong_random.c pgcheckdir.c pgmkdirp.c pgsleep.c pgstrcasecmp.c
pqsignal.c mkdtemp.c qsort.c qsort_arg.c bsearch_arg.c quotes.c system.c
strerror.c tar.c thread.c
- win32env.c win32error.c win32security.c win32setlocale.c win32stat.c);
+ win32env.c win32error.c win32ntdll.c
+ win32security.c win32setlocale.c win32stat.c);
push(@pgportfiles, 'strtof.c') if ($vsVersion < '14.00');
--
2.33.1
On Thu, Dec 9, 2021 at 9:16 PM Thomas Munro <thomas.munro@gmail.com> wrote:
Slightly improvement: now I include <winternl.h> only from
src/port/open.c and src/port/win32ntdll.c, so I avoid the extra
include for the other ~1500 translation units. That requires a small
extra step to work, see comment in win32ntdll.h. I checked that this
still cross-compiles OK under mingw on Linux. This is the version
that I'm planning to push to master only tomorrow if there are no
objections.
Done. I'll keep an eye on the build farm.