Understanding, testing and improving our Windows filesystem code
Hi,
As a card-carrying Unix hacker, I think it'd be great to remove the
differences between platforms if possible using newer Windows
facilities, so everything just works the way we expect. Two things
that stopped progress on that front in the past were (1) uncertainty
about OS versioning, fixed in v16 with an EOL purge, and (2)
uncertainty about what the new interfaces really do, due to lack of
good ways to test, which I'd like to address here.
My goals in this thread:
* introduce a pattern/idiom for TAP-testing of low level C code
without a database server
* demonstrate the behaviour of our filesystem porting code with full coverage
* fix reported bugs in my recent symlink changes with coverage
* understand the new "POSIX semantics" changes in Windows 10
* figure out what our policy should be on "POSIX semantics"
For context, we have a bunch of stuff under src/port to provide
POSIX-like implementations of:
open()*
fstat(), stat()*, lstat()*
link(), unlink()*, rename()*
symlink(), readlink()
opendir(), readdir(), closedir()
pg_pwrite(), pg_pread()
These call equivalent Windows system interfaces so we can mostly just
write code that assumes the whole world is a POSIX. Some of them also
deal with three special aspects of Windows file handles, which
occasionally cause trouble:
1. errno == EACCES && GetLastError() == ERROR_SHARING_VIOLATION:
This happens if you try to access a file that has been opened by
without FILE_SHARE_ flags to allow concurrent access. While our own
open() wrapper uses those flags, other programs might not. The
wrapper functions marked with an asterix above deal with this
condition by sleeping or retrying for 10 or 30 seconds, in the hope
that the external program goes away. AFAIK this problem will always
be with us.
2. errno == EACCES && GetLastNtStatus() == STATUS_DELETE_PENDING:
This happens if you try to access a directory entry that is scheduled
for asynchronous unlink, but is still present until all handles to the
file are closed. The wrapper functions above deal with this in
various different ways:
open() without O_CREAT: -> ENOENT, so we can pretend that unlink()
calls are synchronous
open() with O_CREAT: -> EEXIST, the zombie dirent wins
stat(), lstat(): -> ENOENT
unlink(), rename(): retry, same as we do for ERROR_SHARING_VIOLATION,
until timeout or asynchonous unlink completes (this may have been
unintentional due to same errno?)
3. errno == EACCESS && <not sure>: You can't MoveFileEx() on top of
a file that someone has open.
In Windows 10, a new "POSIX semantics" mode was added. Yippee!
Victor Spirin proposed[1]https://commitfest.postgresql.org/40/3347/ that we use it several commitfests ago.
Interestingly, on some systems it is already partially activated
without any change on our part. That is, on some systems, unlink()
works synchronously (when the call returns, the dirent is gone, even
if someone else has the file open, just like Unix). Sounds great, but
in testing different Windows systems I have access to using the
attached test suite I found three different sets of behaviour:
A) Using Windows unlink() and MoveFileEx() on Server 2019 (CI) I get
traditional STATUS_DELETE_PENDING problems
B) Using Windows unlink()/MoveFileEx() on Windows 10 Home (local VM)
I get mostly POSIX behaviour, except problem (3) above, which you can
see in my test suite
C) Using Windows new SetFileInformationByHandle() calls with explicit
request for POSIX semantics (this syscall is something like fcntl(), a
kitchen sink kernel interface, and is what unlink() and MoveFileEx()
and others things are built on, but if you do it yourself you can
reach more flags) I get full POSIX behaviour according to my test
suite, i.e. agreement with FreeBSD and Linux for the dirent-related
cases I've though about so far, on both of those Windows systems
It sounds like we want option C, as Victor proposed, but I'm not sure
what happens if you try to use it on a non-NTFS filesystem (does it
quietly fall back to non-POSIX semantics, or fail, or do all file
systems now support this?). I'm also not sure if we really support
running on a non-NTFS filesystem, not being a Windows user myself.
So the questions I have are:
* any thoughts on this C TAP testing system?
* has anyone got a non-EOL'd OS version where this test suite fails?
* has anyone got a relevant filesystem where this fails? which way
do ReFS and SMB go? do the new calls in 0010 just fail, and if so
with which code (ie could we add our own fallback path)?
* which filesystems do we even claim to support?
* if we switched to explicitly using POSIX-semantics like in the 0010
patch, I assume there would be nothing left in the build farm or CI
that tests the non-POSIX code paths (either in these tests or in the
real server code), and the non-POSIX support would decay into
non-working form pretty quickly
* if there are any filesystems that don't support POSIX-semantics,
would we want to either (1) get such a thing into the build farm so
it's tested or (2) de-support non-POSIX-semantics filesystems by
edict, and drop a lot of code and problems that everyone hates?
Thanks to Andres for the 0002 meson support. I have not yet written
autoconf support; I guess I'd have to do that.
You can run this with eg "meson test --suite=port -v" on any OS. The
first test result tells you whether it detected POSIX semantics or
not, which affects later testing. Unix systems are always detected as
POSIX, Windows 10+ systems are always POSIX if you apply the final
patch, but could be either depending on your Windows version if you
don't, except they still have the quirk about problem (3) above for
some reason, which is why the relevant test changes in the final
patch.
(Note: The 0010 patch fails on the CI CompilerCheck cross build, which
I think has to do with wanting _WIN32_WINNT >= 0xA02 to see some
definitions, not looked into yet, and I haven't thought much about
Cygwin, but I expect they turn on all the POSIX things under the
covers too.)
Attachments:
0001-Add-suite-of-macros-for-writing-TAP-tests-in-C.patchtext/x-patch; charset=US-ASCII; name=0001-Add-suite-of-macros-for-writing-TAP-tests-in-C.patchDownload
From 463434b4162003fd3676b2cdd99b257075ea63a4 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Tue, 11 Oct 2022 17:26:02 +1300
Subject: [PATCH 01/10] Add suite of macros for writing TAP tests in C.
Historically we had to load test modules into a PostgreSQL backend or
write an ad-hoc standalone program to test C code. Let's provide a
convenient way to write stand-alone tests for low-level C code.
This commit defines a small set of macros that spit out results in TAP
format. These can be improved and extended as required.
---
src/include/lib/pg_tap.h | 138 +++++++++++++++++++++++++++++++++++++++
1 file changed, 138 insertions(+)
create mode 100644 src/include/lib/pg_tap.h
diff --git a/src/include/lib/pg_tap.h b/src/include/lib/pg_tap.h
new file mode 100644
index 0000000000..3da46ac5d5
--- /dev/null
+++ b/src/include/lib/pg_tap.h
@@ -0,0 +1,138 @@
+/*
+ * Simple macros for writing tests in C that print results in TAP format,
+ * as consumed by "prove".
+ *
+ * Specification for the output format: https://testanything.org/
+ */
+
+#ifndef PG_TAP_H
+#define PG_TAP_H
+
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+
+/* Counters are global, so we can break our tests into multiple functions. */
+static int pg_test_count;
+static int pg_fail_count;
+static int pg_todo_count;
+
+/*
+ * Require an expression to be true. Used for set-up steps that are not
+ * reported as a test.
+ */
+#define PG_REQUIRE(expr) \
+if (!(expr)) { \
+ printf("Bail out! requirement (" #expr ") failed at %s:%d\n", \
+ __FILE__, __LINE__); \
+ exit(EXIT_FAILURE); \
+}
+
+/*
+ * Like PG_REQUIRE, but log strerror(errno) before bailing.
+ */
+#define PG_REQUIRE_SYS(expr) \
+if (!(expr)) { \
+ printf("Bail out! requirement (" #expr ") failed at %s:%d, error: %s\n", \
+ __FILE__, __LINE__, strerror(errno)); \
+ exit(EXIT_FAILURE); \
+}
+
+/*
+ * Test that an expression is true. An optional message can be provided,
+ * defaulting to the expression itself if not provided.
+ */
+#define PG_EXPECT(expr, ...) \
+do { \
+ const char *messages[] = {#expr, __VA_ARGS__}; \
+ const char *message = messages[lengthof(messages) - 1]; \
+ pg_test_count++; \
+ if (expr) { \
+ printf("ok %d - %s\n", pg_test_count, message); \
+ } else { \
+ if (pg_todo_count == 0) \
+ pg_fail_count++; \
+ printf("not ok %d - %s (at %s:%d)%s\n", pg_test_count, \
+ message, __FILE__, __LINE__, \
+ pg_todo_count > 0 ? " # TODO" : ""); \
+ } \
+} while (0)
+
+/*
+ * Like PG_EXPECT(), but also log strerror(errno) on failure.
+ */
+#define PG_EXPECT_SYS(expr, ...) \
+do { \
+ const char *messages[] = {#expr, __VA_ARGS__}; \
+ const char *message = messages[lengthof(messages) - 1]; \
+ pg_test_count++; \
+ if (expr) { \
+ printf("ok %d - %s\n", pg_test_count, message); \
+ } else { \
+ if (pg_todo_count == 0) \
+ pg_fail_count++; \
+ printf("not ok %d - %s (at %s:%d), error: %s%s\n", pg_test_count, \
+ message, __FILE__, __LINE__, strerror(errno), \
+ pg_todo_count > 0 ? " # TODO" : ""); \
+ } \
+} while (0)
+
+
+/*
+ * Test that one int expression is equal to another, logging the values if not.
+ */
+#define PG_EXPECT_EQ(expr1, expr2, ...) \
+do { \
+ const char *messages[] = {#expr1 " == " #expr2, __VA_ARGS__}; \
+ const char *message = messages[lengthof(messages) - 1]; \
+ int expr1_val = (expr1); \
+ int expr2_val = (expr2); \
+ pg_test_count++; \
+ if (expr1_val == expr2_val) { \
+ printf("ok %d - %s\n", pg_test_count, message); \
+ } else { \
+ if (pg_todo_count == 0) \
+ pg_fail_count++; \
+ printf("not ok %d - %s: %d != %d (at %s:%d)%s\n", pg_test_count, \
+ message, expr1_val, expr2_val, __FILE__, __LINE__, \
+ pg_todo_count > 0 ? " # TODO" : ""); \
+ } \
+} while (0)
+
+/*
+ * Test that one C string expression is equal to another, logging the values if
+ * not.
+ */
+#define PG_EXPECT_EQ_STR(expr1, expr2, ...) \
+do { \
+ const char *messages[] = {#expr1 " matches " #expr2, __VA_ARGS__}; \
+ const char *message = messages[lengthof(messages) - 1]; \
+ const char *expr1_val = (expr1); \
+ const char *expr2_val = (expr2); \
+ pg_test_count++; \
+ if (strcmp(expr1_val, expr2_val) == 0) { \
+ printf("ok %d - %s\n", pg_test_count, message); \
+ } else { \
+ if (pg_todo_count == 0) \
+ pg_fail_count++; \
+ printf("not ok %d - %s: \"%s\" vs \"%s\" (at %s:%d) %s\n", \
+ pg_test_count, \
+ message, expr1_val, expr2_val, __FILE__, __LINE__, \
+ pg_todo_count > 0 ? " # TODO" : ""); \
+ } \
+} while (0)
+
+/*
+ * The main function of a test program should begin and end with these
+ * functions.
+ */
+#define PG_BEGIN_TESTS() \
+ setbuf(stdout, NULL); /* disable buffering for Windows */
+
+#define PG_END_TESTS() printf("1..%d\n", pg_test_count); \
+ return EXIT_SUCCESS
+
+#define PG_BEGIN_TODO() pg_todo_count++
+#define PG_END_TODO() pg_todo_count--
+
+#endif
--
2.37.3
0002-meson-Add-infrastructure-for-TAP-tests-written-in-C.patchtext/x-patch; charset=US-ASCII; name=0002-meson-Add-infrastructure-for-TAP-tests-written-in-C.patchDownload
From a6227695f0e8c2effb5b62590e02726e2f8cae19 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 11 Oct 2022 10:49:15 -0700
Subject: [PATCH 02/10] meson: Add infrastructure for TAP tests written in C.
Allow t/XXX.c files to be used as t/XXX.pl files normally are.
Author: Andres Freund <andres@anarazel.de>
---
meson.build | 43 +++++++++++++++++++++++----
src/interfaces/libpq/test/meson.build | 8 ++---
src/tools/testwrap | 11 +++++--
3 files changed, 48 insertions(+), 14 deletions(-)
diff --git a/meson.build b/meson.build
index 2d225f706d..ada774da84 100644
--- a/meson.build
+++ b/meson.build
@@ -2560,6 +2560,10 @@ default_bin_args = default_target_args + {
'install_rpath': ':'.join(bin_install_rpaths),
}
+test_bin_args = default_target_args + {
+ 'install': false,
+}
+
# Helper for exporting a limited number of symbols
@@ -2919,7 +2923,14 @@ foreach test_dir : tests
endif
t = test_dir[kind]
+ env = test_env
+ # Add environment vars from the test specification
+ foreach name, value : t.get('env', {})
+ env.set(name, value)
+ endforeach
+
+ # pg_regress style tests
if kind in ['regress', 'isolation', 'ecpg']
if kind == 'regress'
runner = pg_regress
@@ -2953,7 +2964,6 @@ foreach test_dir : tests
test_command += t.get('sql', [])
endif
- env = test_env
env.prepend('PATH', temp_install_bindir, test_dir['bd'])
test_kwargs = {
@@ -2974,6 +2984,8 @@ foreach test_dir : tests
)
testport += 1
+
+ # perl tap tests
elif kind == 'tap'
if not tap_tests_enabled
continue
@@ -2987,13 +2999,8 @@ foreach test_dir : tests
# Add temporary install, the build directory for non-installed binaries and
# also test/ for non-installed test binaries built separately.
- env = test_env
env.prepend('PATH', temp_install_bindir, test_dir['bd'], test_dir['bd'] / 'test')
- foreach name, value : t.get('env', {})
- env.set(name, value)
- endforeach
-
test_kwargs = {
'protocol': 'tap',
'suite': [test_dir['name']],
@@ -3022,6 +3029,30 @@ foreach test_dir : tests
],
)
endforeach
+
+ # tap tests using C executables
+ elif kind == 'exec_tap'
+ # Don't have a dependency on perl tap test infrastructure, so we can
+ # test even when not tap_test_enabled.
+ test_kwargs = {
+ 'protocol': 'tap',
+ 'suite': [test_dir['name']],
+ 'timeout': 1000,
+ 'env': env,
+ } + t.get('test_kwargs', {})
+
+ foreach onetap : t['tests']
+ test_name = onetap.name()
+ test(test_dir['name'] / test_name,
+ python,
+ kwargs: test_kwargs,
+ depends: test_deps + t.get('deps', []) + [onetap],
+ args: testwrap_base + [
+ '--testname', test_name,
+ '--', onetap.full_path(),
+ ],
+ )
+ endforeach
else
error('unknown kind @0@ of test in @1@'.format(kind, test_dir['sd']))
endif
diff --git a/src/interfaces/libpq/test/meson.build b/src/interfaces/libpq/test/meson.build
index 017f729d43..06e026e400 100644
--- a/src/interfaces/libpq/test/meson.build
+++ b/src/interfaces/libpq/test/meson.build
@@ -11,9 +11,7 @@ endif
executable('libpq_uri_regress',
libpq_uri_regress_sources,
dependencies: [frontend_code, libpq],
- kwargs: default_bin_args + {
- 'install': false,
- }
+ kwargs: test_bin_args,
)
@@ -30,7 +28,5 @@ endif
executable('libpq_testclient',
libpq_testclient_sources,
dependencies: [frontend_code, libpq],
- kwargs: default_bin_args + {
- 'install': false,
- }
+ kwargs: default_bin_args,
)
diff --git a/src/tools/testwrap b/src/tools/testwrap
index 7a64fe76a2..5ca49b1fac 100755
--- a/src/tools/testwrap
+++ b/src/tools/testwrap
@@ -32,9 +32,16 @@ os.chdir(args.srcdir)
# mark test as having started
open(os.path.join(testdir, 'test.start'), 'x')
+testdatadir = os.path.join(testdir, 'data')
+testlogdir = os.path.join(testdir, 'log')
+
+# pre-create dirs so that the test's infrastructure doesn't have to
+os.makedirs(testdatadir)
+os.makedirs(testlogdir)
+
env_dict = {**os.environ,
- 'TESTDATADIR': os.path.join(testdir, 'data'),
- 'TESTLOGDIR': os.path.join(testdir, 'log')}
+ 'TESTDATADIR': testdatadir,
+ 'TESTLOGDIR': testlogdir}
sp = subprocess.run(args.test_command, env=env_dict)
--
2.37.3
0003-Fix-symlink-errno-in-Windows-replacement-code.patchtext/x-patch; charset=US-ASCII; name=0003-Fix-symlink-errno-in-Windows-replacement-code.patchDownload
From 309577d0c40a12dfe2c3a19cca322f69a81bce45 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Wed, 12 Oct 2022 16:34:09 +1300
Subject: [PATCH 03/10] Fix symlink() errno in Windows replacement code.
Ancient bug noticed while adding tests for these functions.
---
src/port/dirmod.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index ae6301dd6c..51c9bded8f 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -197,7 +197,10 @@ pgsymlink(const char *oldpath, const char *newpath)
FILE_FLAG_OPEN_REPARSE_POINT | FILE_FLAG_BACKUP_SEMANTICS, 0);
if (dirhandle == INVALID_HANDLE_VALUE)
+ {
+ _dosmaperr(GetLastError());
return -1;
+ }
/* make sure we have an unparsed native win32 path */
if (memcmp("\\??\\", oldpath, 4) != 0)
@@ -230,8 +233,11 @@ pgsymlink(const char *oldpath, const char *newpath)
0, 0, &len, 0))
{
LPSTR msg;
+ int save_errno;
+
+ _dosmaperr(GetLastError());
+ save_errno = errno;
- errno = 0;
FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER |
FORMAT_MESSAGE_IGNORE_INSERTS |
FORMAT_MESSAGE_FROM_SYSTEM,
@@ -251,6 +257,9 @@ pgsymlink(const char *oldpath, const char *newpath)
CloseHandle(dirhandle);
RemoveDirectory(newpath);
+
+ errno = save_errno;
+
return -1;
}
--
2.37.3
0004-Fix-readlink-return-value-on-Windows.patchtext/x-patch; charset=US-ASCII; name=0004-Fix-readlink-return-value-on-Windows.patchDownload
From eda707444682c138f7b182c94198ae04bb8b84bd Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Wed, 12 Oct 2022 17:22:53 +1300
Subject: [PATCH 04/10] Fix readlink() return value on Windows.
It accidentally returned a value that counted the trailing nul
terminator.
This is an ancient bug, but it's benign.
---
src/port/dirmod.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index 51c9bded8f..5881024929 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -359,6 +359,9 @@ pgreadlink(const char *path, char *buf, size_t size)
return -1;
}
+ /* r includes the nul terminator */
+ r -= 1;
+
/*
* If the path starts with "\??\", which it will do in most (all?) cases,
* strip those out.
--
2.37.3
0005-Add-tests-for-Windows-filesystem-code-in-src-port.patchtext/x-patch; charset=US-ASCII; name=0005-Add-tests-for-Windows-filesystem-code-in-src-port.patchDownload
From ae795596e02fe4dd8a52dfc11bd026f4b1a5e343 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Tue, 11 Oct 2022 17:26:02 +1300
Subject: [PATCH 05/10] Add tests for Windows filesystem code in src/port.
The wrappers we use for open(), unlink() etc had no direct testing.
These tests demonstrate the behavior of our Windows porting layer, with
two variants of underlying Windows behavior that are known to exist in
current versions of the operating system.
Since some of the functions contain internal sleep/retry loops, we need
a way to guarantee deterministic tests and perform asynchronous tasks,
so we also make the loop counts adjustable at runtime (only for use by
tests) and make pg_usleep() interruptable by Windows APC.
---
meson.build | 1 +
src/include/port.h | 2 +
src/port/dirmod.c | 7 +-
src/port/open.c | 5 +-
src/port/pgsleep.c | 2 +-
src/port/t/001_filesystem.c | 828 ++++++++++++++++++++++++++++++++++++
src/port/t/meson.build | 15 +
7 files changed, 856 insertions(+), 4 deletions(-)
create mode 100644 src/port/t/001_filesystem.c
create mode 100644 src/port/t/meson.build
diff --git a/meson.build b/meson.build
index ada774da84..ca9b3ac43c 100644
--- a/meson.build
+++ b/meson.build
@@ -2777,6 +2777,7 @@ subdir('src')
subdir('contrib')
subdir('src/test')
+subdir('src/port/t')
subdir('src/interfaces/libpq/test')
subdir('src/interfaces/ecpg/test')
diff --git a/src/include/port.h b/src/include/port.h
index 69d8818d61..f81f8de2a3 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -269,6 +269,7 @@ extern int pclose_check(FILE *stream);
/*
* Win32 doesn't have reliable rename/unlink during concurrent access.
*/
+extern PGDLLEXPORT int pgwin32_dirmod_loops;
extern int pgrename(const char *from, const char *to);
extern int pgunlink(const char *path);
@@ -307,6 +308,7 @@ extern bool rmtree(const char *path, bool rmtopdir);
* passing of other special options.
*/
#define O_DIRECT 0x80000000
+extern PGDLLEXPORT int pgwin32_open_handle_loops;
extern HANDLE pgwin32_open_handle(const char *, int, bool);
extern int pgwin32_open(const char *, int,...);
extern FILE *pgwin32_fopen(const char *, const char *);
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index 5881024929..3e1c78fae6 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -41,6 +41,9 @@
#if defined(WIN32) || defined(__CYGWIN__)
+/* Externally visable only to allow testing. */
+int pgwin32_dirmod_loops = 100;
+
/*
* pgrename
*/
@@ -84,7 +87,7 @@ pgrename(const char *from, const char *to)
return -1;
#endif
- if (++loops > 100) /* time out after 10 sec */
+ if (++loops > pgwin32_dirmod_loops) /* time out after 10 sec */
return -1;
pg_usleep(100000); /* us */
}
@@ -137,7 +140,7 @@ pgunlink(const char *path)
{
if (errno != EACCES)
return -1;
- if (++loops > 100) /* time out after 10 sec */
+ if (++loops > pgwin32_dirmod_loops) /* time out after 10 sec */
return -1;
pg_usleep(100000); /* us */
}
diff --git a/src/port/open.c b/src/port/open.c
index fd4faf604e..279a0ac08d 100644
--- a/src/port/open.c
+++ b/src/port/open.c
@@ -25,6 +25,9 @@
#include <assert.h>
#include <sys/stat.h>
+/* This is exported only to support testing. */
+int pgwin32_open_handle_loops = 300;
+
static int
openFlagsToCreateFileFlags(int openFlags)
{
@@ -118,7 +121,7 @@ pgwin32_open_handle(const char *fileName, int fileFlags, bool backup_semantics)
errhint("You might have antivirus, backup, or similar software interfering with the database system.")));
#endif
- if (loops < 300)
+ if (loops < pgwin32_open_handle_loops)
{
pg_usleep(100000);
loops++;
diff --git a/src/port/pgsleep.c b/src/port/pgsleep.c
index 03f0fac07b..9a37975c5d 100644
--- a/src/port/pgsleep.c
+++ b/src/port/pgsleep.c
@@ -53,7 +53,7 @@ pg_usleep(long microsec)
delay.tv_usec = microsec % 1000000L;
(void) select(0, NULL, NULL, NULL, &delay);
#else
- SleepEx((microsec < 500 ? 1 : (microsec + 500) / 1000), FALSE);
+ SleepEx((microsec < 500 ? 1 : (microsec + 500) / 1000), TRUE);
#endif
}
}
diff --git a/src/port/t/001_filesystem.c b/src/port/t/001_filesystem.c
new file mode 100644
index 0000000000..c2832b0a20
--- /dev/null
+++ b/src/port/t/001_filesystem.c
@@ -0,0 +1,828 @@
+/*
+ * Tests for our partial implementations of POSIX-style filesystem APIs, for
+ * Windows.
+ */
+
+#include "c.h"
+
+#include <dirent.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/stat.h>
+
+#include "lib/pg_tap.h"
+#include "port/pg_iovec.h"
+
+/*
+ * Make a path under the tmp_check directory. TESTDATADIR is expected to
+ * contain an absolute path.
+ */
+static void
+make_path(char *buffer, const char *name)
+{
+ const char *directory;
+
+ directory = getenv("TESTDATADIR");
+ PG_REQUIRE(directory);
+
+ snprintf(buffer, MAXPGPATH, "%s/%s", directory, name);
+}
+
+/*
+ * Check if a directory contains a given entry, according to readdir().
+ */
+static bool
+directory_contains(const char *directory_path, const char *name)
+{
+ char directory_full_path[MAXPGPATH];
+ DIR *dir;
+ struct dirent *de;
+ bool saw_name = false;
+
+ make_path(directory_full_path, directory_path);
+ PG_REQUIRE_SYS((dir = opendir(directory_full_path)));
+ errno = 0;
+ while ((de = readdir(dir)))
+ {
+ if (strcmp(de->d_name, ".") == 0 ||
+ strcmp(de->d_name, "..") == 0)
+ continue;
+ if (strcmp(de->d_name, name) == 0)
+ saw_name = true;
+ }
+ PG_REQUIRE_SYS(errno == 0);
+ PG_REQUIRE_SYS(closedir(dir) == 0);
+
+ return saw_name;
+}
+
+#ifdef WIN32
+/*
+ * Make all slashes lean one way, to normalize paths for comparisons on Windows.
+ */
+static void
+normalize_slashes(char *path)
+{
+ while (*path)
+ {
+ if (*path == '/')
+ *path = '\\';
+ path++;
+ }
+}
+
+typedef struct apc_trampoline_data
+{
+ HANDLE timer;
+ void (*function) (void *);
+ void *data;
+} apc_trampoline_data;
+
+static void CALLBACK
+apc_trampoline(void *data, DWORD low, DWORD high)
+{
+ apc_trampoline_data *x = data;
+
+ x->function(x->data);
+ CloseHandle(x->timer);
+ free(data);
+}
+
+/*
+ * Schedule a Windows APC callback. The pg_usleep() call inside the
+ * sleep/retry loops in the wrappers under test is interruptible by APCs,
+ * causing it to run the procedure and return early.
+ */
+static void
+run_async_procedure_after_delay(void (*p) (void *), void *data, int milliseconds)
+{
+ LARGE_INTEGER delay_time = {.LowPart = -10 * milliseconds};
+ apc_trampoline_data *x;
+ HANDLE timer;
+
+ timer = CreateWaitableTimer(NULL, false, NULL);
+ PG_REQUIRE(timer);
+
+ x = malloc(sizeof(*x));
+ PG_REQUIRE_SYS(x);
+ x->timer = timer;
+ x->function = p;
+ x->data = data;
+
+ PG_REQUIRE(SetWaitableTimer(timer, &delay_time, 0, apc_trampoline, x, false));
+}
+
+static void
+close_fd(void *data)
+{
+ int *fd = data;
+
+ PG_REQUIRE_SYS(close(*fd) == 0);
+}
+
+static void
+close_handle(void *data)
+{
+ HANDLE handle = data;
+
+ PG_REQUIRE(CloseHandle(handle));
+}
+#endif
+
+/*
+ * Tests of directory entry manipulation.
+ */
+static void
+filesystem_metadata_tests(void)
+{
+ int fd;
+ int fd2;
+ char path[MAXPGPATH];
+ char path2[MAXPGPATH];
+ char path3[MAXPGPATH];
+ struct stat statbuf;
+ ssize_t size;
+ bool have_posix_unlink_semantics;
+#ifdef WIN32
+ HANDLE handle;
+#endif
+
+ /*
+ * Current versions of Windows 10 have "POSIX semantics" when unlinking
+ * files, meaning that unlink() and rename() remove the directory entry
+ * synchronously, just like Unix. At least some server OSes still seem to
+ * have the traditional Windows behavior, where directory entries remain
+ * in STATUS_DELETE_PENDING state, visible but unopenable, until all file
+ * handles are closed. We have a lot of code paths to deal with the older
+ * asynchronous behavior, and tests for those here. Before we go further,
+ * determine which behavior to expect. Behavior may also vary on non-NTFS
+ * filesystems.
+ */
+ make_path(path, "test.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_REQUIRE_SYS(fd >= 0);
+ PG_REQUIRE_SYS(unlink(path) == 0);
+ fd2 = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_REQUIRE_SYS(fd2 >= 0 || errno == EEXIST);
+ if (fd2 >= 0)
+ {
+ /* Expected behavior on Unix and some Windows 10 releases. */
+ PG_EXPECT_SYS(unlink(path) == 0, "POSIX unlink semantics detected: cleaning up");
+ have_posix_unlink_semantics = true;
+ PG_REQUIRE_SYS(close(fd2) == 0);
+ }
+ else
+ {
+ /* Our open wrapper fails with EEXIST for O_EXCL in this case. */
+ PG_EXPECT_SYS(errno == EEXIST,
+ "traditional Windows unlink semantics detected: O_EXCL -> EEXIST");
+ have_posix_unlink_semantics = false;
+ }
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* Set up test directory structure. */
+
+ make_path(path, "dir1");
+ PG_REQUIRE_SYS(mkdir(path, 0777) == 0);
+ make_path(path, "dir1/my_subdir");
+ PG_REQUIRE_SYS(mkdir(path, 0777) == 0);
+ make_path(path, "dir1/test.txt");
+ fd = open(path, O_CREAT | O_RDWR | PG_BINARY, 0777);
+ PG_REQUIRE_SYS(fd >= 0);
+ PG_REQUIRE_SYS(write(fd, "hello world\n", 12) == 12);
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* Tests for symlink()/readlink(). */
+
+ make_path(path, "dir999/my_symlink"); /* name of symlink to create */
+ make_path(path2, "dir1/my_subdir"); /* name of directory it will point to */
+ PG_EXPECT(symlink(path2, path) == -1, "symlink fails on missing parent");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1/my_symlink"); /* name of symlink to create */
+ make_path(path2, "dir1/my_subdir"); /* name of directory it will point to */
+ PG_EXPECT_SYS(symlink(path2, path) == 0, "create symlink");
+
+ size = readlink(path, path3, sizeof(path3));
+ PG_EXPECT_SYS(size != -1, "readlink succeeds");
+ PG_EXPECT_EQ(size, strlen(path2), "readlink reports expected size");
+ path3[size] = '\0';
+#ifdef WIN32
+ normalize_slashes(path2);
+ normalize_slashes(path3);
+#endif
+ PG_EXPECT_EQ_STR(path2, path3, "readlink reports expected target");
+
+ PG_EXPECT(readlink("does-not-exist", path3, sizeof(path3)) == -1, "readlink fails on missing path");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ /* Tests for fstat(). */
+
+ make_path(path, "dir1/test.txt");
+ fd = open(path, O_RDWR, 0777);
+ PG_REQUIRE_SYS(fd >= 0);
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(fstat(fd, &statbuf) == 0, "fstat regular file");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 1);
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* Tests for stat(). */
+
+ PG_EXPECT(stat("does-not-exist.txt", &statbuf) == -1, "stat missing file fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1/test.txt");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == 0, "stat regular file");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 1);
+
+ make_path(path, "dir1/my_subdir");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == 0, "stat directory");
+ PG_EXPECT(S_ISDIR(statbuf.st_mode));
+
+ make_path(path, "dir1/my_symlink");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == 0, "stat symlink");
+ PG_EXPECT(S_ISDIR(statbuf.st_mode));
+
+ /* Tests for lstat(). */
+
+ PG_EXPECT(stat("does-not-exist.txt", &statbuf) == -1, "lstat missing file fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1/test.txt");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat regular file");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 1);
+
+ make_path(path2, "dir1/my_subdir");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(lstat(path2, &statbuf) == 0, "lstat directory");
+ PG_EXPECT(S_ISDIR(statbuf.st_mode));
+
+ make_path(path, "dir1/my_symlink");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat symlink");
+ PG_EXPECT(S_ISLNK(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, strlen(path2), "got expected symlink size");
+
+ /* Tests for link() and unlink(). */
+
+ make_path(path, "does-not-exist-1");
+ make_path(path2, "does-not-exist-2");
+ PG_EXPECT(link(path, path2) == -1, "link missing file fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1/test.txt");
+ make_path(path2, "dir1/test2.txt");
+ PG_EXPECT_SYS(link(path, path2) == 0, "link succeeds");
+
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat link 1 succeeds");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 2);
+
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat link 2 succeeds");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 2);
+
+ PG_EXPECT_SYS(unlink(path2) == 0, "unlink succeeds");
+
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat link 1 succeeds");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 1);
+
+ PG_EXPECT_SYS(lstat(path2, &statbuf) == -1, "lstat link 2 fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ /*
+ * On Windows we have a special code-path to make unlink() work on
+ * junction points created by our symlink() wrapper, to match POSIX
+ * behavior.
+ */
+ make_path(path, "unlink_me_symlink");
+ make_path(path2, "dir1");
+ PG_EXPECT_SYS(symlink(path2, path) == 0, "create symlink");
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink symlink");
+
+#ifdef WIN32
+
+ /*
+ * But make sure that it doesn't work on plain old directories.
+ *
+ * POSIX doesn't specify whether unlink() works on a directory, so we
+ * don't check that on non-Windows. In practice almost all known systems
+ * fail with EPERM (POSIX systems) or EISDIR (non-standard error on
+ * Linux), though AIX/JFS1 is rumored to succeed. However, our Windows
+ * emulation doesn't allow it, because we want to avoid surprises by
+ * behaving like nearly all Unix systems. So we check this on Windows
+ * only, where it fails with non-standard EACCES.
+ */
+ PG_EXPECT_SYS(unlink(path2) == -1, "Windows: can't unlink() a directory");
+ PG_EXPECT_EQ(errno, EACCES);
+#endif
+
+#ifdef WIN32
+
+ /*
+ * Test that we automatically retry for a while if we get a sharing
+ * violation because external software does not use FILE_SHARE_* when
+ * opening our files. See similar open() test for explanation.
+ */
+
+ make_path(path, "name1.txt");
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ handle = CreateFile(path, GENERIC_WRITE | GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
+ PG_REQUIRE(handle);
+
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+ PG_EXPECT(unlink(path) == -1, "Windows: can't unlink file while non-shared handle exists");
+ PG_EXPECT_EQ(errno, EACCES);
+
+ pgwin32_dirmod_loops = 1800; /* loop for up to 180s to make sure our
+ * 100ms callback is run */
+ run_async_procedure_after_delay(close_handle, handle, 100); /* close handle after
+ * 100ms */
+ PG_EXPECT_SYS(unlink(path) == 0, "Windows: can rename file after non-shared handle asynchronously closed");
+#endif
+
+ /* Tests for rename(). */
+
+ make_path(path, "name1.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ make_path(path2, "name2.txt");
+ PG_EXPECT_SYS(rename(path, path2) == 0, "rename name1.txt -> name2.txt");
+
+ PG_EXPECT_EQ(open(path, O_RDWR | PG_BINARY, 0777), -1, "can't open name1.txt anymore");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ make_path(path2, "name2.txt");
+ PG_EXPECT_SYS(rename(path, path2) == 0, "rename name1.txt -> name2.txt, replacing it");
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ fd = open(path2, O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open name2.txt");
+ make_path(path2, "name2.txt");
+#ifdef WIN32
+
+ /*
+ * Windows can't rename over an open non-unlinked file, even with
+ * have_posix_unlink_semantics.
+ */
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+ PG_EXPECT_SYS(rename(path, path2) == -1,
+ "Windows: can't rename name1.txt -> name2.txt while name2.txt is open");
+ PG_EXPECT_EQ(errno, EACCES);
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink name1.txt");
+#else
+ PG_EXPECT_SYS(rename(path, path2) == 0,
+ "POSIX: can rename name1.txt -> name2.txt while name2.txt is open");
+#endif
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ make_path(path, "name1.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ /* leave open */
+ make_path(path2, "name2.txt");
+ PG_EXPECT_SYS(rename(path, path2) == 0,
+ "can rename name1.txt -> name2.txt while name1.txt is open");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ make_path(path, "name1.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ fd = open(path2, O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open name2.txt");
+ PG_EXPECT_SYS(unlink(path2) == 0, "unlink name2.txt while it is still open");
+
+ if (have_posix_unlink_semantics)
+ {
+ PG_EXPECT_SYS(rename(path, path2) == 0,
+ "POSIX: rename name1.txt -> name2.txt while unlinked file is still open");
+ }
+ else
+ {
+ PG_EXPECT_SYS(rename(path, path2) == -1,
+ "Windows non-POSIX: cannot rename name1.txt -> name2.txt while unlinked file is still open");
+ PG_EXPECT_EQ(errno, EACCES);
+ PG_REQUIRE_SYS(unlink(path) == 0);
+ }
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+#ifdef WIN32
+
+ /*
+ * Test that we automatically retry for a while if we get a sharing
+ * violation because external software does not use FILE_SHARE_* when
+ * opening our files. See similar open() test for explanation.
+ */
+
+ make_path(path, "name1.txt");
+ make_path(path2, "name2.txt");
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ handle = CreateFile(path, GENERIC_WRITE | GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
+ PG_REQUIRE(handle);
+
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+ PG_EXPECT(rename(path, path2) == -1, "Windows: can't rename file while non-shared handle exists for name1");
+ PG_EXPECT_EQ(errno, EACCES);
+
+ pgwin32_dirmod_loops = 1800; /* loop for up to 180s to make sure our
+ * 100ms callback is run */
+ run_async_procedure_after_delay(close_handle, handle, 100); /* close handle after
+ * 100ms */
+ PG_EXPECT_SYS(rename(path, path2) == 0, "Windows: can rename file after non-shared handle asynchronously closed");
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ handle = CreateFile(path2, GENERIC_WRITE | GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
+ PG_REQUIRE(handle);
+
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+ PG_EXPECT(rename(path, path2) == -1, "Windows: can't rename file while non-shared handle exists for name2");
+ PG_EXPECT_EQ(errno, EACCES);
+
+ pgwin32_dirmod_loops = 1800; /* loop for up to 180s to make sure our
+ * 100ms callback is run */
+ run_async_procedure_after_delay(close_handle, handle, 100); /* close handle after
+ * 100ms */
+ PG_EXPECT_SYS(rename(path, path2) == 0, "Windows: can rename file after non-shared handle asynchronously closed");
+
+ PG_REQUIRE_SYS(unlink(path2) == 0);
+#endif
+
+ /* Tests for open(). */
+
+ make_path(path, "test.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open(O_CREAT | O_EXCL) succeeds");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_EQ(fd, -1, "open(O_CREAT | O_EXCL) again fails");
+ PG_EXPECT_EQ(errno, EEXIST);
+
+ fd = open(path, O_CREAT | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open(O_CREAT) succeeds");
+
+ /*
+ * We can unlink a file that we still have an open handle for on Windows,
+ * because our open() wrapper used FILE_SHARE_DELETE.
+ */
+ PG_EXPECT_SYS(unlink(path) == 0);
+
+ /*
+ * On older Windows, our open wrapper on Windows will see
+ * STATUS_DELETE_PENDING and pretend it can't see the file. On newer
+ * Windows, it won't see the file. Either way matches expected POSIX
+ * behavior.
+ */
+ PG_EXPECT(open(path, O_RDWR | PG_BINARY, 0777) == -1);
+ PG_EXPECT(errno == ENOENT);
+
+ /*
+ * Things are trickier if you asked to create a file. ENOENT doesn't make
+ * sense as a error number to a program expecting POSIX behavior then, so
+ * our wrpaper uses EEXIST for lack of anything more appropriate.
+ */
+ fd2 = open(path, O_RDWR | PG_BINARY | O_CREAT, 0777);
+ if (have_posix_unlink_semantics)
+ {
+ PG_EXPECT_SYS(fd2 >= 0, "POSIX: create file with same name, while unlinked file is still open");
+ PG_EXPECT_SYS(close(fd2) == 0);
+ }
+ else
+ {
+ PG_EXPECT_SYS(fd2 == -1,
+ "Windows non-POSIX: cannot create file with same name, while unlinked file is still open");
+ PG_EXPECT_EQ(errno, EEXIST);
+ }
+
+ /* After closing the handle, Windows non-POSIX can now create a new file. */
+ PG_EXPECT_SYS(close(fd) == 0);
+ fd = open(path, O_RDWR | PG_BINARY | O_CREAT, 0777);
+ PG_EXPECT_SYS(fd >= 0,
+ "can create a file with recently unlinked name after closing");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+#ifdef WIN32
+
+ /*
+ * Even on systems new enough to have POSIX unlink behavior, the problem
+ * of sharing violations still applies.
+ *
+ * Our own open() wrapper is careful to use the FILE_SHARE_* flags to
+ * avoid the dreaded ERROR_SHARING_VIOLATION, but it's possible that
+ * another program might open one of our files without them. In that
+ * case, our open() wrapper will sleep and retry for a long time, in the
+ * hope that other program goes away. So let's open a handle with
+ * antisocial flags. We'll adjust the loop count via a side-hatch so we
+ * don't waste time in this test, though.
+ */
+ handle = CreateFile(path, GENERIC_WRITE | GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
+ PG_REQUIRE(handle);
+
+ pgwin32_open_handle_loops = 2; /* minimize retries so test is fast */
+ PG_EXPECT(open(path, O_RDWR | PG_BINARY, 0777) == -1, "Windows: can't open file while non-shared handle exists");
+ PG_EXPECT_EQ(errno, EACCES);
+
+ pgwin32_open_handle_loops = 18000; /* 180s must be long enough for our
+ * async callback to run */
+ run_async_procedure_after_delay(close_handle, handle, 100); /* close handle after
+ * 100ms */
+ fd = open(path, O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "Windows: can open file after non-shared handle asynchronously closed");
+ PG_REQUIRE_SYS(close(fd) == 0);
+#endif
+
+ PG_EXPECT_SYS(unlink(path) == 0);
+
+ /* Tests for opendir(), readdir(), closedir(). */
+ {
+ DIR *dir;
+ struct dirent *de;
+ int dot = -1;
+ int dotdot = -1;
+ int my_subdir = -1;
+ int my_symlink = -1;
+ int test_txt = -1;
+
+ make_path(path, "does-not-exist");
+ PG_EXPECT(opendir(path) == NULL, "open missing directory fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1");
+ PG_EXPECT_SYS((dir = opendir(path)), "open directory");
+
+#ifdef DT_REG
+/*
+ * Linux, *BSD, macOS and our Windows wrappers have BSD d_type. On a few rare
+ * file systems, it may be reported as DT_UNKNOWN so we have to tolerate that too.
+ */
+#define LOAD(name, variable) if (strcmp(de->d_name, name) == 0) variable = de->d_type
+#define CHECK(name, variable, type) \
+ PG_EXPECT(variable != -1, name " was found"); \
+ PG_EXPECT(variable == DT_UNKNOWN || variable == type, name " has type DT_UNKNOWN or " #type)
+#else
+/*
+ * Solaris, AIX and Cygwin do not have it (it's not in POSIX). Just check that
+ * we saw the file and ignore the type.
+ */
+#define LOAD(name, variable) if (strcmp(de->d_name, name) == 0) variable = 0
+#define CHECK(name, variable, type) \
+ PG_EXPECT(variable != -1, name " was found")
+#endif
+
+ /* Load and check in two phases because the order is unknown. */
+ errno = 0;
+ while ((de = readdir(dir)))
+ {
+ LOAD(".", dot);
+ LOAD("..", dotdot);
+ LOAD("my_subdir", my_subdir);
+ LOAD("my_symlink", my_symlink);
+ LOAD("test.txt", test_txt);
+ }
+ PG_EXPECT_SYS(errno == 0, "ran out of dirents without error");
+
+ CHECK(".", dot, DT_DIR);
+ CHECK("..", dotdot, DT_DIR);
+ CHECK("my_subdir", my_subdir, DT_DIR);
+ CHECK("my_symlink", my_symlink, DT_LNK);
+ CHECK("test.txt", test_txt, DT_REG);
+
+#undef LOAD
+#undef CHECK
+ }
+
+ /*
+ * On older Windows, readdir() sees entries that are in
+ * STATUS_DELETE_PENDING state, and they prevent the directory itself from
+ * being unlinked. Demonstrate that difference here.
+ */
+ make_path(path, "dir2");
+ PG_REQUIRE_SYS(mkdir(path, 0777) == 0);
+ make_path(path, "dir2/test.txt");
+ fd = open(path, O_CREAT | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file");
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink file while still open");
+ if (have_posix_unlink_semantics)
+ PG_EXPECT(!directory_contains("dir2", "test.txt"), "POSIX: readdir doesn't see it before closing");
+ else
+ PG_EXPECT(directory_contains("dir2", "test.txt"), "Windows non-POSIX: readdir still sees it before closing");
+ PG_EXPECT_SYS(close(fd) == 0);
+ PG_EXPECT(!directory_contains("dir2", "test.txt"), "readdir doesn't see it after closing");
+}
+
+/*
+ * Tests for pread, pwrite, and vector variants.
+ */
+static void
+filesystem_io_tests(void)
+{
+ int fd;
+ char path[MAXPGPATH];
+ char buffer[80];
+ char buffer2[80];
+ struct iovec iov[8];
+
+ /*
+ * These are our own wrappers on Windows. On Unix, they're just macros
+ * for system APIs, except on Solaris which shares the pg_{read,write}v
+ * replacements used on Windows.
+ */
+
+ make_path(path, "io_test_file.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "create a new file");
+
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "initial file position is zero");
+
+ PG_EXPECT_EQ(pg_pwrite(fd, "llo", 3, 2), 3);
+ PG_EXPECT_EQ(pg_pwrite(fd, "he", 2, 0), 2);
+
+ /*
+ * On Windows, the current position moves, which is why we have the pg_
+ * prefixes as a warning to users. Demonstrate that difference here.
+ */
+#ifdef WIN32
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 2, "Windows: file position moved");
+#else
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "POSIX: file position didn't move");
+#endif
+
+ memset(buffer, 0, sizeof(buffer));
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 2, 3), 2);
+ PG_EXPECT(memcmp(buffer, "lo", 2) == 0);
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 3, 0), 3);
+ PG_EXPECT(memcmp(buffer, "hel", 3) == 0);
+
+#ifdef WIN32
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 3, "Windows: file position moved");
+#else
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "POSIX: file position didn't move");
+#endif
+
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 80, 0), 5, "pg_pread short read");
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 80, 5), 0, "pg_pread EOF");
+
+ iov[0].iov_base = "wo";
+ iov[0].iov_len = 2;
+ iov[1].iov_base = "rld";
+ iov[1].iov_len = 3;
+ PG_EXPECT_EQ(pg_pwritev(fd, iov, 2, 5), 5);
+
+#ifdef WIN32
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 10, "Windows: file position moved");
+#else
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "POSIX: file position didn't move");
+#endif
+
+ memset(buffer, 0, sizeof(buffer));
+ memset(buffer2, 0, sizeof(buffer2));
+ iov[0].iov_base = buffer;
+ iov[0].iov_len = 4;
+ iov[1].iov_base = buffer2;
+ iov[1].iov_len = 4;
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 1), 8);
+
+#ifdef WIN32
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 9, "Windows: file position moved");
+#else
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "POSIX: file position didn't move");
+#endif
+
+ PG_EXPECT(memcmp(buffer, "ello", 4) == 0);
+ PG_EXPECT(memcmp(buffer2, "worl", 4) == 0);
+
+ memset(buffer, 0, sizeof(buffer));
+ memset(buffer2, 0, sizeof(buffer2));
+ iov[0].iov_base = buffer;
+ iov[0].iov_len = 1;
+ iov[1].iov_base = buffer2;
+ iov[1].iov_len = 80;
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 8), 2);
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 9), 1);
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 10), 0);
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 11), 0);
+
+ memset(buffer, 0, sizeof(buffer));
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 10, 0), 10);
+ PG_EXPECT_EQ_STR(buffer, "helloworld");
+
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ /* Demonstrate the effects of "text" mode (no PG_BINARY flag). */
+
+ /* Write out a message in text mode. */
+ fd = open(path, O_RDWR, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file in text mode");
+ PG_EXPECT_EQ(write(fd, "hello world\n", 12), 12);
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ /* Read it back in binary mode, to reveal the translation. */
+ fd = open(path, O_RDWR | PG_BINARY, 0777);
+#ifdef WIN32
+ PG_EXPECT_EQ(read(fd, buffer, sizeof(buffer)), 13,
+ "Windows: \\n was translated");
+ PG_EXPECT(memcmp(buffer, "hello world\r\n", 13) == 0,
+ "Windows: \\n was translated to \\r\\n");
+#else
+ PG_EXPECT_EQ(read(fd, buffer, sizeof(buffer)), 12,
+ "POSIX: \\n was not translated");
+ PG_EXPECT(memcmp(buffer, "hello world\n", 12) == 0,
+ "POSIX: \\n is \\n");
+#endif
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ /* The opposite translation happens in text mode, hiding it. */
+ fd = open(path, O_RDWR, 0777);
+ PG_EXPECT_EQ(read(fd, buffer, sizeof(buffer)), 12);
+ PG_EXPECT(memcmp(buffer, "hello world\n", 12) == 0);
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ PG_REQUIRE_SYS(unlink(path) == 0);
+
+ /*
+ * Write out a message in text mode, this time with pg_pwrite(), which
+ * does not suffer from newline translation.
+ */
+ fd = open(path, O_RDWR | O_CREAT, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file in text mode");
+ PG_EXPECT_EQ(pg_pwrite(fd, "hello world\n", 12, 0), 12);
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ /* Read it back in binary mode to verify that. */
+ fd = open(path, O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_EQ(pg_pread(fd, buffer, sizeof(buffer), 0), 12,
+ "\\n was not translated by pg_pread()");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ PG_REQUIRE_SYS(unlink(path) == 0);
+}
+
+/*
+ * Exercise fsync and fdatasync.
+ */
+static void
+filesystem_sync_tests(void)
+{
+ int fd;
+ char path[MAXPGPATH];
+
+ make_path(path, "sync_me.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_REQUIRE_SYS(fd >= 0);
+
+ PG_REQUIRE_SYS(write(fd, "x", 1) == 1);
+ PG_EXPECT_SYS(fsync(fd) == 0);
+
+ PG_REQUIRE_SYS(write(fd, "x", 1) == 1);
+ PG_EXPECT_SYS(fdatasync(fd) == 0);
+
+ PG_REQUIRE_SYS(unlink(path) == 0);
+}
+
+int
+main()
+{
+ PG_BEGIN_TESTS();
+
+ filesystem_metadata_tests();
+ filesystem_io_tests();
+ filesystem_sync_tests();
+
+ PG_END_TESTS();
+}
diff --git a/src/port/t/meson.build b/src/port/t/meson.build
new file mode 100644
index 0000000000..bc5c7fc0b8
--- /dev/null
+++ b/src/port/t/meson.build
@@ -0,0 +1,15 @@
+port_tests = []
+port_tests += executable('001_filesystem',
+ ['001_filesystem.c'],
+ dependencies: [frontend_code],
+ kwargs: test_bin_args,
+)
+
+tests += {
+ 'name': 'port',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'exec_tap': {
+ 'tests': port_tests,
+ },
+}
--
2.37.3
0006-Fix-lstat-on-broken-junction-points.patchtext/x-patch; charset=US-ASCII; name=0006-Fix-lstat-on-broken-junction-points.patchDownload
From b61a66f874d3b8151158e7956a3272e9259a8941 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sun, 16 Oct 2022 12:07:33 +1300
Subject: [PATCH 06/10] Fix lstat() on broken junction points.
When using junction points to emulate symlinks on Windows, one edge case
was not handled correctly by commit c5cb8f3b: if a junction point is
broken (pointing to a non-existent path), we'd report ENOENT. This
doesn't break any known use case, but was noticed while testing and is
fixed here for completeness.
Also add translation ERROR_CANT_RESOLVE_FILENAME -> ENOENT, as that is
one of the errors Windows can report depending on format of the broken
path.
---
src/port/t/001_filesystem.c | 7 +++++++
src/port/win32error.c | 3 +++
src/port/win32stat.c | 27 ++++++++++++++++++++++-----
3 files changed, 32 insertions(+), 5 deletions(-)
diff --git a/src/port/t/001_filesystem.c b/src/port/t/001_filesystem.c
index c2832b0a20..28ef069ee0 100644
--- a/src/port/t/001_filesystem.c
+++ b/src/port/t/001_filesystem.c
@@ -273,6 +273,13 @@ filesystem_metadata_tests(void)
PG_EXPECT(S_ISLNK(statbuf.st_mode));
PG_EXPECT_EQ(statbuf.st_size, strlen(path2), "got expected symlink size");
+ make_path(path, "broken-symlink");
+ make_path(path2, "does-not-exist");
+ PG_EXPECT_SYS(symlink(path2, path) == 0, "make a broken symlink");
+ PG_EXPECT_SYS(lstat(path, &statbuf) == 0, "lstat broken symlink");
+ PG_EXPECT(S_ISLNK(statbuf.st_mode));
+ PG_EXPECT_SYS(unlink(path) == 0);
+
/* Tests for link() and unlink(). */
make_path(path, "does-not-exist-1");
diff --git a/src/port/win32error.c b/src/port/win32error.c
index a78d323827..67ce805d77 100644
--- a/src/port/win32error.c
+++ b/src/port/win32error.c
@@ -167,6 +167,9 @@ static const struct
},
{
ERROR_INVALID_NAME, ENOENT
+ },
+ {
+ ERROR_CANT_RESOLVE_FILENAME, ENOENT
}
};
diff --git a/src/port/win32stat.c b/src/port/win32stat.c
index 5f3d0d22ff..ce8d87093d 100644
--- a/src/port/win32stat.c
+++ b/src/port/win32stat.c
@@ -125,15 +125,30 @@ _pglstat64(const char *name, struct stat *buf)
hFile = pgwin32_open_handle(name, O_RDONLY, true);
if (hFile == INVALID_HANDLE_VALUE)
- return -1;
-
- ret = fileinfo_to_stat(hFile, buf);
+ {
+ if (errno == ENOENT)
+ {
+ /*
+ * If it's a junction point pointing to a non-existent path, we'll
+ * have ENOENT here (because pgwin32_open_handle does not use
+ * FILE_FLAG_OPEN_REPARSE_POINT). In that case, we'll try again
+ * with readlink() below, which will distinguish true ENOENT from
+ * pseudo-symlink.
+ */
+ memset(buf, 0, sizeof(*buf));
+ ret = 0;
+ }
+ else
+ return -1;
+ }
+ else
+ ret = fileinfo_to_stat(hFile, buf);
/*
* Junction points appear as directories to fileinfo_to_stat(), so we'll
* need to do a bit more work to distinguish them.
*/
- if (ret == 0 && S_ISDIR(buf->st_mode))
+ if ((ret == 0 && S_ISDIR(buf->st_mode)) || hFile == INVALID_HANDLE_VALUE)
{
char next[MAXPGPATH];
ssize_t size;
@@ -169,10 +184,12 @@ _pglstat64(const char *name, struct stat *buf)
buf->st_mode &= ~S_IFDIR;
buf->st_mode |= S_IFLNK;
buf->st_size = size;
+ ret = 0;
}
}
- CloseHandle(hFile);
+ if (hFile != INVALID_HANDLE_VALUE)
+ CloseHandle(hFile);
return ret;
}
--
2.37.3
0007-Fix-readlink-for-non-PostgreSQL-created-junction-poi.patchtext/x-patch; charset=US-ASCII; name=0007-Fix-readlink-for-non-PostgreSQL-created-junction-poi.patchDownload
From 664fa8d965151b4cbf1e0408b2944cb8b7d5bc22 Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmunro@postgresql.org>
Date: Thu, 11 Aug 2022 10:42:13 +1200
Subject: [PATCH 07/10] Fix readlink() for non-PostgreSQL-created junction
points.
Our symlink() and readlink() replacements perform a naive transformation
from DOS paths to NT paths and back, as required by the junction point
APIs. This was enough for the "drive absolute" paths we expect users to
provide for tablespaces, for example "d:\my\fast\storage".
Since commit c5cb8f3b taught stat() to follow symlinks, and since initdb
uses pg_mkdir_p(), and that examines parent directories, our humble
readlink() implementation can now be exposed to junction points not of
PostgreSQL origin. Those might be corrupted by our naive path mangling,
which doesn't really understand NT paths in general.
Simply decline to transform paths that don't look like a drive absolute
path. That means that readlink() returns the NT path directly when
checking a parent directory of PGDATA that happen to point to a drive
using "rooted" format. That works for the purposes of our stat()
emulation.
Reported-by: Roman Zharkov <r.zharkov@postgrespro.ru>
Reviewed-by: Roman Zharkov <r.zharkov@postgrespro.ru>
Discussion: https://postgr.es/m/4590c37927d7b8ee84f9855d83229018%40postgrespro.ru
---
src/port/dirmod.c | 17 ++++++++++---
src/port/t/001_filesystem.c | 51 +++++++++++++++++++++++++++++++++++++
2 files changed, 65 insertions(+), 3 deletions(-)
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index 3e1c78fae6..f7fd4c15ad 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -366,10 +366,21 @@ pgreadlink(const char *path, char *buf, size_t size)
r -= 1;
/*
- * If the path starts with "\??\", which it will do in most (all?) cases,
- * strip those out.
+ * If the path starts with "\??\" followed by a "drive absolute" path
+ * (known to Windows APIs as RtlPathTypeDriveAbsolute), then strip that
+ * prefix. This undoes some of the transformation performed by
+ * pqsymlink(), to get back to a format that users are used to seeing. We
+ * don't know how to transform other path types that might be encountered
+ * outside PGDATA, so we just return them directly.
*/
- if (r > 4 && strncmp(buf, "\\??\\", 4) == 0)
+ if (r >= 7 &&
+ buf[0] == '\\' &&
+ buf[1] == '?' &&
+ buf[2] == '?' &&
+ buf[3] == '\\' &&
+ isalpha(buf[4]) &&
+ buf[5] == ':' &&
+ buf[6] == '\\')
{
memmove(buf, buf + 4, strlen(buf + 4) + 1);
r -= 4;
diff --git a/src/port/t/001_filesystem.c b/src/port/t/001_filesystem.c
index 28ef069ee0..4178b8e1f2 100644
--- a/src/port/t/001_filesystem.c
+++ b/src/port/t/001_filesystem.c
@@ -216,6 +216,57 @@ filesystem_metadata_tests(void)
PG_EXPECT(readlink("does-not-exist", path3, sizeof(path3)) == -1, "readlink fails on missing path");
PG_EXPECT_EQ(errno, ENOENT);
+ /*
+ * Checks that we don't corrupt non-drive-absolute paths when peforming
+ * internal conversions.
+ */
+
+ /*
+ * Typical case: Windows drive absolute. This should also be accepted on
+ * POSIX systems, because they are required not to validate the target
+ * string as a path.
+ */
+ make_path(path2, "my_symlink");
+ PG_EXPECT_SYS(symlink("c:\\foo", path2) == 0);
+ size = readlink(path2, path3, sizeof(path3));
+ PG_EXPECT_SYS(size != -1, "readlink succeeds");
+ PG_EXPECT_EQ(size, 6);
+ path3[size] = '\0';
+ PG_EXPECT_EQ_STR(path3, "c:\\foo");
+ PG_EXPECT_SYS(unlink(path2) == 0);
+
+ /*
+ * Drive absolute given in full NT format will be stripped on round-trip
+ * through our Windows emulations.
+ */
+ make_path(path2, "my_symlink");
+ PG_EXPECT_SYS(symlink("\\??\\c:\\foo", path2) == 0);
+ size = readlink(path2, path3, sizeof(path3));
+ PG_EXPECT_SYS(size != -1, "readlink succeeds");
+ path3[size] = '\0';
+#ifdef WIN32
+ PG_EXPECT_EQ(size, 6);
+ PG_EXPECT_EQ_STR(path3, "c:\\foo");
+#else
+ PG_EXPECT_EQ(size, 10);
+ PG_EXPECT_EQ_STR(path3, "\\??\\c:\\foo");
+#endif
+ PG_EXPECT_SYS(unlink(path2) == 0);
+
+ /*
+ * Anything that doesn't look like the NT pattern that symlink() creates
+ * will be returned verbatim. This will allow our stat() to handle paths
+ * that were not created by symlink().
+ */
+ make_path(path2, "my_symlink");
+ PG_EXPECT_SYS(symlink("\\??\\Volume1234", path2) == 0);
+ size = readlink(path2, path3, sizeof(path3));
+ PG_EXPECT_SYS(size != -1, "readlink succeeds");
+ PG_EXPECT_EQ(size, 14);
+ path3[size] = '\0';
+ PG_EXPECT_EQ_STR(path3, "\\??\\Volume1234");
+ PG_EXPECT_SYS(unlink(path2) == 0);
+
/* Tests for fstat(). */
make_path(path, "dir1/test.txt");
--
2.37.3
0008-Fix-stat-for-recursive-junction-points-on-Windows.patchtext/x-patch; charset=US-ASCII; name=0008-Fix-stat-for-recursive-junction-points-on-Windows.patchDownload
From d76d6152eaf323005829e1856915b1969c3a93b2 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Thu, 11 Aug 2022 12:30:48 +1200
Subject: [PATCH 08/10] Fix stat() for recursive junction points on Windows.
Commit c5cb8f3b supposed that we'd only ever have to follow one junction
point in stat(), because we don't construct longer chains of them ourselves.
When examining a parent directory supplied by the user, we should really be
able to cope with longer chains, just in case someone has their system
set up that way. Choose an arbitrary cap of 8, to match the minimum
acceptable value of SYMLOOP_MAX in POSIX.
Previously I'd avoided reporting ELOOP thinking Windows didn't have it,
but it turns out that it does.
Reviewed-by: Roman Zharkov <r.zharkov@postgrespro.ru>
Discussion: https://postgr.es/m/CA%2BhUKGJ7JDGWYFt9%3D-TyJiRRy5q9TtPfqeKkneWDr1XPU1%2Biqw%40mail.gmail.com
---
src/port/t/001_filesystem.c | 55 +++++++++++++++++++++++++++++++++++++
src/port/win32stat.c | 26 +++++++++---------
2 files changed, 68 insertions(+), 13 deletions(-)
diff --git a/src/port/t/001_filesystem.c b/src/port/t/001_filesystem.c
index 4178b8e1f2..501d7e1603 100644
--- a/src/port/t/001_filesystem.c
+++ b/src/port/t/001_filesystem.c
@@ -301,6 +301,61 @@ filesystem_metadata_tests(void)
PG_EXPECT(stat(path, &statbuf) == 0, "stat symlink");
PG_EXPECT(S_ISDIR(statbuf.st_mode));
+ /* Recursive symlinks. */
+ make_path(path, "dir1");
+ make_path(path2, "sym001");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym001 -> dir1");
+ make_path(path, "sym001");
+ make_path(path2, "sym002");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym002 -> sym001");
+ make_path(path, "sym002");
+ make_path(path2, "sym003");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym003 -> sym002");
+ make_path(path, "sym003");
+ make_path(path2, "sym004");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym004 -> sym003");
+ make_path(path, "sym004");
+ make_path(path2, "sym005");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym005 -> sym004");
+ make_path(path, "sym005");
+ make_path(path2, "sym006");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym006 -> sym005");
+ make_path(path, "sym006");
+ make_path(path2, "sym007");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym007 -> sym006");
+ make_path(path, "sym007");
+ make_path(path2, "sym008");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym008 -> sym007");
+ make_path(path, "sym008");
+ make_path(path2, "sym009");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym009 -> sym008");
+
+ /* POSIX says SYMLOOP_MAX should be at least 8. */
+ make_path(path, "sym008");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT_SYS(stat(path, &statbuf) == 0, "stat sym008");
+ PG_EXPECT(S_ISDIR(statbuf.st_mode));
+
+#ifdef WIN32
+
+ /*
+ * Test ELOOP failure in our Windows implementation of stat(), because we
+ * know it gives up after 8.
+ */
+ make_path(path, "sym009");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == -1, "Windows: stat sym009 fails");
+ PG_EXPECT_EQ(errno, ELOOP);
+#endif
+
+ /* If we break the chain we get ENOENT. */
+ make_path(path, "sym003");
+ PG_EXPECT_SYS(unlink(path) == 0);
+ make_path(path, "sym008");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == -1, "stat broken symlink chain fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
/* Tests for lstat(). */
PG_EXPECT(stat("does-not-exist.txt", &statbuf) == -1, "lstat missing file fails");
diff --git a/src/port/win32stat.c b/src/port/win32stat.c
index ce8d87093d..e6553e4030 100644
--- a/src/port/win32stat.c
+++ b/src/port/win32stat.c
@@ -199,23 +199,33 @@ _pglstat64(const char *name, struct stat *buf)
int
_pgstat64(const char *name, struct stat *buf)
{
+ int loops = 0;
int ret;
+ char curr[MAXPGPATH];
ret = _pglstat64(name, buf);
+ strlcpy(curr, name, MAXPGPATH);
+
/* Do we need to follow a symlink (junction point)? */
- if (ret == 0 && S_ISLNK(buf->st_mode))
+ while (ret == 0 && S_ISLNK(buf->st_mode))
{
char next[MAXPGPATH];
ssize_t size;
+ if (++loops > 8)
+ {
+ errno = ELOOP;
+ return -1;
+ }
+
/*
* _pglstat64() already called readlink() once to be able to fill in
* st_size, and now we need to do it again to get the path to follow.
* That could be optimized, but stat() on symlinks is probably rare
* and this way is simple.
*/
- size = readlink(name, next, sizeof(next));
+ size = readlink(curr, next, sizeof(next));
if (size < 0)
{
if (errno == EACCES &&
@@ -234,17 +244,7 @@ _pgstat64(const char *name, struct stat *buf)
next[size] = 0;
ret = _pglstat64(next, buf);
- if (ret == 0 && S_ISLNK(buf->st_mode))
- {
- /*
- * We're only prepared to go one hop, because we only expect to
- * deal with the simple cases that we create. The error for too
- * many symlinks is supposed to be ELOOP, but Windows hasn't got
- * it.
- */
- errno = EIO;
- return -1;
- }
+ strcpy(curr, next);
}
return ret;
--
2.37.3
0009-Fix-unlink-for-STATUS_DELETE_PENDING-on-Windows.patchtext/x-patch; charset=US-ASCII; name=0009-Fix-unlink-for-STATUS_DELETE_PENDING-on-Windows.patchDownload
From 033e4898a076e65162be3dcd60badaa40ba8f736 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Mon, 3 Oct 2022 09:15:02 +1300
Subject: [PATCH 09/10] Fix unlink() for STATUS_DELETE_PENDING on Windows.
Commit c5cb8f3b didn't handle STATUS_DELETE_PENDING correctly, and would
report ENOENT immediately without waiting. If we called unlink(name)
twice in a row and then rmdir(parent) while someone had an open handle,
previously the second call to unlink() would block until that handle
went away, or a 10 second timeout expired. This change resulted in a
test occasionally failing on CI, which still has an OS without POSIX
semantics for unlink. Restore.
Diagnosed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20220920013122.GA31833%40telsasoft.com
---
src/port/dirmod.c | 38 +++++++++++++++++++++++++++++++++++--
src/port/t/001_filesystem.c | 36 +++++++++++++++++++++++++++++++++++
2 files changed, 72 insertions(+), 2 deletions(-)
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index f7fd4c15ad..f4abde97bf 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -39,6 +39,10 @@
#endif
#endif
+#if defined(WIN32) && !defined(__CYGWIN__)
+#include "port/win32ntdll.h"
+#endif
+
#if defined(WIN32) || defined(__CYGWIN__)
/* Externally visable only to allow testing. */
@@ -94,6 +98,22 @@ pgrename(const char *from, const char *to)
return 0;
}
+/*
+ * Check if _pglstat64()'s reason for failure was STATUS_DELETE_PENDING.
+ * This doesn't apply to Cygwin, which has its own lstat() that would report
+ * the case as EACCES.
+*/
+static bool
+lstat_error_was_status_delete_pending(void)
+{
+ if (errno != ENOENT)
+ return false;
+#if defined(WIN32) && !defined(__CYGWIN__)
+ if (pg_RtlGetLastNtStatus() == STATUS_DELETE_PENDING)
+ return true;
+#endif
+ return false;
+}
/*
* pgunlink
@@ -101,6 +121,7 @@ pgrename(const char *from, const char *to)
int
pgunlink(const char *path)
{
+ bool is_lnk;
int loops = 0;
struct stat st;
@@ -125,9 +146,22 @@ pgunlink(const char *path)
* due to sharing violations, but that seems unlikely. We could perhaps
* prevent that by holding a file handle ourselves across the lstat() and
* the retry loop, but that seems like over-engineering for now.
+ *
+ * In the special case of a STATUS_DELETE_PENDING error (file already
+ * unlinked, but someone still has it open), we don't want to report ENOENT
+ * to the caller immediately, because rmdir(parent) would probably fail.
+ * We want to wait until the file truly goes away so that simple recursive
+ * directory unlink algorithms work.
*/
if (lstat(path, &st) < 0)
- return -1;
+ {
+ if (lstat_error_was_status_delete_pending())
+ is_lnk = false;
+ else
+ return -1;
+ }
+ else
+ is_lnk = S_ISLNK(st.st_mode);
/*
* We need to loop because even though PostgreSQL uses flags that allow
@@ -136,7 +170,7 @@ pgunlink(const char *path)
* someone else to close the file, as the caller might be holding locks
* and blocking other backends.
*/
- while ((S_ISLNK(st.st_mode) ? rmdir(path) : unlink(path)) < 0)
+ while ((is_lnk ? rmdir(path) : unlink(path)) < 0)
{
if (errno != EACCES)
return -1;
diff --git a/src/port/t/001_filesystem.c b/src/port/t/001_filesystem.c
index 501d7e1603..346978783a 100644
--- a/src/port/t/001_filesystem.c
+++ b/src/port/t/001_filesystem.c
@@ -472,6 +472,42 @@ filesystem_metadata_tests(void)
PG_EXPECT_SYS(unlink(path) == 0, "Windows: can rename file after non-shared handle asynchronously closed");
#endif
+ /*
+ * Our Windows unlink() wrapper blocks in a retry loop if you try to
+ * unlink a file in STATUS_DELETE_PENDING (ie that has already been
+ * unlinked but is still open), until it times out with EACCES or reaches
+ * ENOENT. That may be useful for waiting for files to be asynchronously
+ * unlinked while performing a recursive unlink on
+ * !have_posix_unlink_semantics systems, so that rmdir(parent) works.
+ */
+ make_path(path, "dir2");
+ PG_EXPECT_SYS(mkdir(path, 0777) == 0);
+ make_path(path, "dir2/test-file");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open dir2/test-file");
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink file while it's open, once");
+#ifdef WIN32
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+#endif
+ PG_EXPECT(unlink(path) == -1, "can't unlink again");
+ if (have_posix_unlink_semantics)
+ {
+ PG_EXPECT_EQ(errno, ENOENT, "POSIX: we expect ENOENT");
+ }
+ else
+ {
+ PG_EXPECT_EQ(errno, EACCES, "Windows non-POSIX: we expect EACCES (delete already pending)");
+#ifdef WIN32
+ pgwin32_dirmod_loops = 1800; /* loop for up to 180s to make sure
+ * our 100ms callback is run */
+ run_async_procedure_after_delay(close_fd, &fd, 100); /* close fd after 100ms */
+#endif
+ PG_EXPECT(unlink(path) == -1, "Windows non-POSIX: trying again fails");
+ PG_EXPECT_EQ(errno, ENOENT, "Windows non-POSIX: ... but blocked until ENOENT was reached due to asynchronous close");
+ }
+ make_path(path, "dir2");
+ PG_EXPECT_SYS(rmdir(path) == 0, "now we can remove the directory");
+
/* Tests for rename(). */
make_path(path, "name1.txt");
--
2.37.3
0010-Use-POSIX-semantics-for-unlink-and-rename-on-Windows.patchtext/x-patch; charset=US-ASCII; name=0010-Use-POSIX-semantics-for-unlink-and-rename-on-Windows.patchDownload
From ac3518414215b08ed49be797bd6c57efbe36f07f Mon Sep 17 00:00:00 2001
From: Thomas <thomas.munro@gmail.com>
Date: Mon, 17 Oct 2022 22:41:18 -0700
Subject: [PATCH 10/10] Use POSIX semantics for unlink() and rename() on
Windows.
Use SetInformationByHandle() directly to implement unlink and rename
with POSIX semantics.
XXX Does this work on all relevant filesystems, and if not, how does it
fail?
Author: Victor Spirin <v.spirin@postgrespro.ru>
Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/a529b660-da15-5b62-21a0-9936768210fd%40postgrespro.ru
---
src/port/dirmod.c | 163 +++++++++++++++++++++++-------------
src/port/t/001_filesystem.c | 31 +++----
2 files changed, 121 insertions(+), 73 deletions(-)
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index f4abde97bf..b96b0dc943 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -39,8 +39,13 @@
#endif
#endif
-#if defined(WIN32) && !defined(__CYGWIN__)
-#include "port/win32ntdll.h"
+#if defined(WIN32)
+/*
+ * XXX figure out how to get these from a system header
+ * https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntddk/ns-ntddk-_file_disposition_information_ex
+ */
+#define FILE_DISPOSITION_DELETE 0x00000001
+#define FILE_DISPOSITION_POSIX_SEMANTICS 0x00000002
#endif
#if defined(WIN32) || defined(__CYGWIN__)
@@ -48,6 +53,53 @@
/* Externally visable only to allow testing. */
int pgwin32_dirmod_loops = 100;
+#ifdef WIN32
+
+typedef struct FILE_RENAME_INFO_EXT {
+ FILE_RENAME_INFO fri;
+ wchar_t extra_space[MAXPGPATH];
+} FILE_RENAME_INFO_EXT;
+
+static int
+pgwin32_posix_rename(const char *from, const char *to)
+{
+ FILE_RENAME_INFO_EXT rename_info = {{0}};
+ HANDLE handle;
+
+ if (MultiByteToWideChar(CP_ACP, 0, to, -1, rename_info.fri.FileName, MAXPGPATH) == 0)
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+ rename_info.fri.ReplaceIfExists = true;
+ rename_info.fri.Flags = FILE_RENAME_FLAG_POSIX_SEMANTICS | FILE_RENAME_FLAG_REPLACE_IF_EXISTS;
+ rename_info.fri.FileNameLength = wcslen(rename_info.fri.FileName);
+
+ handle = CreateFile(from,
+ GENERIC_READ | GENERIC_WRITE | DELETE,
+ FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+ NULL,
+ OPEN_EXISTING,
+ FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT,
+ NULL);
+ if (handle == INVALID_HANDLE_VALUE)
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+
+ if (!SetFileInformationByHandle(handle, FileRenameInfoEx, &rename_info, sizeof(FILE_RENAME_INFO_EXT)))
+ {
+ _dosmaperr(GetLastError());
+ CloseHandle(handle);
+ return -1;
+ }
+ CloseHandle(handle);
+ return 0;
+}
+
+#endif
+
/*
* pgrename
*/
@@ -64,7 +116,7 @@ pgrename(const char *from, const char *to)
* and blocking other backends.
*/
#if defined(WIN32) && !defined(__CYGWIN__)
- while (!MoveFileEx(from, to, MOVEFILE_REPLACE_EXISTING))
+ while (pgwin32_posix_rename(from, to) < 0)
#else
while (rename(from, to) < 0)
#endif
@@ -98,70 +150,61 @@ pgrename(const char *from, const char *to)
return 0;
}
-/*
- * Check if _pglstat64()'s reason for failure was STATUS_DELETE_PENDING.
- * This doesn't apply to Cygwin, which has its own lstat() that would report
- * the case as EACCES.
-*/
-static bool
-lstat_error_was_status_delete_pending(void)
-{
- if (errno != ENOENT)
- return false;
#if defined(WIN32) && !defined(__CYGWIN__)
- if (pg_RtlGetLastNtStatus() == STATUS_DELETE_PENDING)
- return true;
-#endif
- return false;
+
+static int
+pgwin32_posix_unlink(const char *path)
+{
+ BY_HANDLE_FILE_INFORMATION info;
+ HANDLE handle;
+ ULONG flags;
+
+ flags = FILE_DISPOSITION_DELETE | FILE_DISPOSITION_POSIX_SEMANTICS;
+ handle = CreateFile(path,
+ GENERIC_READ | GENERIC_WRITE | DELETE,
+ FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+ NULL,
+ OPEN_EXISTING,
+ FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT,
+ NULL);
+ if (handle == INVALID_HANDLE_VALUE)
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+ if (!GetFileInformationByHandle(handle, &info))
+ {
+ _dosmaperr(GetLastError());
+ CloseHandle(handle);
+ return -1;
+ }
+ /* Let junction points be unlinked this way, but not directories. */
+ if ((info.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) &&
+ !(info.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT))
+ {
+ CloseHandle(handle);
+ errno = EPERM;
+ return -1;
+ }
+ if (!SetFileInformationByHandle(handle, FileDispositionInfoEx, &flags, sizeof(flags)))
+ {
+ _dosmaperr(GetLastError());
+ CloseHandle(handle);
+ return -1;
+ }
+ CloseHandle(handle);
+ return 0;
}
+#endif
+
/*
* pgunlink
*/
int
pgunlink(const char *path)
{
- bool is_lnk;
int loops = 0;
- struct stat st;
-
- /*
- * This function might be called for a regular file or for a junction
- * point (which we use to emulate symlinks). The latter must be unlinked
- * with rmdir() on Windows. Before we worry about any of that, let's see
- * if we can unlink directly, since that's expected to be the most common
- * case.
- */
- if (unlink(path) == 0)
- return 0;
- if (errno != EACCES)
- return -1;
-
- /*
- * EACCES is reported for many reasons including unlink() of a junction
- * point. Check if that's the case so we can redirect to rmdir().
- *
- * Note that by checking only once, we can't cope with a path that changes
- * from regular file to junction point underneath us while we're retrying
- * due to sharing violations, but that seems unlikely. We could perhaps
- * prevent that by holding a file handle ourselves across the lstat() and
- * the retry loop, but that seems like over-engineering for now.
- *
- * In the special case of a STATUS_DELETE_PENDING error (file already
- * unlinked, but someone still has it open), we don't want to report ENOENT
- * to the caller immediately, because rmdir(parent) would probably fail.
- * We want to wait until the file truly goes away so that simple recursive
- * directory unlink algorithms work.
- */
- if (lstat(path, &st) < 0)
- {
- if (lstat_error_was_status_delete_pending())
- is_lnk = false;
- else
- return -1;
- }
- else
- is_lnk = S_ISLNK(st.st_mode);
/*
* We need to loop because even though PostgreSQL uses flags that allow
@@ -170,7 +213,11 @@ pgunlink(const char *path)
* someone else to close the file, as the caller might be holding locks
* and blocking other backends.
*/
- while ((is_lnk ? rmdir(path) : unlink(path)) < 0)
+#ifdef WIN32
+ while (pgwin32_posix_unlink(path) < 0)
+#else
+ while (unlink(path) < 0)
+#endif
{
if (errno != EACCES)
return -1;
diff --git a/src/port/t/001_filesystem.c b/src/port/t/001_filesystem.c
index 346978783a..e7997278ec 100644
--- a/src/port/t/001_filesystem.c
+++ b/src/port/t/001_filesystem.c
@@ -438,10 +438,10 @@ filesystem_metadata_tests(void)
* Linux), though AIX/JFS1 is rumored to succeed. However, our Windows
* emulation doesn't allow it, because we want to avoid surprises by
* behaving like nearly all Unix systems. So we check this on Windows
- * only, where it fails with non-standard EACCES.
+ * only, where our wrapper fails with EPERM.
*/
PG_EXPECT_SYS(unlink(path2) == -1, "Windows: can't unlink() a directory");
- PG_EXPECT_EQ(errno, EACCES);
+ PG_EXPECT_EQ(errno, EPERM);
#endif
#ifdef WIN32
@@ -535,21 +535,22 @@ filesystem_metadata_tests(void)
fd = open(path2, O_RDWR | PG_BINARY, 0777);
PG_EXPECT_SYS(fd >= 0, "open name2.txt");
make_path(path2, "name2.txt");
-#ifdef WIN32
- /*
- * Windows can't rename over an open non-unlinked file, even with
- * have_posix_unlink_semantics.
- */
- pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
- PG_EXPECT_SYS(rename(path, path2) == -1,
- "Windows: can't rename name1.txt -> name2.txt while name2.txt is open");
- PG_EXPECT_EQ(errno, EACCES);
- PG_EXPECT_SYS(unlink(path) == 0, "unlink name1.txt");
-#else
- PG_EXPECT_SYS(rename(path, path2) == 0,
- "POSIX: can rename name1.txt -> name2.txt while name2.txt is open");
+ if (!have_posix_unlink_semantics)
+ {
+#ifdef WIN32
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
#endif
+ PG_EXPECT_SYS(rename(path, path2) == -1,
+ "Windows non-POSIX: can't rename name1.txt -> name2.txt while name2.txt is open");
+ PG_EXPECT_EQ(errno, EACCES);
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink name1.txt");
+ }
+ else
+ {
+ PG_EXPECT_SYS(rename(path, path2) == 0,
+ "POSIX: can rename name1.txt -> name2.txt while name2.txt is open");
+ }
PG_EXPECT_SYS(close(fd) == 0);
make_path(path, "name1.txt");
--
2.37.3
On Tue, Oct 18, 2022 at 10:00 PM Thomas Munro <thomas.munro@gmail.com> wrote:
* has anyone got a relevant filesystem where this fails? which way
do ReFS and SMB go? do the new calls in 0010 just fail, and if so
with which code (ie could we add our own fallback path)?
Andres kindly ran these tests on some Win 10 and Win 11 VMs he had
with non-NTFS filesystems, so I can report:
NTFS: have_posix_unlink_semantics == true, tests passing
ReFS: have_posix_unlink_semantics == false, tests passing
SMB: have_posix_unlink_semantics == false, symlink related tests
failing (our junction points are rejected) + one readdir() test
failing (semantic difference introduced by SMB, it can't see
STATUS_DELETE_PENDING zombies).
I think this means that PostgreSQL probably mostly works on SMB today,
except you can't create tablespaces, and therefore our regression
tests etc already can't pass there, and there may be a few extra
ENOTEMPTY race conditions due to readdir()'s different behaviour.
* if there are any filesystems that don't support POSIX-semantics,
would we want to either (1) get such a thing into the build farm so
it's tested or (2) de-support non-POSIX-semantics filesystems by
edict, and drop a lot of code and problems that everyone hates?
Yes, yes there are, so this question comes up. Put another way:
I guess that almost all users of PostgreSQL on Windows are using NTFS.
Some are getting partial POSIX semantics already, and some are not,
depending on the Windows variant. If we commit the 0010 patch, all
supported OSes will get full POSIX unlink semantics on NTFS. That'd
leave just ReFS and SMB users (are there any other relevant
filesystems?) in the cold with non-POSIX semantics. Do we want to
claim that we support those filesystems? If so, I guess we'd need an
animal and perhaps also optional CI with ReFS. (Though ReFS may
eventually get POSIX semantics too, I have no idea about that.) If
not, we could in theory rip out various code we have to cope with the
non-POSIX unlink semantics, and completely forget about that whole
category of problem.
Changes in this version:
* try to avoid tests that do bad things that crash if earlier tests
failed (I learned that close(-1) aborts in debug builds)
* add fallback paths in 0010 (I learned what errors are raised on lack
of POSIX support)
* fix MinGW build problems
As far as I could tell, MinGW doesn't have a struct definition we
need, and it seems to want _WIN32_WINNT >= 0x0A000002 to see
FileRenameInfoEx, which looks weird to me... (I'm not sure about that,
but I think that was perhaps supposed to be 0x0A02, but even that
isn't necessary with MSVC SDK headers). I gave up researching that
and put the definitions I needed into the code.
Attachments:
v2-0001-Add-suite-of-macros-for-writing-TAP-tests-in-C.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Add-suite-of-macros-for-writing-TAP-tests-in-C.patchDownload
From fc46ee76fa3aa0e9b24218126168db14acc8222d Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Tue, 11 Oct 2022 17:26:02 +1300
Subject: [PATCH v2 01/10] Add suite of macros for writing TAP tests in C.
Historically we had to load test modules into a PostgreSQL backend or
write an ad-hoc standalone program to test C code. Let's provide a
convenient way to write stand-alone tests for low-level C code.
This commit defines a small set of macros that spit out results in TAP
format. These can be improved and extended as required.
---
src/include/lib/pg_tap.h | 138 +++++++++++++++++++++++++++++++++++++++
1 file changed, 138 insertions(+)
create mode 100644 src/include/lib/pg_tap.h
diff --git a/src/include/lib/pg_tap.h b/src/include/lib/pg_tap.h
new file mode 100644
index 0000000000..3da46ac5d5
--- /dev/null
+++ b/src/include/lib/pg_tap.h
@@ -0,0 +1,138 @@
+/*
+ * Simple macros for writing tests in C that print results in TAP format,
+ * as consumed by "prove".
+ *
+ * Specification for the output format: https://testanything.org/
+ */
+
+#ifndef PG_TAP_H
+#define PG_TAP_H
+
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+
+/* Counters are global, so we can break our tests into multiple functions. */
+static int pg_test_count;
+static int pg_fail_count;
+static int pg_todo_count;
+
+/*
+ * Require an expression to be true. Used for set-up steps that are not
+ * reported as a test.
+ */
+#define PG_REQUIRE(expr) \
+if (!(expr)) { \
+ printf("Bail out! requirement (" #expr ") failed at %s:%d\n", \
+ __FILE__, __LINE__); \
+ exit(EXIT_FAILURE); \
+}
+
+/*
+ * Like PG_REQUIRE, but log strerror(errno) before bailing.
+ */
+#define PG_REQUIRE_SYS(expr) \
+if (!(expr)) { \
+ printf("Bail out! requirement (" #expr ") failed at %s:%d, error: %s\n", \
+ __FILE__, __LINE__, strerror(errno)); \
+ exit(EXIT_FAILURE); \
+}
+
+/*
+ * Test that an expression is true. An optional message can be provided,
+ * defaulting to the expression itself if not provided.
+ */
+#define PG_EXPECT(expr, ...) \
+do { \
+ const char *messages[] = {#expr, __VA_ARGS__}; \
+ const char *message = messages[lengthof(messages) - 1]; \
+ pg_test_count++; \
+ if (expr) { \
+ printf("ok %d - %s\n", pg_test_count, message); \
+ } else { \
+ if (pg_todo_count == 0) \
+ pg_fail_count++; \
+ printf("not ok %d - %s (at %s:%d)%s\n", pg_test_count, \
+ message, __FILE__, __LINE__, \
+ pg_todo_count > 0 ? " # TODO" : ""); \
+ } \
+} while (0)
+
+/*
+ * Like PG_EXPECT(), but also log strerror(errno) on failure.
+ */
+#define PG_EXPECT_SYS(expr, ...) \
+do { \
+ const char *messages[] = {#expr, __VA_ARGS__}; \
+ const char *message = messages[lengthof(messages) - 1]; \
+ pg_test_count++; \
+ if (expr) { \
+ printf("ok %d - %s\n", pg_test_count, message); \
+ } else { \
+ if (pg_todo_count == 0) \
+ pg_fail_count++; \
+ printf("not ok %d - %s (at %s:%d), error: %s%s\n", pg_test_count, \
+ message, __FILE__, __LINE__, strerror(errno), \
+ pg_todo_count > 0 ? " # TODO" : ""); \
+ } \
+} while (0)
+
+
+/*
+ * Test that one int expression is equal to another, logging the values if not.
+ */
+#define PG_EXPECT_EQ(expr1, expr2, ...) \
+do { \
+ const char *messages[] = {#expr1 " == " #expr2, __VA_ARGS__}; \
+ const char *message = messages[lengthof(messages) - 1]; \
+ int expr1_val = (expr1); \
+ int expr2_val = (expr2); \
+ pg_test_count++; \
+ if (expr1_val == expr2_val) { \
+ printf("ok %d - %s\n", pg_test_count, message); \
+ } else { \
+ if (pg_todo_count == 0) \
+ pg_fail_count++; \
+ printf("not ok %d - %s: %d != %d (at %s:%d)%s\n", pg_test_count, \
+ message, expr1_val, expr2_val, __FILE__, __LINE__, \
+ pg_todo_count > 0 ? " # TODO" : ""); \
+ } \
+} while (0)
+
+/*
+ * Test that one C string expression is equal to another, logging the values if
+ * not.
+ */
+#define PG_EXPECT_EQ_STR(expr1, expr2, ...) \
+do { \
+ const char *messages[] = {#expr1 " matches " #expr2, __VA_ARGS__}; \
+ const char *message = messages[lengthof(messages) - 1]; \
+ const char *expr1_val = (expr1); \
+ const char *expr2_val = (expr2); \
+ pg_test_count++; \
+ if (strcmp(expr1_val, expr2_val) == 0) { \
+ printf("ok %d - %s\n", pg_test_count, message); \
+ } else { \
+ if (pg_todo_count == 0) \
+ pg_fail_count++; \
+ printf("not ok %d - %s: \"%s\" vs \"%s\" (at %s:%d) %s\n", \
+ pg_test_count, \
+ message, expr1_val, expr2_val, __FILE__, __LINE__, \
+ pg_todo_count > 0 ? " # TODO" : ""); \
+ } \
+} while (0)
+
+/*
+ * The main function of a test program should begin and end with these
+ * functions.
+ */
+#define PG_BEGIN_TESTS() \
+ setbuf(stdout, NULL); /* disable buffering for Windows */
+
+#define PG_END_TESTS() printf("1..%d\n", pg_test_count); \
+ return EXIT_SUCCESS
+
+#define PG_BEGIN_TODO() pg_todo_count++
+#define PG_END_TODO() pg_todo_count--
+
+#endif
--
2.35.1
v2-0002-meson-Add-infrastructure-for-TAP-tests-written-in.patchtext/x-patch; charset=US-ASCII; name=v2-0002-meson-Add-infrastructure-for-TAP-tests-written-in.patchDownload
From 1456e325f7614a753a74dda61babb06867dea80b Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 11 Oct 2022 10:49:15 -0700
Subject: [PATCH v2 02/10] meson: Add infrastructure for TAP tests written in
C.
Allow t/XXX.c files to be used as t/XXX.pl files normally are.
Author: Andres Freund <andres@anarazel.de>
---
meson.build | 43 +++++++++++++++++++++++----
src/interfaces/libpq/test/meson.build | 8 ++---
src/tools/testwrap | 11 +++++--
3 files changed, 48 insertions(+), 14 deletions(-)
diff --git a/meson.build b/meson.build
index 2d225f706d..ada774da84 100644
--- a/meson.build
+++ b/meson.build
@@ -2560,6 +2560,10 @@ default_bin_args = default_target_args + {
'install_rpath': ':'.join(bin_install_rpaths),
}
+test_bin_args = default_target_args + {
+ 'install': false,
+}
+
# Helper for exporting a limited number of symbols
@@ -2919,7 +2923,14 @@ foreach test_dir : tests
endif
t = test_dir[kind]
+ env = test_env
+ # Add environment vars from the test specification
+ foreach name, value : t.get('env', {})
+ env.set(name, value)
+ endforeach
+
+ # pg_regress style tests
if kind in ['regress', 'isolation', 'ecpg']
if kind == 'regress'
runner = pg_regress
@@ -2953,7 +2964,6 @@ foreach test_dir : tests
test_command += t.get('sql', [])
endif
- env = test_env
env.prepend('PATH', temp_install_bindir, test_dir['bd'])
test_kwargs = {
@@ -2974,6 +2984,8 @@ foreach test_dir : tests
)
testport += 1
+
+ # perl tap tests
elif kind == 'tap'
if not tap_tests_enabled
continue
@@ -2987,13 +2999,8 @@ foreach test_dir : tests
# Add temporary install, the build directory for non-installed binaries and
# also test/ for non-installed test binaries built separately.
- env = test_env
env.prepend('PATH', temp_install_bindir, test_dir['bd'], test_dir['bd'] / 'test')
- foreach name, value : t.get('env', {})
- env.set(name, value)
- endforeach
-
test_kwargs = {
'protocol': 'tap',
'suite': [test_dir['name']],
@@ -3022,6 +3029,30 @@ foreach test_dir : tests
],
)
endforeach
+
+ # tap tests using C executables
+ elif kind == 'exec_tap'
+ # Don't have a dependency on perl tap test infrastructure, so we can
+ # test even when not tap_test_enabled.
+ test_kwargs = {
+ 'protocol': 'tap',
+ 'suite': [test_dir['name']],
+ 'timeout': 1000,
+ 'env': env,
+ } + t.get('test_kwargs', {})
+
+ foreach onetap : t['tests']
+ test_name = onetap.name()
+ test(test_dir['name'] / test_name,
+ python,
+ kwargs: test_kwargs,
+ depends: test_deps + t.get('deps', []) + [onetap],
+ args: testwrap_base + [
+ '--testname', test_name,
+ '--', onetap.full_path(),
+ ],
+ )
+ endforeach
else
error('unknown kind @0@ of test in @1@'.format(kind, test_dir['sd']))
endif
diff --git a/src/interfaces/libpq/test/meson.build b/src/interfaces/libpq/test/meson.build
index 017f729d43..06e026e400 100644
--- a/src/interfaces/libpq/test/meson.build
+++ b/src/interfaces/libpq/test/meson.build
@@ -11,9 +11,7 @@ endif
executable('libpq_uri_regress',
libpq_uri_regress_sources,
dependencies: [frontend_code, libpq],
- kwargs: default_bin_args + {
- 'install': false,
- }
+ kwargs: test_bin_args,
)
@@ -30,7 +28,5 @@ endif
executable('libpq_testclient',
libpq_testclient_sources,
dependencies: [frontend_code, libpq],
- kwargs: default_bin_args + {
- 'install': false,
- }
+ kwargs: default_bin_args,
)
diff --git a/src/tools/testwrap b/src/tools/testwrap
index 7a64fe76a2..5ca49b1fac 100755
--- a/src/tools/testwrap
+++ b/src/tools/testwrap
@@ -32,9 +32,16 @@ os.chdir(args.srcdir)
# mark test as having started
open(os.path.join(testdir, 'test.start'), 'x')
+testdatadir = os.path.join(testdir, 'data')
+testlogdir = os.path.join(testdir, 'log')
+
+# pre-create dirs so that the test's infrastructure doesn't have to
+os.makedirs(testdatadir)
+os.makedirs(testlogdir)
+
env_dict = {**os.environ,
- 'TESTDATADIR': os.path.join(testdir, 'data'),
- 'TESTLOGDIR': os.path.join(testdir, 'log')}
+ 'TESTDATADIR': testdatadir,
+ 'TESTLOGDIR': testlogdir}
sp = subprocess.run(args.test_command, env=env_dict)
--
2.35.1
v2-0003-Fix-symlink-errno-in-Windows-replacement-code.patchtext/x-patch; charset=US-ASCII; name=v2-0003-Fix-symlink-errno-in-Windows-replacement-code.patchDownload
From e413686a5ee5df4a54a6ecd1fe4c7c3ceab14ba4 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Wed, 12 Oct 2022 16:34:09 +1300
Subject: [PATCH v2 03/10] Fix symlink() errno in Windows replacement code.
Ancient bug noticed while adding tests for these functions.
---
src/port/dirmod.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index ae6301dd6c..51c9bded8f 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -197,7 +197,10 @@ pgsymlink(const char *oldpath, const char *newpath)
FILE_FLAG_OPEN_REPARSE_POINT | FILE_FLAG_BACKUP_SEMANTICS, 0);
if (dirhandle == INVALID_HANDLE_VALUE)
+ {
+ _dosmaperr(GetLastError());
return -1;
+ }
/* make sure we have an unparsed native win32 path */
if (memcmp("\\??\\", oldpath, 4) != 0)
@@ -230,8 +233,11 @@ pgsymlink(const char *oldpath, const char *newpath)
0, 0, &len, 0))
{
LPSTR msg;
+ int save_errno;
+
+ _dosmaperr(GetLastError());
+ save_errno = errno;
- errno = 0;
FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER |
FORMAT_MESSAGE_IGNORE_INSERTS |
FORMAT_MESSAGE_FROM_SYSTEM,
@@ -251,6 +257,9 @@ pgsymlink(const char *oldpath, const char *newpath)
CloseHandle(dirhandle);
RemoveDirectory(newpath);
+
+ errno = save_errno;
+
return -1;
}
--
2.35.1
v2-0004-Fix-readlink-return-value-on-Windows.patchtext/x-patch; charset=US-ASCII; name=v2-0004-Fix-readlink-return-value-on-Windows.patchDownload
From 3919ab16b5630498d88f5a16d152c1b3ba235377 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Wed, 12 Oct 2022 17:22:53 +1300
Subject: [PATCH v2 04/10] Fix readlink() return value on Windows.
It accidentally returned a value that counted the trailing nul
terminator.
This is an ancient bug, but it's benign.
---
src/port/dirmod.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index 51c9bded8f..5881024929 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -359,6 +359,9 @@ pgreadlink(const char *path, char *buf, size_t size)
return -1;
}
+ /* r includes the nul terminator */
+ r -= 1;
+
/*
* If the path starts with "\??\", which it will do in most (all?) cases,
* strip those out.
--
2.35.1
v2-0005-Add-tests-for-Windows-filesystem-code-in-src-port.patchtext/x-patch; charset=US-ASCII; name=v2-0005-Add-tests-for-Windows-filesystem-code-in-src-port.patchDownload
From a95eda277965bc0aa8fde4f73330a45d017e1139 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Tue, 11 Oct 2022 17:26:02 +1300
Subject: [PATCH v2 05/10] Add tests for Windows filesystem code in src/port.
The wrappers we use for open(), unlink() etc had no direct testing.
These tests demonstrate the behavior of our Windows porting layer, with
two variants of underlying Windows behavior that are known to exist in
current versions of the operating system.
Since some of the functions contain internal sleep/retry loops, we need
a way to guarantee deterministic tests and perform asynchronous tasks,
so we also make the loop counts adjustable at runtime (only for use by
tests) and make pg_usleep() interruptable by Windows APC.
These tests are currently expected to pass on NTFS and ReFS. They
mostly pass on SMB, except for everything to do with symlinks, and one
readdir() test (see comments).
---
meson.build | 1 +
src/include/port.h | 2 +
src/port/dirmod.c | 7 +-
src/port/open.c | 5 +-
src/port/pgsleep.c | 2 +-
src/port/t/001_filesystem.c | 850 ++++++++++++++++++++++++++++++++++++
src/port/t/meson.build | 15 +
7 files changed, 878 insertions(+), 4 deletions(-)
create mode 100644 src/port/t/001_filesystem.c
create mode 100644 src/port/t/meson.build
diff --git a/meson.build b/meson.build
index ada774da84..ca9b3ac43c 100644
--- a/meson.build
+++ b/meson.build
@@ -2777,6 +2777,7 @@ subdir('src')
subdir('contrib')
subdir('src/test')
+subdir('src/port/t')
subdir('src/interfaces/libpq/test')
subdir('src/interfaces/ecpg/test')
diff --git a/src/include/port.h b/src/include/port.h
index 69d8818d61..f81f8de2a3 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -269,6 +269,7 @@ extern int pclose_check(FILE *stream);
/*
* Win32 doesn't have reliable rename/unlink during concurrent access.
*/
+extern PGDLLEXPORT int pgwin32_dirmod_loops;
extern int pgrename(const char *from, const char *to);
extern int pgunlink(const char *path);
@@ -307,6 +308,7 @@ extern bool rmtree(const char *path, bool rmtopdir);
* passing of other special options.
*/
#define O_DIRECT 0x80000000
+extern PGDLLEXPORT int pgwin32_open_handle_loops;
extern HANDLE pgwin32_open_handle(const char *, int, bool);
extern int pgwin32_open(const char *, int,...);
extern FILE *pgwin32_fopen(const char *, const char *);
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index 5881024929..a2d40f0280 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -41,6 +41,9 @@
#if defined(WIN32) || defined(__CYGWIN__)
+/* Externally visable only to allow testing. */
+int pgwin32_dirmod_loops = 100;
+
/*
* pgrename
*/
@@ -84,7 +87,7 @@ pgrename(const char *from, const char *to)
return -1;
#endif
- if (++loops > 100) /* time out after 10 sec */
+ if (++loops > pgwin32_dirmod_loops) /* time out after 10 sec */
return -1;
pg_usleep(100000); /* us */
}
@@ -137,7 +140,7 @@ pgunlink(const char *path)
{
if (errno != EACCES)
return -1;
- if (++loops > 100) /* time out after 10 sec */
+ if (++loops > pgwin32_dirmod_loops) /* time out after 10 sec */
return -1;
pg_usleep(100000); /* us */
}
diff --git a/src/port/open.c b/src/port/open.c
index fd4faf604e..279a0ac08d 100644
--- a/src/port/open.c
+++ b/src/port/open.c
@@ -25,6 +25,9 @@
#include <assert.h>
#include <sys/stat.h>
+/* This is exported only to support testing. */
+int pgwin32_open_handle_loops = 300;
+
static int
openFlagsToCreateFileFlags(int openFlags)
{
@@ -118,7 +121,7 @@ pgwin32_open_handle(const char *fileName, int fileFlags, bool backup_semantics)
errhint("You might have antivirus, backup, or similar software interfering with the database system.")));
#endif
- if (loops < 300)
+ if (loops < pgwin32_open_handle_loops)
{
pg_usleep(100000);
loops++;
diff --git a/src/port/pgsleep.c b/src/port/pgsleep.c
index 03f0fac07b..9a37975c5d 100644
--- a/src/port/pgsleep.c
+++ b/src/port/pgsleep.c
@@ -53,7 +53,7 @@ pg_usleep(long microsec)
delay.tv_usec = microsec % 1000000L;
(void) select(0, NULL, NULL, NULL, &delay);
#else
- SleepEx((microsec < 500 ? 1 : (microsec + 500) / 1000), FALSE);
+ SleepEx((microsec < 500 ? 1 : (microsec + 500) / 1000), TRUE);
#endif
}
}
diff --git a/src/port/t/001_filesystem.c b/src/port/t/001_filesystem.c
new file mode 100644
index 0000000000..137af164ed
--- /dev/null
+++ b/src/port/t/001_filesystem.c
@@ -0,0 +1,850 @@
+/*
+ * Tests for our partial implementations of POSIX-style filesystem APIs, for
+ * Windows.
+ *
+ * Currently, have_posix_unlink_semantics is expected to be true on all Unix
+ * systems and some Windows 10-based operatings using NTFS, and false on other
+ * Windows ReFS and SMB filesystems.
+ */
+
+#include "c.h"
+
+#include <dirent.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/stat.h>
+
+#include "lib/pg_tap.h"
+#include "port/pg_iovec.h"
+
+/*
+ * Make a path under the tmp_check directory. TESTDATADIR is expected to
+ * contain an absolute path.
+ */
+static void
+make_path(char *buffer, const char *name)
+{
+ const char *directory;
+
+ directory = getenv("TESTDATADIR");
+ PG_REQUIRE(directory);
+
+ snprintf(buffer, MAXPGPATH, "%s/%s", directory, name);
+}
+
+/*
+ * Check if a directory contains a given entry, according to readdir().
+ */
+static bool
+directory_contains(const char *directory_path, const char *name)
+{
+ char directory_full_path[MAXPGPATH];
+ DIR *dir;
+ struct dirent *de;
+ bool saw_name = false;
+
+ make_path(directory_full_path, directory_path);
+ PG_REQUIRE_SYS((dir = opendir(directory_full_path)));
+ errno = 0;
+ while ((de = readdir(dir)))
+ {
+ if (strcmp(de->d_name, ".") == 0 ||
+ strcmp(de->d_name, "..") == 0)
+ continue;
+ if (strcmp(de->d_name, name) == 0)
+ saw_name = true;
+ }
+ PG_REQUIRE_SYS(errno == 0);
+ PG_REQUIRE_SYS(closedir(dir) == 0);
+
+ return saw_name;
+}
+
+#ifdef WIN32
+/*
+ * Make all slashes lean one way, to normalize paths for comparisons on Windows.
+ */
+static void
+normalize_slashes(char *path)
+{
+ while (*path)
+ {
+ if (*path == '/')
+ *path = '\\';
+ path++;
+ }
+}
+
+typedef struct apc_trampoline_data
+{
+ HANDLE timer;
+ void (*function) (void *);
+ void *data;
+} apc_trampoline_data;
+
+static void CALLBACK
+apc_trampoline(void *data, DWORD low, DWORD high)
+{
+ apc_trampoline_data *x = data;
+
+ x->function(x->data);
+ CloseHandle(x->timer);
+ free(data);
+}
+
+/*
+ * Schedule a Windows APC callback. The pg_usleep() call inside the
+ * sleep/retry loops in the wrappers under test is interruptible by APCs,
+ * causing it to run the procedure and return early.
+ */
+static void
+run_async_procedure_after_delay(void (*p) (void *), void *data, int milliseconds)
+{
+ LARGE_INTEGER delay_time = {.LowPart = -10 * milliseconds};
+ apc_trampoline_data *x;
+ HANDLE timer;
+
+ timer = CreateWaitableTimer(NULL, false, NULL);
+ PG_REQUIRE(timer);
+
+ x = malloc(sizeof(*x));
+ PG_REQUIRE_SYS(x);
+ x->timer = timer;
+ x->function = p;
+ x->data = data;
+
+ PG_REQUIRE(SetWaitableTimer(timer, &delay_time, 0, apc_trampoline, x, false));
+}
+
+static void
+close_fd(void *data)
+{
+ int *fd = data;
+
+ PG_REQUIRE_SYS(close(*fd) == 0);
+}
+
+static void
+close_handle(void *data)
+{
+ HANDLE handle = data;
+
+ PG_REQUIRE(CloseHandle(handle));
+}
+#endif
+
+/*
+ * Tests of directory entry manipulation.
+ */
+static void
+filesystem_metadata_tests(void)
+{
+ int fd;
+ int fd2;
+ char path[MAXPGPATH];
+ char path2[MAXPGPATH];
+ char path3[MAXPGPATH];
+ struct stat statbuf;
+ ssize_t size;
+ bool have_posix_unlink_semantics;
+#ifdef WIN32
+ HANDLE handle;
+#endif
+
+ /*
+ * Current versions of Windows 10 have "POSIX semantics" when unlinking
+ * files, meaning that unlink() and rename() remove the directory entry
+ * synchronously, just like Unix. At least some server OSes still seem to
+ * have the traditional Windows behavior, where directory entries remain
+ * in STATUS_DELETE_PENDING state, visible but unopenable, until all file
+ * handles are closed. We have a lot of code paths to deal with the older
+ * asynchronous behavior, and tests for those here. Before we go further,
+ * determine which behavior to expect. Behavior may also vary on non-NTFS
+ * filesystems.
+ */
+ make_path(path, "test.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_REQUIRE_SYS(fd >= 0);
+ PG_REQUIRE_SYS(unlink(path) == 0);
+ fd2 = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_REQUIRE_SYS(fd2 >= 0 || errno == EEXIST);
+ if (fd2 >= 0)
+ {
+ /* Expected behavior on Unix and some Windows 10 releases. */
+ PG_EXPECT_SYS(unlink(path) == 0, "POSIX unlink semantics detected: cleaning up");
+ have_posix_unlink_semantics = true;
+ PG_REQUIRE_SYS(close(fd2) == 0);
+ }
+ else
+ {
+ /* Our open wrapper fails with EEXIST for O_EXCL in this case. */
+ PG_EXPECT_SYS(errno == EEXIST,
+ "traditional Windows unlink semantics detected: O_EXCL -> EEXIST");
+ have_posix_unlink_semantics = false;
+ }
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* Set up test directory structure. */
+
+ make_path(path, "dir1");
+ PG_REQUIRE_SYS(mkdir(path, 0777) == 0);
+ make_path(path, "dir1/my_subdir");
+ PG_REQUIRE_SYS(mkdir(path, 0777) == 0);
+ make_path(path, "dir1/test.txt");
+ fd = open(path, O_CREAT | O_RDWR | PG_BINARY, 0777);
+ PG_REQUIRE_SYS(fd >= 0);
+ PG_REQUIRE_SYS(write(fd, "hello world\n", 12) == 12);
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* Tests for symlink()/readlink(). */
+
+ make_path(path, "dir999/my_symlink"); /* name of symlink to create */
+ make_path(path2, "dir1/my_subdir"); /* name of directory it will point to */
+ PG_EXPECT(symlink(path2, path) == -1, "symlink fails on missing parent");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1/my_symlink"); /* name of symlink to create */
+ make_path(path2, "dir1/my_subdir"); /* name of directory it will point to */
+ PG_EXPECT_SYS(symlink(path2, path) == 0, "create symlink");
+
+ size = readlink(path, path3, sizeof(path3));
+ PG_EXPECT_SYS(size != -1, "readlink succeeds");
+ PG_EXPECT_EQ(size, strlen(path2), "readlink reports expected size");
+ path3[Max(size, 0)] = '\0';
+#ifdef WIN32
+ normalize_slashes(path2);
+ normalize_slashes(path3);
+#endif
+ PG_EXPECT_EQ_STR(path2, path3, "readlink reports expected target");
+
+ PG_EXPECT(readlink("does-not-exist", path3, sizeof(path3)) == -1, "readlink fails on missing path");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ /* Tests for fstat(). */
+
+ make_path(path, "dir1/test.txt");
+ fd = open(path, O_RDWR, 0777);
+ PG_REQUIRE_SYS(fd >= 0);
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(fstat(fd, &statbuf) == 0, "fstat regular file");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 1);
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* Tests for stat(). */
+
+ PG_EXPECT(stat("does-not-exist.txt", &statbuf) == -1, "stat missing file fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1/test.txt");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == 0, "stat regular file");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 1);
+
+ make_path(path, "dir1/my_subdir");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == 0, "stat directory");
+ PG_EXPECT(S_ISDIR(statbuf.st_mode));
+
+ make_path(path, "dir1/my_symlink");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == 0, "stat symlink");
+ PG_EXPECT(S_ISDIR(statbuf.st_mode));
+
+ /* Tests for lstat(). */
+
+ PG_EXPECT(stat("does-not-exist.txt", &statbuf) == -1, "lstat missing file fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1/test.txt");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat regular file");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 1);
+
+ make_path(path2, "dir1/my_subdir");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(lstat(path2, &statbuf) == 0, "lstat directory");
+ PG_EXPECT(S_ISDIR(statbuf.st_mode));
+
+ make_path(path, "dir1/my_symlink");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat symlink");
+ PG_EXPECT(S_ISLNK(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, strlen(path2), "got expected symlink size");
+
+ /* Tests for link() and unlink(). */
+
+ make_path(path, "does-not-exist-1");
+ make_path(path2, "does-not-exist-2");
+ PG_EXPECT(link(path, path2) == -1, "link missing file fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1/test.txt");
+ make_path(path2, "dir1/test2.txt");
+ PG_EXPECT_SYS(link(path, path2) == 0, "link succeeds");
+
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat link 1 succeeds");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 2);
+
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat link 2 succeeds");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 2);
+
+ PG_EXPECT_SYS(unlink(path2) == 0, "unlink succeeds");
+
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat link 1 succeeds");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 1);
+
+ PG_EXPECT_SYS(lstat(path2, &statbuf) == -1, "lstat link 2 fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ /*
+ * On Windows we have a special code-path to make unlink() work on
+ * junction points created by our symlink() wrapper, to match POSIX
+ * behavior.
+ */
+ make_path(path, "unlink_me_symlink");
+ make_path(path2, "dir1");
+ PG_EXPECT_SYS(symlink(path2, path) == 0, "create symlink");
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink symlink");
+
+#ifdef WIN32
+
+ /*
+ * But make sure that it doesn't work on plain old directories.
+ *
+ * POSIX doesn't specify whether unlink() works on a directory, so we
+ * don't check that on non-Windows. In practice almost all known systems
+ * fail with EPERM (POSIX systems) or EISDIR (non-standard error on
+ * Linux), though AIX/JFS1 is rumored to succeed. However, our Windows
+ * emulation doesn't allow it, because we want to avoid surprises by
+ * behaving like nearly all Unix systems. So we check this on Windows
+ * only, where it fails with non-standard EACCES.
+ */
+ PG_EXPECT_SYS(unlink(path2) == -1, "Windows: can't unlink() a directory");
+ PG_EXPECT_EQ(errno, EACCES);
+#endif
+
+#ifdef WIN32
+
+ /*
+ * Test that we automatically retry for a while if we get a sharing
+ * violation because external software does not use FILE_SHARE_* when
+ * opening our files. See similar open() test for explanation.
+ */
+
+ make_path(path, "name1.txt");
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ handle = CreateFile(path, GENERIC_WRITE | GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
+ PG_REQUIRE(handle);
+
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+ PG_EXPECT(unlink(path) == -1, "Windows: can't unlink file while non-shared handle exists");
+ PG_EXPECT_EQ(errno, EACCES);
+
+ pgwin32_dirmod_loops = 1800; /* loop for up to 180s to make sure our
+ * 100ms callback is run */
+ run_async_procedure_after_delay(close_handle, handle, 100); /* close handle after
+ * 100ms */
+ PG_EXPECT_SYS(unlink(path) == 0, "Windows: can rename file after non-shared handle asynchronously closed");
+#endif
+
+ /* Tests for rename(). */
+
+ make_path(path, "name1.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ make_path(path2, "name2.txt");
+ PG_EXPECT_SYS(rename(path, path2) == 0, "rename name1.txt -> name2.txt");
+
+ PG_EXPECT_EQ(open(path, O_RDWR | PG_BINARY, 0777), -1, "can't open name1.txt anymore");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ make_path(path2, "name2.txt");
+ PG_EXPECT_SYS(rename(path, path2) == 0, "rename name1.txt -> name2.txt, replacing it");
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ fd = open(path2, O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open name2.txt");
+ make_path(path2, "name2.txt");
+#ifdef WIN32
+
+ /*
+ * Windows can't rename over an open non-unlinked file, even with
+ * have_posix_unlink_semantics.
+ */
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+ PG_EXPECT_SYS(rename(path, path2) == -1,
+ "Windows: can't rename name1.txt -> name2.txt while name2.txt is open");
+ PG_EXPECT_EQ(errno, EACCES);
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink name1.txt");
+#else
+ PG_EXPECT_SYS(rename(path, path2) == 0,
+ "POSIX: can rename name1.txt -> name2.txt while name2.txt is open");
+#endif
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ make_path(path, "name1.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ /* leave open */
+ make_path(path2, "name2.txt");
+ PG_EXPECT_SYS(rename(path, path2) == 0,
+ "can rename name1.txt -> name2.txt while name1.txt is open");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ make_path(path, "name1.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ fd = open(path2, O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open name2.txt");
+ PG_EXPECT_SYS(unlink(path2) == 0, "unlink name2.txt while it is still open");
+
+ if (have_posix_unlink_semantics)
+ {
+ PG_EXPECT_SYS(rename(path, path2) == 0,
+ "POSIX: rename name1.txt -> name2.txt while unlinked file is still open");
+ }
+ else
+ {
+ PG_EXPECT_SYS(rename(path, path2) == -1,
+ "Windows non-POSIX: cannot rename name1.txt -> name2.txt while unlinked file is still open");
+ PG_EXPECT_EQ(errno, EACCES);
+ PG_REQUIRE_SYS(unlink(path) == 0);
+ }
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+#ifdef WIN32
+
+ /*
+ * Test that we automatically retry for a while if we get a sharing
+ * violation because external software does not use FILE_SHARE_* when
+ * opening our files. See similar open() test for explanation.
+ */
+
+ make_path(path, "name1.txt");
+ make_path(path2, "name2.txt");
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ handle = CreateFile(path, GENERIC_WRITE | GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
+ PG_REQUIRE(handle);
+
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+ PG_EXPECT(rename(path, path2) == -1, "Windows: can't rename file while non-shared handle exists for name1");
+ PG_EXPECT_EQ(errno, EACCES);
+
+ pgwin32_dirmod_loops = 1800; /* loop for up to 180s to make sure our
+ * 100ms callback is run */
+ run_async_procedure_after_delay(close_handle, handle, 100); /* close handle after
+ * 100ms */
+ PG_EXPECT_SYS(rename(path, path2) == 0, "Windows: can rename file after non-shared handle asynchronously closed");
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+
+ handle = CreateFile(path2, GENERIC_WRITE | GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
+ PG_REQUIRE(handle);
+
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+ PG_EXPECT(rename(path, path2) == -1, "Windows: can't rename file while non-shared handle exists for name2");
+ PG_EXPECT_EQ(errno, EACCES);
+
+ pgwin32_dirmod_loops = 1800; /* loop for up to 180s to make sure our
+ * 100ms callback is run */
+ run_async_procedure_after_delay(close_handle, handle, 100); /* close handle after
+ * 100ms */
+ PG_EXPECT_SYS(rename(path, path2) == 0, "Windows: can rename file after non-shared handle asynchronously closed");
+
+ PG_REQUIRE_SYS(unlink(path2) == 0);
+#endif
+
+ /* Tests for open(). */
+
+ make_path(path, "test.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open(O_CREAT | O_EXCL) succeeds");
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_EQ(fd, -1, "open(O_CREAT | O_EXCL) again fails");
+ PG_EXPECT_EQ(errno, EEXIST);
+
+ fd = open(path, O_CREAT | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open(O_CREAT) succeeds");
+
+ /*
+ * We can unlink a file that we still have an open handle for on Windows,
+ * because our open() wrapper used FILE_SHARE_DELETE.
+ */
+ PG_EXPECT_SYS(unlink(path) == 0);
+
+ /*
+ * On older Windows, our open wrapper on Windows will see
+ * STATUS_DELETE_PENDING and pretend it can't see the file. On newer
+ * Windows, it won't see the file. Either way matches expected POSIX
+ * behavior.
+ */
+ PG_EXPECT(open(path, O_RDWR | PG_BINARY, 0777) == -1);
+ PG_EXPECT(errno == ENOENT);
+
+ /*
+ * Things are trickier if you asked to create a file. ENOENT doesn't make
+ * sense as a error number to a program expecting POSIX behavior then, so
+ * our wrpaper uses EEXIST for lack of anything more appropriate.
+ */
+ fd2 = open(path, O_RDWR | PG_BINARY | O_CREAT, 0777);
+ if (have_posix_unlink_semantics)
+ {
+ PG_EXPECT_SYS(fd2 >= 0, "POSIX: create file with same name, while unlinked file is still open");
+ PG_EXPECT_SYS(close(fd2) == 0);
+ }
+ else
+ {
+ PG_EXPECT_SYS(fd2 == -1,
+ "Windows non-POSIX: cannot create file with same name, while unlinked file is still open");
+ PG_EXPECT_EQ(errno, EEXIST);
+ }
+
+ /* After closing the handle, Windows non-POSIX can now create a new file. */
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+ fd = open(path, O_RDWR | PG_BINARY | O_CREAT, 0777);
+ PG_EXPECT_SYS(fd >= 0,
+ "can create a file with recently unlinked name after closing");
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+
+#ifdef WIN32
+
+ /*
+ * Even on systems new enough to have POSIX unlink behavior, the problem
+ * of sharing violations still applies.
+ *
+ * Our own open() wrapper is careful to use the FILE_SHARE_* flags to
+ * avoid the dreaded ERROR_SHARING_VIOLATION, but it's possible that
+ * another program might open one of our files without them. In that
+ * case, our open() wrapper will sleep and retry for a long time, in the
+ * hope that other program goes away. So let's open a handle with
+ * antisocial flags. We'll adjust the loop count via a side-hatch so we
+ * don't waste time in this test, though.
+ */
+ handle = CreateFile(path, GENERIC_WRITE | GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
+ PG_REQUIRE(handle);
+
+ pgwin32_open_handle_loops = 2; /* minimize retries so test is fast */
+ PG_EXPECT(open(path, O_RDWR | PG_BINARY, 0777) == -1, "Windows: can't open file while non-shared handle exists");
+ PG_EXPECT_EQ(errno, EACCES);
+
+ pgwin32_open_handle_loops = 18000; /* 180s must be long enough for our
+ * async callback to run */
+ run_async_procedure_after_delay(close_handle, handle, 100); /* close handle after
+ * 100ms */
+ fd = open(path, O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "Windows: can open file after non-shared handle asynchronously closed");
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+#endif
+
+ PG_EXPECT_SYS(unlink(path) == 0);
+
+ /* Tests for opendir(), readdir(), closedir(). */
+ {
+ DIR *dir;
+ struct dirent *de;
+ int dot = -1;
+ int dotdot = -1;
+ int my_subdir = -1;
+ int my_symlink = -1;
+ int test_txt = -1;
+
+ make_path(path, "does-not-exist");
+ PG_EXPECT(opendir(path) == NULL, "open missing directory fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1");
+ PG_EXPECT_SYS((dir = opendir(path)), "open directory");
+
+#ifdef DT_REG
+/*
+ * Linux, *BSD, macOS and our Windows wrappers have BSD d_type. On a few rare
+ * file systems, it may be reported as DT_UNKNOWN so we have to tolerate that too.
+ */
+#define LOAD(name, variable) if (strcmp(de->d_name, name) == 0) variable = de->d_type
+#define CHECK(name, variable, type) \
+ PG_EXPECT(variable != -1, name " was found"); \
+ PG_EXPECT(variable == DT_UNKNOWN || variable == type, name " has type DT_UNKNOWN or " #type)
+#else
+/*
+ * Solaris, AIX and Cygwin do not have it (it's not in POSIX). Just check that
+ * we saw the file and ignore the type.
+ */
+#define LOAD(name, variable) if (strcmp(de->d_name, name) == 0) variable = 0
+#define CHECK(name, variable, type) \
+ PG_EXPECT(variable != -1, name " was found")
+#endif
+
+ /* Load and check in two phases because the order is unknown. */
+ errno = 0;
+ while ((de = readdir(dir)))
+ {
+ LOAD(".", dot);
+ LOAD("..", dotdot);
+ LOAD("my_subdir", my_subdir);
+ LOAD("my_symlink", my_symlink);
+ LOAD("test.txt", test_txt);
+ }
+ PG_EXPECT_SYS(errno == 0, "ran out of dirents without error");
+
+ CHECK(".", dot, DT_DIR);
+ CHECK("..", dotdot, DT_DIR);
+ CHECK("my_subdir", my_subdir, DT_DIR);
+ CHECK("my_symlink", my_symlink, DT_LNK);
+ CHECK("test.txt", test_txt, DT_REG);
+
+#undef LOAD
+#undef CHECK
+ }
+
+ /*
+ * On older Windows, readdir() sees entries that are in
+ * STATUS_DELETE_PENDING state, and they prevent the directory itself from
+ * being unlinked. Demonstrate that difference here.
+ */
+ make_path(path, "dir2");
+ PG_REQUIRE_SYS(mkdir(path, 0777) == 0);
+ make_path(path, "dir2/test.txt");
+ fd = open(path, O_CREAT | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file");
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink file while still open");
+ if (have_posix_unlink_semantics)
+ {
+ PG_EXPECT(!directory_contains("dir2", "test.txt"), "POSIX: readdir doesn't see it before closing");
+ }
+ else
+ {
+ /*
+ * This test fails on Windows SMB filesystems (AKA network drives):
+ * test.txt is not visible to our readdir() wrapper, because SMB seems
+ * to hide STATUS_DELETE_PENDING files.
+ */
+ PG_EXPECT(directory_contains("dir2", "test.txt"), "Windows non-POSIX: readdir still sees it before closing");
+ }
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+ PG_EXPECT(!directory_contains("dir2", "test.txt"), "readdir doesn't see it after closing");
+}
+
+/*
+ * Tests for pread, pwrite, and vector variants.
+ */
+static void
+filesystem_io_tests(void)
+{
+ int fd;
+ char path[MAXPGPATH];
+ char buffer[80];
+ char buffer2[80];
+ struct iovec iov[8];
+
+ /*
+ * These are our own wrappers on Windows. On Unix, they're just macros
+ * for system APIs, except on Solaris which shares the pg_{read,write}v
+ * replacements used on Windows.
+ */
+
+ make_path(path, "io_test_file.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "create a new file");
+ PG_REQUIRE(fd >= 0);
+
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "initial file position is zero");
+
+ PG_EXPECT_EQ(pg_pwrite(fd, "llo", 3, 2), 3);
+ PG_EXPECT_EQ(pg_pwrite(fd, "he", 2, 0), 2);
+
+ /*
+ * On Windows, the current position moves, which is why we have the pg_
+ * prefixes as a warning to users. Demonstrate that difference here.
+ */
+#ifdef WIN32
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 2, "Windows: file position moved");
+#else
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "POSIX: file position didn't move");
+#endif
+
+ memset(buffer, 0, sizeof(buffer));
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 2, 3), 2);
+ PG_EXPECT(memcmp(buffer, "lo", 2) == 0);
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 3, 0), 3);
+ PG_EXPECT(memcmp(buffer, "hel", 3) == 0);
+
+#ifdef WIN32
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 3, "Windows: file position moved");
+#else
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "POSIX: file position didn't move");
+#endif
+
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 80, 0), 5, "pg_pread short read");
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 80, 5), 0, "pg_pread EOF");
+
+ iov[0].iov_base = "wo";
+ iov[0].iov_len = 2;
+ iov[1].iov_base = "rld";
+ iov[1].iov_len = 3;
+ PG_EXPECT_EQ(pg_pwritev(fd, iov, 2, 5), 5);
+
+#ifdef WIN32
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 10, "Windows: file position moved");
+#else
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "POSIX: file position didn't move");
+#endif
+
+ memset(buffer, 0, sizeof(buffer));
+ memset(buffer2, 0, sizeof(buffer2));
+ iov[0].iov_base = buffer;
+ iov[0].iov_len = 4;
+ iov[1].iov_base = buffer2;
+ iov[1].iov_len = 4;
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 1), 8);
+
+#ifdef WIN32
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 9, "Windows: file position moved");
+#else
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "POSIX: file position didn't move");
+#endif
+
+ PG_EXPECT(memcmp(buffer, "ello", 4) == 0);
+ PG_EXPECT(memcmp(buffer2, "worl", 4) == 0);
+
+ memset(buffer, 0, sizeof(buffer));
+ memset(buffer2, 0, sizeof(buffer2));
+ iov[0].iov_base = buffer;
+ iov[0].iov_len = 1;
+ iov[1].iov_base = buffer2;
+ iov[1].iov_len = 80;
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 8), 2);
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 9), 1);
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 10), 0);
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 11), 0);
+
+ memset(buffer, 0, sizeof(buffer));
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 10, 0), 10);
+ PG_EXPECT_EQ_STR(buffer, "helloworld");
+
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* Demonstrate the effects of "text" mode (no PG_BINARY flag). */
+
+ /* Write out a message in text mode. */
+ fd = open(path, O_RDWR, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file in text mode");
+ PG_REQUIRE(fd >= 0);
+ PG_EXPECT_EQ(write(fd, "hello world\n", 12), 12);
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+
+ /* Read it back in binary mode, to reveal the translation. */
+ fd = open(path, O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file in binary mode");
+ PG_REQUIRE(fd >= 0);
+#ifdef WIN32
+ PG_EXPECT_EQ(read(fd, buffer, sizeof(buffer)), 13,
+ "Windows: \\n was translated");
+ PG_EXPECT(memcmp(buffer, "hello world\r\n", 13) == 0,
+ "Windows: \\n was translated to \\r\\n");
+#else
+ PG_EXPECT_EQ(read(fd, buffer, sizeof(buffer)), 12,
+ "POSIX: \\n was not translated");
+ PG_EXPECT(memcmp(buffer, "hello world\n", 12) == 0,
+ "POSIX: \\n is \\n");
+#endif
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* The opposite translation happens in text mode, hiding it. */
+ fd = open(path, O_RDWR, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file in text mode");
+ PG_REQUIRE(fd >= 0);
+ PG_EXPECT_EQ(read(fd, buffer, sizeof(buffer)), 12);
+ PG_EXPECT(memcmp(buffer, "hello world\n", 12) == 0);
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ PG_REQUIRE_SYS(unlink(path) == 0);
+
+ /*
+ * Write out a message in text mode, this time with pg_pwrite(), which
+ * does not suffer from newline translation.
+ */
+ fd = open(path, O_RDWR | O_CREAT, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file in text mode");
+ PG_REQUIRE(fd >= 0);
+ PG_EXPECT_EQ(pg_pwrite(fd, "hello world\n", 12, 0), 12);
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* Read it back in binary mode to verify that. */
+ fd = open(path, O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file in binary mode");
+ PG_REQUIRE(fd >= 0);
+ PG_EXPECT_EQ(pg_pread(fd, buffer, sizeof(buffer), 0), 12,
+ "\\n was not translated by pg_pread()");
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ PG_REQUIRE_SYS(unlink(path) == 0);
+}
+
+/*
+ * Exercise fsync and fdatasync.
+ */
+static void
+filesystem_sync_tests(void)
+{
+ int fd;
+ char path[MAXPGPATH];
+
+ make_path(path, "sync_me.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_REQUIRE_SYS(fd >= 0);
+
+ PG_REQUIRE_SYS(write(fd, "x", 1) == 1);
+ PG_EXPECT_SYS(fsync(fd) == 0);
+
+ PG_REQUIRE_SYS(write(fd, "x", 1) == 1);
+ PG_EXPECT_SYS(fdatasync(fd) == 0);
+
+ PG_REQUIRE_SYS(unlink(path) == 0);
+}
+
+int
+main()
+{
+ PG_BEGIN_TESTS();
+
+ filesystem_metadata_tests();
+ filesystem_io_tests();
+ filesystem_sync_tests();
+
+ PG_END_TESTS();
+}
diff --git a/src/port/t/meson.build b/src/port/t/meson.build
new file mode 100644
index 0000000000..bc5c7fc0b8
--- /dev/null
+++ b/src/port/t/meson.build
@@ -0,0 +1,15 @@
+port_tests = []
+port_tests += executable('001_filesystem',
+ ['001_filesystem.c'],
+ dependencies: [frontend_code],
+ kwargs: test_bin_args,
+)
+
+tests += {
+ 'name': 'port',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'exec_tap': {
+ 'tests': port_tests,
+ },
+}
--
2.35.1
v2-0006-Fix-lstat-on-broken-junction-points.patchtext/x-patch; charset=US-ASCII; name=v2-0006-Fix-lstat-on-broken-junction-points.patchDownload
From 46ffa9002693ad468f3c569cf3cb2a891de308ae Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sun, 16 Oct 2022 12:07:33 +1300
Subject: [PATCH v2 06/10] Fix lstat() on broken junction points.
When using junction points to emulate symlinks on Windows, one edge case
was not handled correctly by commit c5cb8f3b: if a junction point is
broken (pointing to a non-existent path), we'd report ENOENT. This
doesn't break any known use case, but was noticed while testing and is
fixed here for completeness.
Also add translation ERROR_CANT_RESOLVE_FILENAME -> ENOENT, as that is
one of the errors Windows can report depending on format of the broken
path.
---
src/port/t/001_filesystem.c | 23 +++++++++++++++--------
src/port/win32error.c | 3 +++
src/port/win32stat.c | 27 ++++++++++++++++++++++-----
3 files changed, 40 insertions(+), 13 deletions(-)
diff --git a/src/port/t/001_filesystem.c b/src/port/t/001_filesystem.c
index 137af164ed..bb1c681366 100644
--- a/src/port/t/001_filesystem.c
+++ b/src/port/t/001_filesystem.c
@@ -277,6 +277,13 @@ filesystem_metadata_tests(void)
PG_EXPECT(S_ISLNK(statbuf.st_mode));
PG_EXPECT_EQ(statbuf.st_size, strlen(path2), "got expected symlink size");
+ make_path(path, "broken-symlink");
+ make_path(path2, "does-not-exist");
+ PG_EXPECT_SYS(symlink(path2, path) == 0, "make a broken symlink");
+ PG_EXPECT_SYS(lstat(path, &statbuf) == 0, "lstat broken symlink");
+ PG_EXPECT(S_ISLNK(statbuf.st_mode));
+ PG_EXPECT_SYS(unlink(path) == 0);
+
/* Tests for link() and unlink(). */
make_path(path, "does-not-exist-1");
@@ -347,7 +354,7 @@ filesystem_metadata_tests(void)
fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
- PG_EXPECT_SYS(close(fd) == 0);
+ PG_REQUIRE_SYS(close(fd) == 0);
handle = CreateFile(path, GENERIC_WRITE | GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
PG_REQUIRE(handle);
@@ -368,7 +375,7 @@ filesystem_metadata_tests(void)
make_path(path, "name1.txt");
fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
- PG_EXPECT_SYS(close(fd) == 0);
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
make_path(path2, "name2.txt");
PG_EXPECT_SYS(rename(path, path2) == 0, "rename name1.txt -> name2.txt");
@@ -378,14 +385,14 @@ filesystem_metadata_tests(void)
fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
- PG_EXPECT_SYS(close(fd) == 0);
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
make_path(path2, "name2.txt");
PG_EXPECT_SYS(rename(path, path2) == 0, "rename name1.txt -> name2.txt, replacing it");
fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
- PG_EXPECT_SYS(close(fd) == 0);
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
fd = open(path2, O_RDWR | PG_BINARY, 0777);
PG_EXPECT_SYS(fd >= 0, "open name2.txt");
@@ -414,12 +421,12 @@ filesystem_metadata_tests(void)
make_path(path2, "name2.txt");
PG_EXPECT_SYS(rename(path, path2) == 0,
"can rename name1.txt -> name2.txt while name1.txt is open");
- PG_EXPECT_SYS(close(fd) == 0);
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
make_path(path, "name1.txt");
fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
- PG_EXPECT_SYS(close(fd) == 0);
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
fd = open(path2, O_RDWR | PG_BINARY, 0777);
PG_EXPECT_SYS(fd >= 0, "open name2.txt");
@@ -435,9 +442,9 @@ filesystem_metadata_tests(void)
PG_EXPECT_SYS(rename(path, path2) == -1,
"Windows non-POSIX: cannot rename name1.txt -> name2.txt while unlinked file is still open");
PG_EXPECT_EQ(errno, EACCES);
- PG_REQUIRE_SYS(unlink(path) == 0);
+ PG_EXPECT_SYS(unlink(path) == 0);
}
- PG_REQUIRE_SYS(close(fd) == 0);
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
#ifdef WIN32
diff --git a/src/port/win32error.c b/src/port/win32error.c
index a78d323827..67ce805d77 100644
--- a/src/port/win32error.c
+++ b/src/port/win32error.c
@@ -167,6 +167,9 @@ static const struct
},
{
ERROR_INVALID_NAME, ENOENT
+ },
+ {
+ ERROR_CANT_RESOLVE_FILENAME, ENOENT
}
};
diff --git a/src/port/win32stat.c b/src/port/win32stat.c
index 5f3d0d22ff..ce8d87093d 100644
--- a/src/port/win32stat.c
+++ b/src/port/win32stat.c
@@ -125,15 +125,30 @@ _pglstat64(const char *name, struct stat *buf)
hFile = pgwin32_open_handle(name, O_RDONLY, true);
if (hFile == INVALID_HANDLE_VALUE)
- return -1;
-
- ret = fileinfo_to_stat(hFile, buf);
+ {
+ if (errno == ENOENT)
+ {
+ /*
+ * If it's a junction point pointing to a non-existent path, we'll
+ * have ENOENT here (because pgwin32_open_handle does not use
+ * FILE_FLAG_OPEN_REPARSE_POINT). In that case, we'll try again
+ * with readlink() below, which will distinguish true ENOENT from
+ * pseudo-symlink.
+ */
+ memset(buf, 0, sizeof(*buf));
+ ret = 0;
+ }
+ else
+ return -1;
+ }
+ else
+ ret = fileinfo_to_stat(hFile, buf);
/*
* Junction points appear as directories to fileinfo_to_stat(), so we'll
* need to do a bit more work to distinguish them.
*/
- if (ret == 0 && S_ISDIR(buf->st_mode))
+ if ((ret == 0 && S_ISDIR(buf->st_mode)) || hFile == INVALID_HANDLE_VALUE)
{
char next[MAXPGPATH];
ssize_t size;
@@ -169,10 +184,12 @@ _pglstat64(const char *name, struct stat *buf)
buf->st_mode &= ~S_IFDIR;
buf->st_mode |= S_IFLNK;
buf->st_size = size;
+ ret = 0;
}
}
- CloseHandle(hFile);
+ if (hFile != INVALID_HANDLE_VALUE)
+ CloseHandle(hFile);
return ret;
}
--
2.35.1
v2-0007-Fix-readlink-for-non-PostgreSQL-created-junction-.patchtext/x-patch; charset=US-ASCII; name=v2-0007-Fix-readlink-for-non-PostgreSQL-created-junction-.patchDownload
From 48fef7c3c7bfdafb56d71bd474e0282e1253bc7f Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmunro@postgresql.org>
Date: Thu, 11 Aug 2022 10:42:13 +1200
Subject: [PATCH v2 07/10] Fix readlink() for non-PostgreSQL-created junction
points.
Our symlink() and readlink() replacements perform a naive transformation
from DOS paths to NT paths and back, as required by the junction point
APIs. This was enough for the "drive absolute" paths we expect users to
provide for tablespaces, for example "d:\my\fast\storage".
Since commit c5cb8f3b taught stat() to follow symlinks, and since initdb
uses pg_mkdir_p(), and that examines parent directories, our humble
readlink() implementation can now be exposed to junction points not of
PostgreSQL origin. Those might be corrupted by our naive path mangling,
which doesn't really understand NT paths in general.
Simply decline to transform paths that don't look like a drive absolute
path. That means that readlink() returns the NT path directly when
checking a parent directory of PGDATA that happen to point to a drive
using "rooted" format. That works for the purposes of our stat()
emulation.
Reported-by: Roman Zharkov <r.zharkov@postgrespro.ru>
Reviewed-by: Roman Zharkov <r.zharkov@postgrespro.ru>
Discussion: https://postgr.es/m/4590c37927d7b8ee84f9855d83229018%40postgrespro.ru
---
src/port/dirmod.c | 17 ++++++++++---
src/port/t/001_filesystem.c | 51 +++++++++++++++++++++++++++++++++++++
2 files changed, 65 insertions(+), 3 deletions(-)
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index a2d40f0280..cf06878e1c 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -366,10 +366,21 @@ pgreadlink(const char *path, char *buf, size_t size)
r -= 1;
/*
- * If the path starts with "\??\", which it will do in most (all?) cases,
- * strip those out.
+ * If the path starts with "\??\" followed by a "drive absolute" path
+ * (known to Windows APIs as RtlPathTypeDriveAbsolute), then strip that
+ * prefix. This undoes some of the transformation performed by
+ * pqsymlink(), to get back to a format that users are used to seeing. We
+ * don't know how to transform other path types that might be encountered
+ * outside PGDATA, so we just return them directly.
*/
- if (r > 4 && strncmp(buf, "\\??\\", 4) == 0)
+ if (r >= 7 &&
+ buf[0] == '\\' &&
+ buf[1] == '?' &&
+ buf[2] == '?' &&
+ buf[3] == '\\' &&
+ isalpha(buf[4]) &&
+ buf[5] == ':' &&
+ buf[6] == '\\')
{
memmove(buf, buf + 4, strlen(buf + 4) + 1);
r -= 4;
diff --git a/src/port/t/001_filesystem.c b/src/port/t/001_filesystem.c
index bb1c681366..de038a86f6 100644
--- a/src/port/t/001_filesystem.c
+++ b/src/port/t/001_filesystem.c
@@ -220,6 +220,57 @@ filesystem_metadata_tests(void)
PG_EXPECT(readlink("does-not-exist", path3, sizeof(path3)) == -1, "readlink fails on missing path");
PG_EXPECT_EQ(errno, ENOENT);
+ /*
+ * Checks that we don't corrupt non-drive-absolute paths when peforming
+ * internal conversions.
+ */
+
+ /*
+ * Typical case: Windows drive absolute. This should also be accepted on
+ * POSIX systems, because they are required not to validate the target
+ * string as a path.
+ */
+ make_path(path2, "my_symlink");
+ PG_EXPECT_SYS(symlink("c:\\foo", path2) == 0);
+ size = readlink(path2, path3, sizeof(path3));
+ PG_EXPECT_SYS(size != -1, "readlink succeeds");
+ PG_EXPECT_EQ(size, 6);
+ path3[Max(size, 0)] = '\0';
+ PG_EXPECT_EQ_STR(path3, "c:\\foo");
+ PG_EXPECT_SYS(unlink(path2) == 0);
+
+ /*
+ * Drive absolute given in full NT format will be stripped on round-trip
+ * through our Windows emulations.
+ */
+ make_path(path2, "my_symlink");
+ PG_EXPECT_SYS(symlink("\\??\\c:\\foo", path2) == 0);
+ size = readlink(path2, path3, sizeof(path3));
+ PG_EXPECT_SYS(size != -1, "readlink succeeds");
+ path3[Max(size, 0)] = '\0';
+#ifdef WIN32
+ PG_EXPECT_EQ(size, 6);
+ PG_EXPECT_EQ_STR(path3, "c:\\foo");
+#else
+ PG_EXPECT_EQ(size, 10);
+ PG_EXPECT_EQ_STR(path3, "\\??\\c:\\foo");
+#endif
+ PG_EXPECT_SYS(unlink(path2) == 0);
+
+ /*
+ * Anything that doesn't look like the NT pattern that symlink() creates
+ * will be returned verbatim. This will allow our stat() to handle paths
+ * that were not created by symlink().
+ */
+ make_path(path2, "my_symlink");
+ PG_EXPECT_SYS(symlink("\\??\\Volume1234", path2) == 0);
+ size = readlink(path2, path3, sizeof(path3));
+ PG_EXPECT_SYS(size != -1, "readlink succeeds");
+ PG_EXPECT_EQ(size, 14);
+ path3[Max(size, 0)] = '\0';
+ PG_EXPECT_EQ_STR(path3, "\\??\\Volume1234");
+ PG_EXPECT_SYS(unlink(path2) == 0);
+
/* Tests for fstat(). */
make_path(path, "dir1/test.txt");
--
2.35.1
v2-0008-Fix-stat-for-recursive-junction-points-on-Windows.patchtext/x-patch; charset=US-ASCII; name=v2-0008-Fix-stat-for-recursive-junction-points-on-Windows.patchDownload
From 9366413d3ae3cc5abc9fe7546549ca08d6add7ee Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Thu, 11 Aug 2022 12:30:48 +1200
Subject: [PATCH v2 08/10] Fix stat() for recursive junction points on Windows.
Commit c5cb8f3b supposed that we'd only ever have to follow one junction
point in stat(), because we don't construct longer chains of them ourselves.
When examining a parent directory supplied by the user, we should really be
able to cope with longer chains, just in case someone has their system
set up that way. Choose an arbitrary cap of 8, to match the minimum
acceptable value of SYMLOOP_MAX in POSIX.
Previously I'd avoided reporting ELOOP thinking Windows didn't have it,
but it turns out that it does.
Reviewed-by: Roman Zharkov <r.zharkov@postgrespro.ru>
Discussion: https://postgr.es/m/CA%2BhUKGJ7JDGWYFt9%3D-TyJiRRy5q9TtPfqeKkneWDr1XPU1%2Biqw%40mail.gmail.com
---
src/port/t/001_filesystem.c | 55 +++++++++++++++++++++++++++++++++++++
src/port/win32stat.c | 26 +++++++++---------
2 files changed, 68 insertions(+), 13 deletions(-)
diff --git a/src/port/t/001_filesystem.c b/src/port/t/001_filesystem.c
index de038a86f6..35abeb772b 100644
--- a/src/port/t/001_filesystem.c
+++ b/src/port/t/001_filesystem.c
@@ -305,6 +305,61 @@ filesystem_metadata_tests(void)
PG_EXPECT(stat(path, &statbuf) == 0, "stat symlink");
PG_EXPECT(S_ISDIR(statbuf.st_mode));
+ /* Recursive symlinks. */
+ make_path(path, "dir1");
+ make_path(path2, "sym001");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym001 -> dir1");
+ make_path(path, "sym001");
+ make_path(path2, "sym002");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym002 -> sym001");
+ make_path(path, "sym002");
+ make_path(path2, "sym003");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym003 -> sym002");
+ make_path(path, "sym003");
+ make_path(path2, "sym004");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym004 -> sym003");
+ make_path(path, "sym004");
+ make_path(path2, "sym005");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym005 -> sym004");
+ make_path(path, "sym005");
+ make_path(path2, "sym006");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym006 -> sym005");
+ make_path(path, "sym006");
+ make_path(path2, "sym007");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym007 -> sym006");
+ make_path(path, "sym007");
+ make_path(path2, "sym008");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym008 -> sym007");
+ make_path(path, "sym008");
+ make_path(path2, "sym009");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym009 -> sym008");
+
+ /* POSIX says SYMLOOP_MAX should be at least 8. */
+ make_path(path, "sym008");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT_SYS(stat(path, &statbuf) == 0, "stat sym008");
+ PG_EXPECT(S_ISDIR(statbuf.st_mode));
+
+#ifdef WIN32
+
+ /*
+ * Test ELOOP failure in our Windows implementation of stat(), because we
+ * know it gives up after 8.
+ */
+ make_path(path, "sym009");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == -1, "Windows: stat sym009 fails");
+ PG_EXPECT_EQ(errno, ELOOP);
+#endif
+
+ /* If we break the chain we get ENOENT. */
+ make_path(path, "sym003");
+ PG_EXPECT_SYS(unlink(path) == 0);
+ make_path(path, "sym008");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == -1, "stat broken symlink chain fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
/* Tests for lstat(). */
PG_EXPECT(stat("does-not-exist.txt", &statbuf) == -1, "lstat missing file fails");
diff --git a/src/port/win32stat.c b/src/port/win32stat.c
index ce8d87093d..e6553e4030 100644
--- a/src/port/win32stat.c
+++ b/src/port/win32stat.c
@@ -199,23 +199,33 @@ _pglstat64(const char *name, struct stat *buf)
int
_pgstat64(const char *name, struct stat *buf)
{
+ int loops = 0;
int ret;
+ char curr[MAXPGPATH];
ret = _pglstat64(name, buf);
+ strlcpy(curr, name, MAXPGPATH);
+
/* Do we need to follow a symlink (junction point)? */
- if (ret == 0 && S_ISLNK(buf->st_mode))
+ while (ret == 0 && S_ISLNK(buf->st_mode))
{
char next[MAXPGPATH];
ssize_t size;
+ if (++loops > 8)
+ {
+ errno = ELOOP;
+ return -1;
+ }
+
/*
* _pglstat64() already called readlink() once to be able to fill in
* st_size, and now we need to do it again to get the path to follow.
* That could be optimized, but stat() on symlinks is probably rare
* and this way is simple.
*/
- size = readlink(name, next, sizeof(next));
+ size = readlink(curr, next, sizeof(next));
if (size < 0)
{
if (errno == EACCES &&
@@ -234,17 +244,7 @@ _pgstat64(const char *name, struct stat *buf)
next[size] = 0;
ret = _pglstat64(next, buf);
- if (ret == 0 && S_ISLNK(buf->st_mode))
- {
- /*
- * We're only prepared to go one hop, because we only expect to
- * deal with the simple cases that we create. The error for too
- * many symlinks is supposed to be ELOOP, but Windows hasn't got
- * it.
- */
- errno = EIO;
- return -1;
- }
+ strcpy(curr, next);
}
return ret;
--
2.35.1
v2-0009-Fix-unlink-for-STATUS_DELETE_PENDING-on-Windows.patchtext/x-patch; charset=US-ASCII; name=v2-0009-Fix-unlink-for-STATUS_DELETE_PENDING-on-Windows.patchDownload
From 2913435d08ae49019e1338fd0563fedae63069cb Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Mon, 3 Oct 2022 09:15:02 +1300
Subject: [PATCH v2 09/10] Fix unlink() for STATUS_DELETE_PENDING on Windows.
Commit c5cb8f3b didn't handle STATUS_DELETE_PENDING correctly, and would
report ENOENT immediately without waiting. If we called unlink(name)
twice in a row and then rmdir(parent) while someone had an open handle,
previously the second call to unlink() would block until that handle
went away, or a 10 second timeout expired. This change resulted in a
test occasionally failing on CI, which still has an OS without POSIX
semantics for unlink. Restore.
Diagnosed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20220920013122.GA31833%40telsasoft.com
---
src/port/dirmod.c | 38 +++++++++++++++++++++++++++++++++++--
src/port/t/001_filesystem.c | 36 +++++++++++++++++++++++++++++++++++
2 files changed, 72 insertions(+), 2 deletions(-)
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index cf06878e1c..e377d46782 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -39,6 +39,10 @@
#endif
#endif
+#if defined(WIN32) && !defined(__CYGWIN__)
+#include "port/win32ntdll.h"
+#endif
+
#if defined(WIN32) || defined(__CYGWIN__)
/* Externally visable only to allow testing. */
@@ -94,6 +98,22 @@ pgrename(const char *from, const char *to)
return 0;
}
+/*
+ * Check if _pglstat64()'s reason for failure was STATUS_DELETE_PENDING.
+ * This doesn't apply to Cygwin, which has its own lstat() that would report
+ * the case as EACCES.
+*/
+static bool
+lstat_error_was_status_delete_pending(void)
+{
+ if (errno != ENOENT)
+ return false;
+#if defined(WIN32) && !defined(__CYGWIN__)
+ if (pg_RtlGetLastNtStatus() == STATUS_DELETE_PENDING)
+ return true;
+#endif
+ return false;
+}
/*
* pgunlink
@@ -101,6 +121,7 @@ pgrename(const char *from, const char *to)
int
pgunlink(const char *path)
{
+ bool is_lnk;
int loops = 0;
struct stat st;
@@ -125,9 +146,22 @@ pgunlink(const char *path)
* due to sharing violations, but that seems unlikely. We could perhaps
* prevent that by holding a file handle ourselves across the lstat() and
* the retry loop, but that seems like over-engineering for now.
+ *
+ * In the special case of a STATUS_DELETE_PENDING error (file already
+ * unlinked, but someone still has it open), we don't want to report ENOENT
+ * to the caller immediately, because rmdir(parent) would probably fail.
+ * We want to wait until the file truly goes away so that simple recursive
+ * directory unlink algorithms work.
*/
if (lstat(path, &st) < 0)
- return -1;
+ {
+ if (lstat_error_was_status_delete_pending())
+ is_lnk = false;
+ else
+ return -1;
+ }
+ else
+ is_lnk = S_ISLNK(st.st_mode);
/*
* We need to loop because even though PostgreSQL uses flags that allow
@@ -136,7 +170,7 @@ pgunlink(const char *path)
* someone else to close the file, as the caller might be holding locks
* and blocking other backends.
*/
- while ((S_ISLNK(st.st_mode) ? rmdir(path) : unlink(path)) < 0)
+ while ((is_lnk ? rmdir(path) : unlink(path)) < 0)
{
if (errno != EACCES)
return -1;
diff --git a/src/port/t/001_filesystem.c b/src/port/t/001_filesystem.c
index 35abeb772b..680e2353d7 100644
--- a/src/port/t/001_filesystem.c
+++ b/src/port/t/001_filesystem.c
@@ -476,6 +476,42 @@ filesystem_metadata_tests(void)
PG_EXPECT_SYS(unlink(path) == 0, "Windows: can rename file after non-shared handle asynchronously closed");
#endif
+ /*
+ * Our Windows unlink() wrapper blocks in a retry loop if you try to
+ * unlink a file in STATUS_DELETE_PENDING (ie that has already been
+ * unlinked but is still open), until it times out with EACCES or reaches
+ * ENOENT. That may be useful for waiting for files to be asynchronously
+ * unlinked while performing a recursive unlink on
+ * !have_posix_unlink_semantics systems, so that rmdir(parent) works.
+ */
+ make_path(path, "dir2");
+ PG_EXPECT_SYS(mkdir(path, 0777) == 0);
+ make_path(path, "dir2/test-file");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open dir2/test-file");
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink file while it's open, once");
+#ifdef WIN32
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+#endif
+ PG_EXPECT(unlink(path) == -1, "can't unlink again");
+ if (have_posix_unlink_semantics)
+ {
+ PG_EXPECT_EQ(errno, ENOENT, "POSIX: we expect ENOENT");
+ }
+ else
+ {
+ PG_EXPECT_EQ(errno, EACCES, "Windows non-POSIX: we expect EACCES (delete already pending)");
+#ifdef WIN32
+ pgwin32_dirmod_loops = 1800; /* loop for up to 180s to make sure
+ * our 100ms callback is run */
+ run_async_procedure_after_delay(close_fd, &fd, 100); /* close fd after 100ms */
+#endif
+ PG_EXPECT(unlink(path) == -1, "Windows non-POSIX: trying again fails");
+ PG_EXPECT_EQ(errno, ENOENT, "Windows non-POSIX: ... but blocked until ENOENT was reached due to asynchronous close");
+ }
+ make_path(path, "dir2");
+ PG_EXPECT_SYS(rmdir(path) == 0, "now we can remove the directory");
+
/* Tests for rename(). */
make_path(path, "name1.txt");
--
2.35.1
v2-0010-Use-POSIX-semantics-for-unlink-and-rename-on-Wind.patchtext/x-patch; charset=US-ASCII; name=v2-0010-Use-POSIX-semantics-for-unlink-and-rename-on-Wind.patchDownload
From 3e50fdc112cc988c391177f8ff7254ba3f3703d0 Mon Sep 17 00:00:00 2001
From: Thomas <thomas.munro@gmail.com>
Date: Mon, 17 Oct 2022 22:41:18 -0700
Subject: [PATCH v2 10/10] Use POSIX semantics for unlink() and rename() on
Windows.
Use SetInformationByHandle() directly to implement unlink and rename
with POSIX semantics.
On ReFS and SMB, this falls back to interfaces with non-POSIX semantics.
Author: Victor Spirin <v.spirin@postgrespro.ru>
Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/a529b660-da15-5b62-21a0-9936768210fd%40postgrespro.ru
---
src/port/dirmod.c | 239 +++++++++++++++++++++++++++---------
src/port/t/001_filesystem.c | 35 +++---
2 files changed, 197 insertions(+), 77 deletions(-)
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index e377d46782..3ba13e037e 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -39,15 +39,125 @@
#endif
#endif
-#if defined(WIN32) && !defined(__CYGWIN__)
-#include "port/win32ntdll.h"
-#endif
-
#if defined(WIN32) || defined(__CYGWIN__)
/* Externally visable only to allow testing. */
int pgwin32_dirmod_loops = 100;
+#ifdef WIN32
+
+/*
+ * XXX Because mingw doesn't yet define struct FILE_RENAME_INFO with the Flags
+ * member, we'll define a layout-compatible struct ourselves for now. See:
+ *
+ * https://learn.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-file_rename_info
+ */
+typedef struct XXX_FILE_RENAME_INFO
+{
+ union
+ {
+ BOOLEAN ReplaceIfExists;
+ DWORD Flags;
+ } DUMMYUNIONNAME;
+ HANDLE RootDirectory;
+ DWORD FileNameLength;
+ WCHAR FileName[1];
+} XXX_FILE_RENAME_INFO;
+
+/*
+ * XXX Because mingw seems to believe we need a higher _WIN32_WINNT than the
+ * Windows SDK requires for some of these macros, define them ourselves if
+ * necessary.
+ */
+#ifndef FILE_RENAME_FLAG_REPLACE_IF_EXISTS
+#define FILE_RENAME_FLAG_REPLACE_IF_EXISTS 0x00000001
+#endif
+#ifndef FILE_RENAME_FLAG_POSIX_SEMANTICS
+#define FILE_RENAME_FLAG_POSIX_SEMANTICS 0x00000002
+#endif
+#ifndef FILE_DISPOSITION_DELETE
+#define FILE_DISPOSITION_DELETE 0x00000001
+#endif
+#ifndef FILE_DISPOSITION_POSIX_SEMANTICS
+#define FILE_DISPOSITION_POSIX_SEMANTICS 0x00000002
+#endif
+/* Can't use macro tricks for FILE_INFO_BY_HANDLE_CLASS enumator names. */
+#define XXX_FileDispositionInfoEx 0x15
+#define XXX_FileRenameInfoEx 0x16
+
+/*
+ * A container for FILE_RENAME_INFO that adds trailing space for FileName.
+ */
+typedef struct FILE_RENAME_INFO_EXT
+{
+ XXX_FILE_RENAME_INFO fri;
+ wchar_t extra_space[MAXPGPATH];
+} FILE_RENAME_INFO_EXT;
+
+static int
+pgwin32_posix_rename(const char *from, const char *to)
+{
+ FILE_RENAME_INFO_EXT rename_info = {{.Flags = 0}};
+ HANDLE handle;
+
+ if (MultiByteToWideChar(CP_ACP, 0, to, -1, rename_info.fri.FileName, MAXPGPATH) == 0)
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+ rename_info.fri.Flags = FILE_RENAME_FLAG_POSIX_SEMANTICS | FILE_RENAME_FLAG_REPLACE_IF_EXISTS;
+ rename_info.fri.FileNameLength = wcslen(rename_info.fri.FileName);
+
+ handle = CreateFile(from,
+ GENERIC_READ | GENERIC_WRITE | DELETE,
+ FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+ NULL,
+ OPEN_EXISTING,
+ FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT,
+ NULL);
+ if (handle == INVALID_HANDLE_VALUE)
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+
+ if (!SetFileInformationByHandle(handle,
+ XXX_FileRenameInfoEx,
+ &rename_info,
+ sizeof(FILE_RENAME_INFO_EXT)))
+ {
+ DWORD error = GetLastError();
+
+ /*
+ * ReFS currently fails, so we'll try again without POSIX semantics.
+ * Likewise for SMB, except it helpfully fails with a different more
+ * general error.
+ */
+ if (error == ERROR_NOT_SUPPORTED || error == ERROR_INVALID_PARAMETER)
+ {
+ /* Try the older FileRenameInfo (no "Ex", no Flags). */
+ rename_info.fri.ReplaceIfExists = true;
+ if (!SetFileInformationByHandle(handle, FileRenameInfo, &rename_info,
+ sizeof(FILE_RENAME_INFO_EXT)))
+ {
+ _dosmaperr(GetLastError());
+ CloseHandle(handle);
+ return -1;
+ }
+ }
+ else
+ {
+ _dosmaperr(error);
+ CloseHandle(handle);
+ return -1;
+ }
+ }
+ CloseHandle(handle);
+ return 0;
+}
+
+#endif
+
/*
* pgrename
*/
@@ -64,7 +174,7 @@ pgrename(const char *from, const char *to)
* and blocking other backends.
*/
#if defined(WIN32) && !defined(__CYGWIN__)
- while (!MoveFileEx(from, to, MOVEFILE_REPLACE_EXISTING))
+ while (pgwin32_posix_rename(from, to) < 0)
#else
while (rename(from, to) < 0)
#endif
@@ -98,70 +208,75 @@ pgrename(const char *from, const char *to)
return 0;
}
-/*
- * Check if _pglstat64()'s reason for failure was STATUS_DELETE_PENDING.
- * This doesn't apply to Cygwin, which has its own lstat() that would report
- * the case as EACCES.
-*/
-static bool
-lstat_error_was_status_delete_pending(void)
-{
- if (errno != ENOENT)
- return false;
#if defined(WIN32) && !defined(__CYGWIN__)
- if (pg_RtlGetLastNtStatus() == STATUS_DELETE_PENDING)
- return true;
-#endif
- return false;
+
+static int
+pgwin32_posix_unlink(const char *path)
+{
+ BY_HANDLE_FILE_INFORMATION info;
+ HANDLE handle;
+ ULONG flags;
+
+ flags = FILE_DISPOSITION_DELETE | FILE_DISPOSITION_POSIX_SEMANTICS;
+ handle = CreateFile(path,
+ GENERIC_READ | GENERIC_WRITE | DELETE,
+ FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+ NULL,
+ OPEN_EXISTING,
+ FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT,
+ NULL);
+ if (handle == INVALID_HANDLE_VALUE)
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+ if (!GetFileInformationByHandle(handle, &info))
+ {
+ _dosmaperr(GetLastError());
+ CloseHandle(handle);
+ return -1;
+ }
+ /* Let junction points be unlinked this way, but not directories. */
+ if ((info.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) &&
+ !(info.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT))
+ {
+ CloseHandle(handle);
+ errno = EPERM;
+ return -1;
+ }
+ if (!SetFileInformationByHandle(handle,
+ XXX_FileDispositionInfoEx,
+ &flags,
+ sizeof(flags)))
+ {
+ _dosmaperr(GetLastError());
+
+ if (errno == EINVAL)
+ {
+ /*
+ * SMB filesystems fail like this. Fall back to (presumably)
+ * non-POSIX variant via C library.
+ */
+ CloseHandle(handle);
+ return unlink(path);
+ }
+
+ CloseHandle(handle);
+ return -1;
+ }
+ CloseHandle(handle);
+ return 0;
}
+#endif
+
/*
* pgunlink
*/
int
pgunlink(const char *path)
{
- bool is_lnk;
int loops = 0;
- struct stat st;
-
- /*
- * This function might be called for a regular file or for a junction
- * point (which we use to emulate symlinks). The latter must be unlinked
- * with rmdir() on Windows. Before we worry about any of that, let's see
- * if we can unlink directly, since that's expected to be the most common
- * case.
- */
- if (unlink(path) == 0)
- return 0;
- if (errno != EACCES)
- return -1;
-
- /*
- * EACCES is reported for many reasons including unlink() of a junction
- * point. Check if that's the case so we can redirect to rmdir().
- *
- * Note that by checking only once, we can't cope with a path that changes
- * from regular file to junction point underneath us while we're retrying
- * due to sharing violations, but that seems unlikely. We could perhaps
- * prevent that by holding a file handle ourselves across the lstat() and
- * the retry loop, but that seems like over-engineering for now.
- *
- * In the special case of a STATUS_DELETE_PENDING error (file already
- * unlinked, but someone still has it open), we don't want to report ENOENT
- * to the caller immediately, because rmdir(parent) would probably fail.
- * We want to wait until the file truly goes away so that simple recursive
- * directory unlink algorithms work.
- */
- if (lstat(path, &st) < 0)
- {
- if (lstat_error_was_status_delete_pending())
- is_lnk = false;
- else
- return -1;
- }
- else
- is_lnk = S_ISLNK(st.st_mode);
/*
* We need to loop because even though PostgreSQL uses flags that allow
@@ -170,7 +285,11 @@ pgunlink(const char *path)
* someone else to close the file, as the caller might be holding locks
* and blocking other backends.
*/
- while ((is_lnk ? rmdir(path) : unlink(path)) < 0)
+#ifdef WIN32
+ while (pgwin32_posix_unlink(path) < 0)
+#else
+ while (unlink(path) < 0)
+#endif
{
if (errno != EACCES)
return -1;
diff --git a/src/port/t/001_filesystem.c b/src/port/t/001_filesystem.c
index 680e2353d7..b5ef1de3c4 100644
--- a/src/port/t/001_filesystem.c
+++ b/src/port/t/001_filesystem.c
@@ -3,7 +3,7 @@
* Windows.
*
* Currently, have_posix_unlink_semantics is expected to be true on all Unix
- * systems and some Windows 10-based operatings using NTFS, and false on other
+ * systems and all Windows 10-based operatings using NTFS, and false on other
* Windows ReFS and SMB filesystems.
*/
@@ -442,10 +442,10 @@ filesystem_metadata_tests(void)
* Linux), though AIX/JFS1 is rumored to succeed. However, our Windows
* emulation doesn't allow it, because we want to avoid surprises by
* behaving like nearly all Unix systems. So we check this on Windows
- * only, where it fails with non-standard EACCES.
+ * only, where our wrapper fails with EPERM.
*/
PG_EXPECT_SYS(unlink(path2) == -1, "Windows: can't unlink() a directory");
- PG_EXPECT_EQ(errno, EACCES);
+ PG_EXPECT_EQ(errno, EPERM);
#endif
#ifdef WIN32
@@ -539,22 +539,23 @@ filesystem_metadata_tests(void)
fd = open(path2, O_RDWR | PG_BINARY, 0777);
PG_EXPECT_SYS(fd >= 0, "open name2.txt");
make_path(path2, "name2.txt");
-#ifdef WIN32
- /*
- * Windows can't rename over an open non-unlinked file, even with
- * have_posix_unlink_semantics.
- */
- pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
- PG_EXPECT_SYS(rename(path, path2) == -1,
- "Windows: can't rename name1.txt -> name2.txt while name2.txt is open");
- PG_EXPECT_EQ(errno, EACCES);
- PG_EXPECT_SYS(unlink(path) == 0, "unlink name1.txt");
-#else
- PG_EXPECT_SYS(rename(path, path2) == 0,
- "POSIX: can rename name1.txt -> name2.txt while name2.txt is open");
+ if (!have_posix_unlink_semantics)
+ {
+#ifdef WIN32
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
#endif
- PG_EXPECT_SYS(close(fd) == 0);
+ PG_EXPECT_SYS(rename(path, path2) == -1,
+ "Windows non-POSIX: can't rename name1.txt -> name2.txt while name2.txt is open");
+ PG_EXPECT_EQ(errno, EACCES);
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink name1.txt");
+ }
+ else
+ {
+ PG_EXPECT_SYS(rename(path, path2) == 0,
+ "POSIX: can rename name1.txt -> name2.txt while name2.txt is open");
+ }
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
make_path(path, "name1.txt");
fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
--
2.35.1
I pushed the bug fixes from this series, without their accompanying
tests. Here's a rebase of the test suite, with all those tests now
squashed into the main test patch, and also the
tell-Windows-to-be-more-like-Unix patch. Registered in the
commitfest.
Attachments:
v3-0001-Add-suite-of-macros-for-writing-TAP-tests-in-C.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Add-suite-of-macros-for-writing-TAP-tests-in-C.patchDownload
From 1b70db7f4068df22b29e02d0c1b759195e0187ff Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Tue, 11 Oct 2022 17:26:02 +1300
Subject: [PATCH v3 1/4] Add suite of macros for writing TAP tests in C.
Historically we had to load test modules into a PostgreSQL backend or
write an ad-hoc standalone program to test C code. Let's provide a
convenient way to write stand-alone tests for low-level C code.
This commit defines a small set of macros that spit out results in TAP
format. These can be improved and extended as required.
Discussion: https://postgr.es/m/CA%2BhUKG%2BajSQ_8eu2AogTncOnZ5me2D-Cn66iN_-wZnRjLN%2Bicg%40mail.gmail.com
---
src/include/lib/pg_tap.h | 138 +++++++++++++++++++++++++++++++++++++++
1 file changed, 138 insertions(+)
create mode 100644 src/include/lib/pg_tap.h
diff --git a/src/include/lib/pg_tap.h b/src/include/lib/pg_tap.h
new file mode 100644
index 0000000000..3da46ac5d5
--- /dev/null
+++ b/src/include/lib/pg_tap.h
@@ -0,0 +1,138 @@
+/*
+ * Simple macros for writing tests in C that print results in TAP format,
+ * as consumed by "prove".
+ *
+ * Specification for the output format: https://testanything.org/
+ */
+
+#ifndef PG_TAP_H
+#define PG_TAP_H
+
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+
+/* Counters are global, so we can break our tests into multiple functions. */
+static int pg_test_count;
+static int pg_fail_count;
+static int pg_todo_count;
+
+/*
+ * Require an expression to be true. Used for set-up steps that are not
+ * reported as a test.
+ */
+#define PG_REQUIRE(expr) \
+if (!(expr)) { \
+ printf("Bail out! requirement (" #expr ") failed at %s:%d\n", \
+ __FILE__, __LINE__); \
+ exit(EXIT_FAILURE); \
+}
+
+/*
+ * Like PG_REQUIRE, but log strerror(errno) before bailing.
+ */
+#define PG_REQUIRE_SYS(expr) \
+if (!(expr)) { \
+ printf("Bail out! requirement (" #expr ") failed at %s:%d, error: %s\n", \
+ __FILE__, __LINE__, strerror(errno)); \
+ exit(EXIT_FAILURE); \
+}
+
+/*
+ * Test that an expression is true. An optional message can be provided,
+ * defaulting to the expression itself if not provided.
+ */
+#define PG_EXPECT(expr, ...) \
+do { \
+ const char *messages[] = {#expr, __VA_ARGS__}; \
+ const char *message = messages[lengthof(messages) - 1]; \
+ pg_test_count++; \
+ if (expr) { \
+ printf("ok %d - %s\n", pg_test_count, message); \
+ } else { \
+ if (pg_todo_count == 0) \
+ pg_fail_count++; \
+ printf("not ok %d - %s (at %s:%d)%s\n", pg_test_count, \
+ message, __FILE__, __LINE__, \
+ pg_todo_count > 0 ? " # TODO" : ""); \
+ } \
+} while (0)
+
+/*
+ * Like PG_EXPECT(), but also log strerror(errno) on failure.
+ */
+#define PG_EXPECT_SYS(expr, ...) \
+do { \
+ const char *messages[] = {#expr, __VA_ARGS__}; \
+ const char *message = messages[lengthof(messages) - 1]; \
+ pg_test_count++; \
+ if (expr) { \
+ printf("ok %d - %s\n", pg_test_count, message); \
+ } else { \
+ if (pg_todo_count == 0) \
+ pg_fail_count++; \
+ printf("not ok %d - %s (at %s:%d), error: %s%s\n", pg_test_count, \
+ message, __FILE__, __LINE__, strerror(errno), \
+ pg_todo_count > 0 ? " # TODO" : ""); \
+ } \
+} while (0)
+
+
+/*
+ * Test that one int expression is equal to another, logging the values if not.
+ */
+#define PG_EXPECT_EQ(expr1, expr2, ...) \
+do { \
+ const char *messages[] = {#expr1 " == " #expr2, __VA_ARGS__}; \
+ const char *message = messages[lengthof(messages) - 1]; \
+ int expr1_val = (expr1); \
+ int expr2_val = (expr2); \
+ pg_test_count++; \
+ if (expr1_val == expr2_val) { \
+ printf("ok %d - %s\n", pg_test_count, message); \
+ } else { \
+ if (pg_todo_count == 0) \
+ pg_fail_count++; \
+ printf("not ok %d - %s: %d != %d (at %s:%d)%s\n", pg_test_count, \
+ message, expr1_val, expr2_val, __FILE__, __LINE__, \
+ pg_todo_count > 0 ? " # TODO" : ""); \
+ } \
+} while (0)
+
+/*
+ * Test that one C string expression is equal to another, logging the values if
+ * not.
+ */
+#define PG_EXPECT_EQ_STR(expr1, expr2, ...) \
+do { \
+ const char *messages[] = {#expr1 " matches " #expr2, __VA_ARGS__}; \
+ const char *message = messages[lengthof(messages) - 1]; \
+ const char *expr1_val = (expr1); \
+ const char *expr2_val = (expr2); \
+ pg_test_count++; \
+ if (strcmp(expr1_val, expr2_val) == 0) { \
+ printf("ok %d - %s\n", pg_test_count, message); \
+ } else { \
+ if (pg_todo_count == 0) \
+ pg_fail_count++; \
+ printf("not ok %d - %s: \"%s\" vs \"%s\" (at %s:%d) %s\n", \
+ pg_test_count, \
+ message, expr1_val, expr2_val, __FILE__, __LINE__, \
+ pg_todo_count > 0 ? " # TODO" : ""); \
+ } \
+} while (0)
+
+/*
+ * The main function of a test program should begin and end with these
+ * functions.
+ */
+#define PG_BEGIN_TESTS() \
+ setbuf(stdout, NULL); /* disable buffering for Windows */
+
+#define PG_END_TESTS() printf("1..%d\n", pg_test_count); \
+ return EXIT_SUCCESS
+
+#define PG_BEGIN_TODO() pg_todo_count++
+#define PG_END_TODO() pg_todo_count--
+
+#endif
--
2.35.1
v3-0002-meson-Add-infrastructure-for-TAP-tests-written-in.patchtext/x-patch; charset=US-ASCII; name=v3-0002-meson-Add-infrastructure-for-TAP-tests-written-in.patchDownload
From b5e633442ccc155cca1d093e435670bce3f0ceb1 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 11 Oct 2022 10:49:15 -0700
Subject: [PATCH v3 2/4] meson: Add infrastructure for TAP tests written in C.
Allow t/XXX.c files to be used as t/XXX.pl files normally are.
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKG%2BajSQ_8eu2AogTncOnZ5me2D-Cn66iN_-wZnRjLN%2Bicg%40mail.gmail.com
---
meson.build | 43 +++++++++++++++++++++++----
src/interfaces/libpq/test/meson.build | 8 ++---
src/tools/testwrap | 11 +++++--
3 files changed, 48 insertions(+), 14 deletions(-)
diff --git a/meson.build b/meson.build
index bfacbdc0af..bb73ad24a6 100644
--- a/meson.build
+++ b/meson.build
@@ -2561,6 +2561,10 @@ default_bin_args = default_target_args + {
'install_rpath': ':'.join(bin_install_rpaths),
}
+test_bin_args = default_target_args + {
+ 'install': false,
+}
+
# Helper for exporting a limited number of symbols
@@ -2920,7 +2924,14 @@ foreach test_dir : tests
endif
t = test_dir[kind]
+ env = test_env
+ # Add environment vars from the test specification
+ foreach name, value : t.get('env', {})
+ env.set(name, value)
+ endforeach
+
+ # pg_regress style tests
if kind in ['regress', 'isolation', 'ecpg']
if kind == 'regress'
runner = pg_regress
@@ -2954,7 +2965,6 @@ foreach test_dir : tests
test_command += t.get('sql', [])
endif
- env = test_env
env.prepend('PATH', temp_install_bindir, test_dir['bd'])
test_kwargs = {
@@ -2975,6 +2985,8 @@ foreach test_dir : tests
)
testport += 1
+
+ # perl tap tests
elif kind == 'tap'
if not tap_tests_enabled
continue
@@ -2988,13 +3000,8 @@ foreach test_dir : tests
# Add temporary install, the build directory for non-installed binaries and
# also test/ for non-installed test binaries built separately.
- env = test_env
env.prepend('PATH', temp_install_bindir, test_dir['bd'], test_dir['bd'] / 'test')
- foreach name, value : t.get('env', {})
- env.set(name, value)
- endforeach
-
test_kwargs = {
'protocol': 'tap',
'suite': [test_dir['name']],
@@ -3023,6 +3030,30 @@ foreach test_dir : tests
],
)
endforeach
+
+ # tap tests using C executables
+ elif kind == 'exec_tap'
+ # Don't have a dependency on perl tap test infrastructure, so we can
+ # test even when not tap_test_enabled.
+ test_kwargs = {
+ 'protocol': 'tap',
+ 'suite': [test_dir['name']],
+ 'timeout': 1000,
+ 'env': env,
+ } + t.get('test_kwargs', {})
+
+ foreach onetap : t['tests']
+ test_name = onetap.name()
+ test(test_dir['name'] / test_name,
+ python,
+ kwargs: test_kwargs,
+ depends: test_deps + t.get('deps', []) + [onetap],
+ args: testwrap_base + [
+ '--testname', test_name,
+ '--', onetap.full_path(),
+ ],
+ )
+ endforeach
else
error('unknown kind @0@ of test in @1@'.format(kind, test_dir['sd']))
endif
diff --git a/src/interfaces/libpq/test/meson.build b/src/interfaces/libpq/test/meson.build
index 017f729d43..06e026e400 100644
--- a/src/interfaces/libpq/test/meson.build
+++ b/src/interfaces/libpq/test/meson.build
@@ -11,9 +11,7 @@ endif
executable('libpq_uri_regress',
libpq_uri_regress_sources,
dependencies: [frontend_code, libpq],
- kwargs: default_bin_args + {
- 'install': false,
- }
+ kwargs: test_bin_args,
)
@@ -30,7 +28,5 @@ endif
executable('libpq_testclient',
libpq_testclient_sources,
dependencies: [frontend_code, libpq],
- kwargs: default_bin_args + {
- 'install': false,
- }
+ kwargs: default_bin_args,
)
diff --git a/src/tools/testwrap b/src/tools/testwrap
index 7a64fe76a2..5ca49b1fac 100755
--- a/src/tools/testwrap
+++ b/src/tools/testwrap
@@ -32,9 +32,16 @@ os.chdir(args.srcdir)
# mark test as having started
open(os.path.join(testdir, 'test.start'), 'x')
+testdatadir = os.path.join(testdir, 'data')
+testlogdir = os.path.join(testdir, 'log')
+
+# pre-create dirs so that the test's infrastructure doesn't have to
+os.makedirs(testdatadir)
+os.makedirs(testlogdir)
+
env_dict = {**os.environ,
- 'TESTDATADIR': os.path.join(testdir, 'data'),
- 'TESTLOGDIR': os.path.join(testdir, 'log')}
+ 'TESTDATADIR': testdatadir,
+ 'TESTLOGDIR': testlogdir}
sp = subprocess.run(args.test_command, env=env_dict)
--
2.35.1
v3-0003-Add-tests-for-Windows-filesystem-code-in-src-port.patchtext/x-patch; charset=US-ASCII; name=v3-0003-Add-tests-for-Windows-filesystem-code-in-src-port.patchDownload
From 3bb3f5a1e601d2c41c88f5de6f8b512f48e8b380 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Tue, 11 Oct 2022 17:26:02 +1300
Subject: [PATCH v3 3/4] Add tests for Windows filesystem code in src/port.
The wrappers we use for open(), unlink() etc had no direct testing.
These tests demonstrate the behavior of our Windows porting layer, with
two variants of underlying Windows behavior that are known to exist in
current versions of the operating system.
Since some of the functions contain internal sleep/retry loops, we need
a way to guarantee deterministic tests and perform asynchronous tasks,
so we also make the loop counts adjustable at runtime (only for use by
tests) and make pg_usleep() interruptable by Windows APC.
These tests are currently expected to pass on NTFS and ReFS. They
mostly pass on SMB, except for everything to do with symlinks, and one
readdir() test (see comments).
Discussion: https://postgr.es/m/CA%2BhUKG%2BajSQ_8eu2AogTncOnZ5me2D-Cn66iN_-wZnRjLN%2Bicg%40mail.gmail.com
---
meson.build | 1 +
src/include/port.h | 2 +
src/port/dirmod.c | 7 +-
src/port/open.c | 5 +-
src/port/pgsleep.c | 2 +-
src/port/t/001_filesystem.c | 999 ++++++++++++++++++++++++++++++++++++
src/port/t/meson.build | 15 +
7 files changed, 1027 insertions(+), 4 deletions(-)
create mode 100644 src/port/t/001_filesystem.c
create mode 100644 src/port/t/meson.build
diff --git a/meson.build b/meson.build
index bb73ad24a6..ac7ce3b495 100644
--- a/meson.build
+++ b/meson.build
@@ -2778,6 +2778,7 @@ subdir('src')
subdir('contrib')
subdir('src/test')
+subdir('src/port/t')
subdir('src/interfaces/libpq/test')
subdir('src/interfaces/ecpg/test')
diff --git a/src/include/port.h b/src/include/port.h
index 02189d59b1..c332739799 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -273,6 +273,7 @@ extern int pclose_check(FILE *stream);
/*
* Win32 doesn't have reliable rename/unlink during concurrent access.
*/
+extern PGDLLEXPORT int pgwin32_dirmod_loops;
extern int pgrename(const char *from, const char *to);
extern int pgunlink(const char *path);
@@ -311,6 +312,7 @@ extern bool rmtree(const char *path, bool rmtopdir);
* passing of other special options.
*/
#define O_DIRECT 0x80000000
+extern PGDLLEXPORT int pgwin32_open_handle_loops;
extern HANDLE pgwin32_open_handle(const char *, int, bool);
extern int pgwin32_open(const char *, int,...);
extern FILE *pgwin32_fopen(const char *, const char *);
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index 3f45f6a655..0eb1e9566e 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -45,6 +45,9 @@
#if defined(WIN32) || defined(__CYGWIN__)
+/* Externally visable only to allow testing. */
+int pgwin32_dirmod_loops = 100;
+
/*
* pgrename
*/
@@ -88,7 +91,7 @@ pgrename(const char *from, const char *to)
return -1;
#endif
- if (++loops > 100) /* time out after 10 sec */
+ if (++loops > pgwin32_dirmod_loops) /* time out after 10 sec */
return -1;
pg_usleep(100000); /* us */
}
@@ -171,7 +174,7 @@ pgunlink(const char *path)
{
if (errno != EACCES)
return -1;
- if (++loops > 100) /* time out after 10 sec */
+ if (++loops > pgwin32_dirmod_loops) /* time out after 10 sec */
return -1;
pg_usleep(100000); /* us */
}
diff --git a/src/port/open.c b/src/port/open.c
index fd4faf604e..279a0ac08d 100644
--- a/src/port/open.c
+++ b/src/port/open.c
@@ -25,6 +25,9 @@
#include <assert.h>
#include <sys/stat.h>
+/* This is exported only to support testing. */
+int pgwin32_open_handle_loops = 300;
+
static int
openFlagsToCreateFileFlags(int openFlags)
{
@@ -118,7 +121,7 @@ pgwin32_open_handle(const char *fileName, int fileFlags, bool backup_semantics)
errhint("You might have antivirus, backup, or similar software interfering with the database system.")));
#endif
- if (loops < 300)
+ if (loops < pgwin32_open_handle_loops)
{
pg_usleep(100000);
loops++;
diff --git a/src/port/pgsleep.c b/src/port/pgsleep.c
index 03f0fac07b..9a37975c5d 100644
--- a/src/port/pgsleep.c
+++ b/src/port/pgsleep.c
@@ -53,7 +53,7 @@ pg_usleep(long microsec)
delay.tv_usec = microsec % 1000000L;
(void) select(0, NULL, NULL, NULL, &delay);
#else
- SleepEx((microsec < 500 ? 1 : (microsec + 500) / 1000), FALSE);
+ SleepEx((microsec < 500 ? 1 : (microsec + 500) / 1000), TRUE);
#endif
}
}
diff --git a/src/port/t/001_filesystem.c b/src/port/t/001_filesystem.c
new file mode 100644
index 0000000000..99bfd6bdd5
--- /dev/null
+++ b/src/port/t/001_filesystem.c
@@ -0,0 +1,999 @@
+/*
+ * Tests for our partial implementations of POSIX-style filesystem APIs, for
+ * Windows.
+ *
+ * Currently, have_posix_unlink_semantics is expected to be true on all Unix
+ * systems and some Windows 10-based operatings using NTFS, and false on other
+ * Windows ReFS and SMB filesystems.
+ */
+
+#include "c.h"
+
+#include <dirent.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/stat.h>
+
+#include "lib/pg_tap.h"
+#include "port/pg_iovec.h"
+
+/*
+ * Make a path under the tmp_check directory. TESTDATADIR is expected to
+ * contain an absolute path.
+ */
+static void
+make_path(char *buffer, const char *name)
+{
+ const char *directory;
+
+ directory = getenv("TESTDATADIR");
+ PG_REQUIRE(directory);
+
+ snprintf(buffer, MAXPGPATH, "%s/%s", directory, name);
+}
+
+/*
+ * Check if a directory contains a given entry, according to readdir().
+ */
+static bool
+directory_contains(const char *directory_path, const char *name)
+{
+ char directory_full_path[MAXPGPATH];
+ DIR *dir;
+ struct dirent *de;
+ bool saw_name = false;
+
+ make_path(directory_full_path, directory_path);
+ PG_REQUIRE_SYS((dir = opendir(directory_full_path)));
+ errno = 0;
+ while ((de = readdir(dir)))
+ {
+ if (strcmp(de->d_name, ".") == 0 ||
+ strcmp(de->d_name, "..") == 0)
+ continue;
+ if (strcmp(de->d_name, name) == 0)
+ saw_name = true;
+ }
+ PG_REQUIRE_SYS(errno == 0);
+ PG_REQUIRE_SYS(closedir(dir) == 0);
+
+ return saw_name;
+}
+
+#ifdef WIN32
+/*
+ * Make all slashes lean one way, to normalize paths for comparisons on Windows.
+ */
+static void
+normalize_slashes(char *path)
+{
+ while (*path)
+ {
+ if (*path == '/')
+ *path = '\\';
+ path++;
+ }
+}
+
+typedef struct apc_trampoline_data
+{
+ HANDLE timer;
+ void (*function) (void *);
+ void *data;
+} apc_trampoline_data;
+
+static void CALLBACK
+apc_trampoline(void *data, DWORD low, DWORD high)
+{
+ apc_trampoline_data *x = data;
+
+ x->function(x->data);
+ CloseHandle(x->timer);
+ free(data);
+}
+
+/*
+ * Schedule a Windows APC callback. The pg_usleep() call inside the
+ * sleep/retry loops in the wrappers under test is interruptible by APCs,
+ * causing it to run the procedure and return early.
+ */
+static void
+run_async_procedure_after_delay(void (*p) (void *), void *data, int milliseconds)
+{
+ LARGE_INTEGER delay_time = {.LowPart = -10 * milliseconds};
+ apc_trampoline_data *x;
+ HANDLE timer;
+
+ timer = CreateWaitableTimer(NULL, false, NULL);
+ PG_REQUIRE(timer);
+
+ x = malloc(sizeof(*x));
+ PG_REQUIRE_SYS(x);
+ x->timer = timer;
+ x->function = p;
+ x->data = data;
+
+ PG_REQUIRE(SetWaitableTimer(timer, &delay_time, 0, apc_trampoline, x, false));
+}
+
+static void
+close_fd(void *data)
+{
+ int *fd = data;
+
+ PG_REQUIRE_SYS(close(*fd) == 0);
+}
+
+static void
+close_handle(void *data)
+{
+ HANDLE handle = data;
+
+ PG_REQUIRE(CloseHandle(handle));
+}
+#endif
+
+/*
+ * Tests of directory entry manipulation.
+ */
+static void
+filesystem_metadata_tests(void)
+{
+ int fd;
+ int fd2;
+ char path[MAXPGPATH];
+ char path2[MAXPGPATH];
+ char path3[MAXPGPATH];
+ struct stat statbuf;
+ ssize_t size;
+ bool have_posix_unlink_semantics;
+#ifdef WIN32
+ HANDLE handle;
+#endif
+
+ /*
+ * Current versions of Windows 10 have "POSIX semantics" when unlinking
+ * files, meaning that unlink() and rename() remove the directory entry
+ * synchronously, just like Unix. At least some server OSes still seem to
+ * have the traditional Windows behavior, where directory entries remain
+ * in STATUS_DELETE_PENDING state, visible but unopenable, until all file
+ * handles are closed. We have a lot of code paths to deal with the older
+ * asynchronous behavior, and tests for those here. Before we go further,
+ * determine which behavior to expect. Behavior may also vary on non-NTFS
+ * filesystems.
+ */
+ make_path(path, "test.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_REQUIRE_SYS(fd >= 0);
+ PG_REQUIRE_SYS(unlink(path) == 0);
+ fd2 = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_REQUIRE_SYS(fd2 >= 0 || errno == EEXIST);
+ if (fd2 >= 0)
+ {
+ /* Expected behavior on Unix and some Windows 10 releases. */
+ PG_EXPECT_SYS(unlink(path) == 0, "POSIX unlink semantics detected: cleaning up");
+ have_posix_unlink_semantics = true;
+ PG_REQUIRE_SYS(close(fd2) == 0);
+ }
+ else
+ {
+ /* Our open wrapper fails with EEXIST for O_EXCL in this case. */
+ PG_EXPECT_SYS(errno == EEXIST,
+ "traditional Windows unlink semantics detected: O_EXCL -> EEXIST");
+ have_posix_unlink_semantics = false;
+ }
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* Set up test directory structure. */
+
+ make_path(path, "dir1");
+ PG_REQUIRE_SYS(mkdir(path, 0777) == 0);
+ make_path(path, "dir1/my_subdir");
+ PG_REQUIRE_SYS(mkdir(path, 0777) == 0);
+ make_path(path, "dir1/test.txt");
+ fd = open(path, O_CREAT | O_RDWR | PG_BINARY, 0777);
+ PG_REQUIRE_SYS(fd >= 0);
+ PG_REQUIRE_SYS(write(fd, "hello world\n", 12) == 12);
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* Tests for symlink()/readlink(). */
+
+ make_path(path, "dir999/my_symlink"); /* name of symlink to create */
+ make_path(path2, "dir1/my_subdir"); /* name of directory it will point to */
+ PG_EXPECT(symlink(path2, path) == -1, "symlink fails on missing parent");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1/my_symlink"); /* name of symlink to create */
+ make_path(path2, "dir1/my_subdir"); /* name of directory it will point to */
+ PG_EXPECT_SYS(symlink(path2, path) == 0, "create symlink");
+
+ size = readlink(path, path3, sizeof(path3));
+ PG_EXPECT_SYS(size != -1, "readlink succeeds");
+ PG_EXPECT_EQ(size, strlen(path2), "readlink reports expected size");
+ path3[Max(size, 0)] = '\0';
+#ifdef WIN32
+ normalize_slashes(path2);
+ normalize_slashes(path3);
+#endif
+ PG_EXPECT_EQ_STR(path2, path3, "readlink reports expected target");
+
+ PG_EXPECT(readlink("does-not-exist", path3, sizeof(path3)) == -1, "readlink fails on missing path");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ /*
+ * Checks that we don't corrupt non-drive-absolute paths when peforming
+ * internal conversions.
+ */
+
+ /*
+ * Typical case: Windows drive absolute. This should also be accepted on
+ * POSIX systems, because they are required not to validate the target
+ * string as a path.
+ */
+ make_path(path2, "my_symlink");
+ PG_EXPECT_SYS(symlink("c:\\foo", path2) == 0);
+ size = readlink(path2, path3, sizeof(path3));
+ PG_EXPECT_SYS(size != -1, "readlink succeeds");
+ PG_EXPECT_EQ(size, 6);
+ path3[Max(size, 0)] = '\0';
+ PG_EXPECT_EQ_STR(path3, "c:\\foo");
+ PG_EXPECT_SYS(unlink(path2) == 0);
+
+ /*
+ * Drive absolute given in full NT format will be stripped on round-trip
+ * through our Windows emulations.
+ */
+ make_path(path2, "my_symlink");
+ PG_EXPECT_SYS(symlink("\\??\\c:\\foo", path2) == 0);
+ size = readlink(path2, path3, sizeof(path3));
+ PG_EXPECT_SYS(size != -1, "readlink succeeds");
+ path3[Max(size, 0)] = '\0';
+#ifdef WIN32
+ PG_EXPECT_EQ(size, 6);
+ PG_EXPECT_EQ_STR(path3, "c:\\foo");
+#else
+ PG_EXPECT_EQ(size, 10);
+ PG_EXPECT_EQ_STR(path3, "\\??\\c:\\foo");
+#endif
+ PG_EXPECT_SYS(unlink(path2) == 0);
+
+ /*
+ * Anything that doesn't look like the NT pattern that symlink() creates
+ * will be returned verbatim. This will allow our stat() to handle paths
+ * that were not created by symlink().
+ */
+ make_path(path2, "my_symlink");
+ PG_EXPECT_SYS(symlink("\\??\\Volume1234", path2) == 0);
+ size = readlink(path2, path3, sizeof(path3));
+ PG_EXPECT_SYS(size != -1, "readlink succeeds");
+ PG_EXPECT_EQ(size, 14);
+ path3[Max(size, 0)] = '\0';
+ PG_EXPECT_EQ_STR(path3, "\\??\\Volume1234");
+ PG_EXPECT_SYS(unlink(path2) == 0);
+
+ /* Tests for fstat(). */
+
+ make_path(path, "dir1/test.txt");
+ fd = open(path, O_RDWR, 0777);
+ PG_REQUIRE_SYS(fd >= 0);
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(fstat(fd, &statbuf) == 0, "fstat regular file");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 1);
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* Tests for stat(). */
+
+ PG_EXPECT(stat("does-not-exist.txt", &statbuf) == -1, "stat missing file fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1/test.txt");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == 0, "stat regular file");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 1);
+
+ make_path(path, "dir1/my_subdir");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == 0, "stat directory");
+ PG_EXPECT(S_ISDIR(statbuf.st_mode));
+
+ make_path(path, "dir1/my_symlink");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == 0, "stat symlink");
+ PG_EXPECT(S_ISDIR(statbuf.st_mode));
+
+ /* Recursive symlinks. */
+ make_path(path, "dir1");
+ make_path(path2, "sym001");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym001 -> dir1");
+ make_path(path, "sym001");
+ make_path(path2, "sym002");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym002 -> sym001");
+ make_path(path, "sym002");
+ make_path(path2, "sym003");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym003 -> sym002");
+ make_path(path, "sym003");
+ make_path(path2, "sym004");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym004 -> sym003");
+ make_path(path, "sym004");
+ make_path(path2, "sym005");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym005 -> sym004");
+ make_path(path, "sym005");
+ make_path(path2, "sym006");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym006 -> sym005");
+ make_path(path, "sym006");
+ make_path(path2, "sym007");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym007 -> sym006");
+ make_path(path, "sym007");
+ make_path(path2, "sym008");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym008 -> sym007");
+ make_path(path, "sym008");
+ make_path(path2, "sym009");
+ PG_EXPECT_SYS(symlink(path, path2) == 0, "sym009 -> sym008");
+
+ /* POSIX says SYMLOOP_MAX should be at least 8. */
+ make_path(path, "sym008");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT_SYS(stat(path, &statbuf) == 0, "stat sym008");
+ PG_EXPECT(S_ISDIR(statbuf.st_mode));
+
+#ifdef WIN32
+
+ /*
+ * Test ELOOP failure in our Windows implementation of stat(), because we
+ * know it gives up after 8.
+ */
+ make_path(path, "sym009");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == -1, "Windows: stat sym009 fails");
+ PG_EXPECT_EQ(errno, ELOOP);
+#endif
+
+ /* If we break the chain we get ENOENT. */
+ make_path(path, "sym003");
+ PG_EXPECT_SYS(unlink(path) == 0);
+ make_path(path, "sym008");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(stat(path, &statbuf) == -1, "stat broken symlink chain fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ /* Tests for lstat(). */
+
+ PG_EXPECT(stat("does-not-exist.txt", &statbuf) == -1, "lstat missing file fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1/test.txt");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat regular file");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 1);
+
+ make_path(path2, "dir1/my_subdir");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(lstat(path2, &statbuf) == 0, "lstat directory");
+ PG_EXPECT(S_ISDIR(statbuf.st_mode));
+
+ make_path(path, "dir1/my_symlink");
+ memset(&statbuf, 0, sizeof(statbuf));
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat symlink");
+ PG_EXPECT(S_ISLNK(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, strlen(path2), "got expected symlink size");
+
+ make_path(path, "broken-symlink");
+ make_path(path2, "does-not-exist");
+ PG_EXPECT_SYS(symlink(path2, path) == 0, "make a broken symlink");
+ PG_EXPECT_SYS(lstat(path, &statbuf) == 0, "lstat broken symlink");
+ PG_EXPECT(S_ISLNK(statbuf.st_mode));
+ PG_EXPECT_SYS(unlink(path) == 0);
+
+ /* Tests for link() and unlink(). */
+
+ make_path(path, "does-not-exist-1");
+ make_path(path2, "does-not-exist-2");
+ PG_EXPECT(link(path, path2) == -1, "link missing file fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1/test.txt");
+ make_path(path2, "dir1/test2.txt");
+ PG_EXPECT_SYS(link(path, path2) == 0, "link succeeds");
+
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat link 1 succeeds");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 2);
+
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat link 2 succeeds");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 2);
+
+ PG_EXPECT_SYS(unlink(path2) == 0, "unlink succeeds");
+
+ PG_EXPECT(lstat(path, &statbuf) == 0, "lstat link 1 succeeds");
+ PG_EXPECT(S_ISREG(statbuf.st_mode));
+ PG_EXPECT_EQ(statbuf.st_size, 12);
+ PG_EXPECT_EQ(statbuf.st_nlink, 1);
+
+ PG_EXPECT_SYS(lstat(path2, &statbuf) == -1, "lstat link 2 fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ /*
+ * On Windows we have a special code-path to make unlink() work on
+ * junction points created by our symlink() wrapper, to match POSIX
+ * behavior.
+ */
+ make_path(path, "unlink_me_symlink");
+ make_path(path2, "dir1");
+ PG_EXPECT_SYS(symlink(path2, path) == 0, "create symlink");
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink symlink");
+
+#ifdef WIN32
+
+ /*
+ * But make sure that it doesn't work on plain old directories.
+ *
+ * POSIX doesn't specify whether unlink() works on a directory, so we
+ * don't check that on non-Windows. In practice almost all known systems
+ * fail with EPERM (POSIX systems) or EISDIR (non-standard error on
+ * Linux), though AIX/JFS1 is rumored to succeed. However, our Windows
+ * emulation doesn't allow it, because we want to avoid surprises by
+ * behaving like nearly all Unix systems. So we check this on Windows
+ * only, where it fails with non-standard EACCES.
+ */
+ PG_EXPECT_SYS(unlink(path2) == -1, "Windows: can't unlink() a directory");
+ PG_EXPECT_EQ(errno, EACCES);
+#endif
+
+#ifdef WIN32
+
+ /*
+ * Test that we automatically retry for a while if we get a sharing
+ * violation because external software does not use FILE_SHARE_* when
+ * opening our files. See similar open() test for explanation.
+ */
+
+ make_path(path, "name1.txt");
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ handle = CreateFile(path, GENERIC_WRITE | GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
+ PG_REQUIRE(handle);
+
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+ PG_EXPECT(unlink(path) == -1, "Windows: can't unlink file while non-shared handle exists");
+ PG_EXPECT_EQ(errno, EACCES);
+
+ pgwin32_dirmod_loops = 1800; /* loop for up to 180s to make sure our
+ * 100ms callback is run */
+ run_async_procedure_after_delay(close_handle, handle, 100); /* close handle after
+ * 100ms */
+ PG_EXPECT_SYS(unlink(path) == 0, "Windows: can rename file after non-shared handle asynchronously closed");
+#endif
+
+ /*
+ * Our Windows unlink() wrapper blocks in a retry loop if you try to
+ * unlink a file in STATUS_DELETE_PENDING (ie that has already been
+ * unlinked but is still open), until it times out with EACCES or reaches
+ * ENOENT. That may be useful for waiting for files to be asynchronously
+ * unlinked while performing a recursive unlink on
+ * !have_posix_unlink_semantics systems, so that rmdir(parent) works.
+ */
+ make_path(path, "dir2");
+ PG_EXPECT_SYS(mkdir(path, 0777) == 0);
+ make_path(path, "dir2/test-file");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open dir2/test-file");
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink file while it's open, once");
+#ifdef WIN32
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+#endif
+ PG_EXPECT(unlink(path) == -1, "can't unlink again");
+ if (have_posix_unlink_semantics)
+ {
+ PG_EXPECT_EQ(errno, ENOENT, "POSIX: we expect ENOENT");
+ }
+ else
+ {
+ PG_EXPECT_EQ(errno, EACCES, "Windows non-POSIX: we expect EACCES (delete already pending)");
+#ifdef WIN32
+ pgwin32_dirmod_loops = 1800; /* loop for up to 180s to make sure
+ * our 100ms callback is run */
+ run_async_procedure_after_delay(close_fd, &fd, 100); /* close fd after 100ms */
+#endif
+ PG_EXPECT(unlink(path) == -1, "Windows non-POSIX: trying again fails");
+ PG_EXPECT_EQ(errno, ENOENT, "Windows non-POSIX: ... but blocked until ENOENT was reached due to asynchronous close");
+ }
+ make_path(path, "dir2");
+ PG_EXPECT_SYS(rmdir(path) == 0, "now we can remove the directory");
+
+ /* Tests for rename(). */
+
+ make_path(path, "name1.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+
+ make_path(path2, "name2.txt");
+ PG_EXPECT_SYS(rename(path, path2) == 0, "rename name1.txt -> name2.txt");
+
+ PG_EXPECT_EQ(open(path, O_RDWR | PG_BINARY, 0777), -1, "can't open name1.txt anymore");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+
+ make_path(path2, "name2.txt");
+ PG_EXPECT_SYS(rename(path, path2) == 0, "rename name1.txt -> name2.txt, replacing it");
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+
+ fd = open(path2, O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open name2.txt");
+ make_path(path2, "name2.txt");
+#ifdef WIN32
+
+ /*
+ * Windows can't rename over an open non-unlinked file, even with
+ * have_posix_unlink_semantics.
+ */
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+ PG_EXPECT_SYS(rename(path, path2) == -1,
+ "Windows: can't rename name1.txt -> name2.txt while name2.txt is open");
+ PG_EXPECT_EQ(errno, EACCES);
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink name1.txt");
+#else
+ PG_EXPECT_SYS(rename(path, path2) == 0,
+ "POSIX: can rename name1.txt -> name2.txt while name2.txt is open");
+#endif
+ PG_EXPECT_SYS(close(fd) == 0);
+
+ make_path(path, "name1.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ /* leave open */
+ make_path(path2, "name2.txt");
+ PG_EXPECT_SYS(rename(path, path2) == 0,
+ "can rename name1.txt -> name2.txt while name1.txt is open");
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+
+ make_path(path, "name1.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+
+ fd = open(path2, O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open name2.txt");
+ PG_EXPECT_SYS(unlink(path2) == 0, "unlink name2.txt while it is still open");
+
+ if (have_posix_unlink_semantics)
+ {
+ PG_EXPECT_SYS(rename(path, path2) == 0,
+ "POSIX: rename name1.txt -> name2.txt while unlinked file is still open");
+ }
+ else
+ {
+ PG_EXPECT_SYS(rename(path, path2) == -1,
+ "Windows non-POSIX: cannot rename name1.txt -> name2.txt while unlinked file is still open");
+ PG_EXPECT_EQ(errno, EACCES);
+ PG_EXPECT_SYS(unlink(path) == 0);
+ }
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+
+#ifdef WIN32
+
+ /*
+ * Test that we automatically retry for a while if we get a sharing
+ * violation because external software does not use FILE_SHARE_* when
+ * opening our files. See similar open() test for explanation.
+ */
+
+ make_path(path, "name1.txt");
+ make_path(path2, "name2.txt");
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ handle = CreateFile(path, GENERIC_WRITE | GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
+ PG_REQUIRE(handle);
+
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+ PG_EXPECT(rename(path, path2) == -1, "Windows: can't rename file while non-shared handle exists for name1");
+ PG_EXPECT_EQ(errno, EACCES);
+
+ pgwin32_dirmod_loops = 1800; /* loop for up to 180s to make sure our
+ * 100ms callback is run */
+ run_async_procedure_after_delay(close_handle, handle, 100); /* close handle after
+ * 100ms */
+ PG_EXPECT_SYS(rename(path, path2) == 0, "Windows: can rename file after non-shared handle asynchronously closed");
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "touch name1.txt");
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+
+ handle = CreateFile(path2, GENERIC_WRITE | GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
+ PG_REQUIRE(handle);
+
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
+ PG_EXPECT(rename(path, path2) == -1, "Windows: can't rename file while non-shared handle exists for name2");
+ PG_EXPECT_EQ(errno, EACCES);
+
+ pgwin32_dirmod_loops = 1800; /* loop for up to 180s to make sure our
+ * 100ms callback is run */
+ run_async_procedure_after_delay(close_handle, handle, 100); /* close handle after
+ * 100ms */
+ PG_EXPECT_SYS(rename(path, path2) == 0, "Windows: can rename file after non-shared handle asynchronously closed");
+
+ PG_REQUIRE_SYS(unlink(path2) == 0);
+#endif
+
+ /* Tests for open(). */
+
+ make_path(path, "test.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open(O_CREAT | O_EXCL) succeeds");
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_EQ(fd, -1, "open(O_CREAT | O_EXCL) again fails");
+ PG_EXPECT_EQ(errno, EEXIST);
+
+ fd = open(path, O_CREAT | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open(O_CREAT) succeeds");
+
+ /*
+ * We can unlink a file that we still have an open handle for on Windows,
+ * because our open() wrapper used FILE_SHARE_DELETE.
+ */
+ PG_EXPECT_SYS(unlink(path) == 0);
+
+ /*
+ * On older Windows, our open wrapper on Windows will see
+ * STATUS_DELETE_PENDING and pretend it can't see the file. On newer
+ * Windows, it won't see the file. Either way matches expected POSIX
+ * behavior.
+ */
+ PG_EXPECT(open(path, O_RDWR | PG_BINARY, 0777) == -1);
+ PG_EXPECT(errno == ENOENT);
+
+ /*
+ * Things are trickier if you asked to create a file. ENOENT doesn't make
+ * sense as a error number to a program expecting POSIX behavior then, so
+ * our wrpaper uses EEXIST for lack of anything more appropriate.
+ */
+ fd2 = open(path, O_RDWR | PG_BINARY | O_CREAT, 0777);
+ if (have_posix_unlink_semantics)
+ {
+ PG_EXPECT_SYS(fd2 >= 0, "POSIX: create file with same name, while unlinked file is still open");
+ PG_EXPECT_SYS(close(fd2) == 0);
+ }
+ else
+ {
+ PG_EXPECT_SYS(fd2 == -1,
+ "Windows non-POSIX: cannot create file with same name, while unlinked file is still open");
+ PG_EXPECT_EQ(errno, EEXIST);
+ }
+
+ /* After closing the handle, Windows non-POSIX can now create a new file. */
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+ fd = open(path, O_RDWR | PG_BINARY | O_CREAT, 0777);
+ PG_EXPECT_SYS(fd >= 0,
+ "can create a file with recently unlinked name after closing");
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+
+#ifdef WIN32
+
+ /*
+ * Even on systems new enough to have POSIX unlink behavior, the problem
+ * of sharing violations still applies.
+ *
+ * Our own open() wrapper is careful to use the FILE_SHARE_* flags to
+ * avoid the dreaded ERROR_SHARING_VIOLATION, but it's possible that
+ * another program might open one of our files without them. In that
+ * case, our open() wrapper will sleep and retry for a long time, in the
+ * hope that other program goes away. So let's open a handle with
+ * antisocial flags. We'll adjust the loop count via a side-hatch so we
+ * don't waste time in this test, though.
+ */
+ handle = CreateFile(path, GENERIC_WRITE | GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
+ PG_REQUIRE(handle);
+
+ pgwin32_open_handle_loops = 2; /* minimize retries so test is fast */
+ PG_EXPECT(open(path, O_RDWR | PG_BINARY, 0777) == -1, "Windows: can't open file while non-shared handle exists");
+ PG_EXPECT_EQ(errno, EACCES);
+
+ pgwin32_open_handle_loops = 18000; /* 180s must be long enough for our
+ * async callback to run */
+ run_async_procedure_after_delay(close_handle, handle, 100); /* close handle after
+ * 100ms */
+ fd = open(path, O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "Windows: can open file after non-shared handle asynchronously closed");
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+#endif
+
+ PG_EXPECT_SYS(unlink(path) == 0);
+
+ /* Tests for opendir(), readdir(), closedir(). */
+ {
+ DIR *dir;
+ struct dirent *de;
+ int dot = -1;
+ int dotdot = -1;
+ int my_subdir = -1;
+ int my_symlink = -1;
+ int test_txt = -1;
+
+ make_path(path, "does-not-exist");
+ PG_EXPECT(opendir(path) == NULL, "open missing directory fails");
+ PG_EXPECT_EQ(errno, ENOENT);
+
+ make_path(path, "dir1");
+ PG_EXPECT_SYS((dir = opendir(path)), "open directory");
+
+#ifdef DT_REG
+/*
+ * Linux, *BSD, macOS and our Windows wrappers have BSD d_type. On a few rare
+ * file systems, it may be reported as DT_UNKNOWN so we have to tolerate that too.
+ */
+#define LOAD(name, variable) if (strcmp(de->d_name, name) == 0) variable = de->d_type
+#define CHECK(name, variable, type) \
+ PG_EXPECT(variable != -1, name " was found"); \
+ PG_EXPECT(variable == DT_UNKNOWN || variable == type, name " has type DT_UNKNOWN or " #type)
+#else
+/*
+ * Solaris, AIX and Cygwin do not have it (it's not in POSIX). Just check that
+ * we saw the file and ignore the type.
+ */
+#define LOAD(name, variable) if (strcmp(de->d_name, name) == 0) variable = 0
+#define CHECK(name, variable, type) \
+ PG_EXPECT(variable != -1, name " was found")
+#endif
+
+ /* Load and check in two phases because the order is unknown. */
+ errno = 0;
+ while ((de = readdir(dir)))
+ {
+ LOAD(".", dot);
+ LOAD("..", dotdot);
+ LOAD("my_subdir", my_subdir);
+ LOAD("my_symlink", my_symlink);
+ LOAD("test.txt", test_txt);
+ }
+ PG_EXPECT_SYS(errno == 0, "ran out of dirents without error");
+
+ CHECK(".", dot, DT_DIR);
+ CHECK("..", dotdot, DT_DIR);
+ CHECK("my_subdir", my_subdir, DT_DIR);
+ CHECK("my_symlink", my_symlink, DT_LNK);
+ CHECK("test.txt", test_txt, DT_REG);
+
+#undef LOAD
+#undef CHECK
+ }
+
+ /*
+ * On older Windows, readdir() sees entries that are in
+ * STATUS_DELETE_PENDING state, and they prevent the directory itself from
+ * being unlinked. Demonstrate that difference here.
+ */
+ make_path(path, "dir2");
+ PG_REQUIRE_SYS(mkdir(path, 0777) == 0);
+ make_path(path, "dir2/test.txt");
+ fd = open(path, O_CREAT | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file");
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink file while still open");
+ if (have_posix_unlink_semantics)
+ {
+ PG_EXPECT(!directory_contains("dir2", "test.txt"), "POSIX: readdir doesn't see it before closing");
+ }
+ else
+ {
+ /*
+ * This test fails on Windows SMB filesystems (AKA network drives):
+ * test.txt is not visible to our readdir() wrapper, because SMB seems
+ * to hide STATUS_DELETE_PENDING files.
+ */
+ PG_EXPECT(directory_contains("dir2", "test.txt"), "Windows non-POSIX: readdir still sees it before closing");
+ }
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+ PG_EXPECT(!directory_contains("dir2", "test.txt"), "readdir doesn't see it after closing");
+}
+
+/*
+ * Tests for pread, pwrite, and vector variants.
+ */
+static void
+filesystem_io_tests(void)
+{
+ int fd;
+ char path[MAXPGPATH];
+ char buffer[80];
+ char buffer2[80];
+ struct iovec iov[8];
+
+ /*
+ * These are our own wrappers on Windows. On Unix, they're just macros
+ * for system APIs, except on Solaris which shares the pg_{read,write}v
+ * replacements used on Windows.
+ */
+
+ make_path(path, "io_test_file.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "create a new file");
+ PG_REQUIRE(fd >= 0);
+
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "initial file position is zero");
+
+ PG_EXPECT_EQ(pg_pwrite(fd, "llo", 3, 2), 3);
+ PG_EXPECT_EQ(pg_pwrite(fd, "he", 2, 0), 2);
+
+ /*
+ * On Windows, the current position moves, which is why we have the pg_
+ * prefixes as a warning to users. Demonstrate that difference here.
+ */
+#ifdef WIN32
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 2, "Windows: file position moved");
+#else
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "POSIX: file position didn't move");
+#endif
+
+ memset(buffer, 0, sizeof(buffer));
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 2, 3), 2);
+ PG_EXPECT(memcmp(buffer, "lo", 2) == 0);
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 3, 0), 3);
+ PG_EXPECT(memcmp(buffer, "hel", 3) == 0);
+
+#ifdef WIN32
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 3, "Windows: file position moved");
+#else
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "POSIX: file position didn't move");
+#endif
+
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 80, 0), 5, "pg_pread short read");
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 80, 5), 0, "pg_pread EOF");
+
+ iov[0].iov_base = "wo";
+ iov[0].iov_len = 2;
+ iov[1].iov_base = "rld";
+ iov[1].iov_len = 3;
+ PG_EXPECT_EQ(pg_pwritev(fd, iov, 2, 5), 5);
+
+#ifdef WIN32
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 10, "Windows: file position moved");
+#else
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "POSIX: file position didn't move");
+#endif
+
+ memset(buffer, 0, sizeof(buffer));
+ memset(buffer2, 0, sizeof(buffer2));
+ iov[0].iov_base = buffer;
+ iov[0].iov_len = 4;
+ iov[1].iov_base = buffer2;
+ iov[1].iov_len = 4;
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 1), 8);
+
+#ifdef WIN32
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 9, "Windows: file position moved");
+#else
+ PG_EXPECT_EQ(lseek(fd, 0, SEEK_CUR), 0, "POSIX: file position didn't move");
+#endif
+
+ PG_EXPECT(memcmp(buffer, "ello", 4) == 0);
+ PG_EXPECT(memcmp(buffer2, "worl", 4) == 0);
+
+ memset(buffer, 0, sizeof(buffer));
+ memset(buffer2, 0, sizeof(buffer2));
+ iov[0].iov_base = buffer;
+ iov[0].iov_len = 1;
+ iov[1].iov_base = buffer2;
+ iov[1].iov_len = 80;
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 8), 2);
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 9), 1);
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 10), 0);
+ PG_EXPECT_EQ(pg_preadv(fd, iov, 2, 11), 0);
+
+ memset(buffer, 0, sizeof(buffer));
+ PG_EXPECT_EQ(pg_pread(fd, buffer, 10, 0), 10);
+ PG_EXPECT_EQ_STR(buffer, "helloworld");
+
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* Demonstrate the effects of "text" mode (no PG_BINARY flag). */
+
+ /* Write out a message in text mode. */
+ fd = open(path, O_RDWR, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file in text mode");
+ PG_REQUIRE(fd >= 0);
+ PG_EXPECT_EQ(write(fd, "hello world\n", 12), 12);
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
+
+ /* Read it back in binary mode, to reveal the translation. */
+ fd = open(path, O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file in binary mode");
+ PG_REQUIRE(fd >= 0);
+#ifdef WIN32
+ PG_EXPECT_EQ(read(fd, buffer, sizeof(buffer)), 13,
+ "Windows: \\n was translated");
+ PG_EXPECT(memcmp(buffer, "hello world\r\n", 13) == 0,
+ "Windows: \\n was translated to \\r\\n");
+#else
+ PG_EXPECT_EQ(read(fd, buffer, sizeof(buffer)), 12,
+ "POSIX: \\n was not translated");
+ PG_EXPECT(memcmp(buffer, "hello world\n", 12) == 0,
+ "POSIX: \\n is \\n");
+#endif
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* The opposite translation happens in text mode, hiding it. */
+ fd = open(path, O_RDWR, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file in text mode");
+ PG_REQUIRE(fd >= 0);
+ PG_EXPECT_EQ(read(fd, buffer, sizeof(buffer)), 12);
+ PG_EXPECT(memcmp(buffer, "hello world\n", 12) == 0);
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ PG_REQUIRE_SYS(unlink(path) == 0);
+
+ /*
+ * Write out a message in text mode, this time with pg_pwrite(), which
+ * does not suffer from newline translation.
+ */
+ fd = open(path, O_RDWR | O_CREAT, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file in text mode");
+ PG_REQUIRE(fd >= 0);
+ PG_EXPECT_EQ(pg_pwrite(fd, "hello world\n", 12, 0), 12);
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ /* Read it back in binary mode to verify that. */
+ fd = open(path, O_RDWR | PG_BINARY, 0777);
+ PG_EXPECT_SYS(fd >= 0, "open file in binary mode");
+ PG_REQUIRE(fd >= 0);
+ PG_EXPECT_EQ(pg_pread(fd, buffer, sizeof(buffer), 0), 12,
+ "\\n was not translated by pg_pread()");
+ PG_REQUIRE_SYS(close(fd) == 0);
+
+ PG_REQUIRE_SYS(unlink(path) == 0);
+}
+
+/*
+ * Exercise fsync and fdatasync.
+ */
+static void
+filesystem_sync_tests(void)
+{
+ int fd;
+ char path[MAXPGPATH];
+
+ make_path(path, "sync_me.txt");
+ fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
+ PG_REQUIRE_SYS(fd >= 0);
+
+ PG_REQUIRE_SYS(write(fd, "x", 1) == 1);
+ PG_EXPECT_SYS(fsync(fd) == 0);
+
+ PG_REQUIRE_SYS(write(fd, "x", 1) == 1);
+ PG_EXPECT_SYS(fdatasync(fd) == 0);
+
+ PG_REQUIRE_SYS(unlink(path) == 0);
+}
+
+int
+main()
+{
+ PG_BEGIN_TESTS();
+
+ filesystem_metadata_tests();
+ filesystem_io_tests();
+ filesystem_sync_tests();
+
+ PG_END_TESTS();
+}
diff --git a/src/port/t/meson.build b/src/port/t/meson.build
new file mode 100644
index 0000000000..bc5c7fc0b8
--- /dev/null
+++ b/src/port/t/meson.build
@@ -0,0 +1,15 @@
+port_tests = []
+port_tests += executable('001_filesystem',
+ ['001_filesystem.c'],
+ dependencies: [frontend_code],
+ kwargs: test_bin_args,
+)
+
+tests += {
+ 'name': 'port',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'exec_tap': {
+ 'tests': port_tests,
+ },
+}
--
2.35.1
v3-0004-Use-POSIX-semantics-for-unlink-and-rename-on-Wind.patchtext/x-patch; charset=US-ASCII; name=v3-0004-Use-POSIX-semantics-for-unlink-and-rename-on-Wind.patchDownload
From 7f5431d7c586cc970aa2279a1fc1d2938b909253 Mon Sep 17 00:00:00 2001
From: Thomas <thomas.munro@gmail.com>
Date: Mon, 17 Oct 2022 22:41:18 -0700
Subject: [PATCH v3 4/4] Use POSIX semantics for unlink() and rename() on
Windows.
Use SetInformationByHandle() to implement unlink and rename with POSIX
semantics. This should work on all Windows 10-based systems using NTFS.
On ReFS and SMB, it currently falls back to non-POSIX semantics for file
links, so we still maintain tests that demonstrate that behavior.
Author: Victor Spirin <v.spirin@postgrespro.ru>
Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/a529b660-da15-5b62-21a0-9936768210fd%40postgrespro.ru
Discussion: https://postgr.es/m/CA%2BhUKG%2BajSQ_8eu2AogTncOnZ5me2D-Cn66iN_-wZnRjLN%2Bicg%40mail.gmail.com
---
src/port/dirmod.c | 239 +++++++++++++++++++++++++++---------
src/port/t/001_filesystem.c | 35 +++---
2 files changed, 197 insertions(+), 77 deletions(-)
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index 0eb1e9566e..8a0aa32f9d 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -39,15 +39,125 @@
#endif
#endif
-#if defined(WIN32) && !defined(__CYGWIN__)
-#include "port/win32ntdll.h"
-#endif
-
#if defined(WIN32) || defined(__CYGWIN__)
/* Externally visable only to allow testing. */
int pgwin32_dirmod_loops = 100;
+#ifdef WIN32
+
+/*
+ * XXX Because mingw doesn't yet define struct FILE_RENAME_INFO with the Flags
+ * member, we'll define a layout-compatible struct ourselves for now. See:
+ *
+ * https://learn.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-file_rename_info
+ */
+typedef struct XXX_FILE_RENAME_INFO
+{
+ union
+ {
+ BOOLEAN ReplaceIfExists;
+ DWORD Flags;
+ } DUMMYUNIONNAME;
+ HANDLE RootDirectory;
+ DWORD FileNameLength;
+ WCHAR FileName[1];
+} XXX_FILE_RENAME_INFO;
+
+/*
+ * XXX Because mingw seems to believe we need a higher _WIN32_WINNT than the
+ * Windows SDK requires for some of these macros, define them ourselves if
+ * necessary.
+ */
+#ifndef FILE_RENAME_FLAG_REPLACE_IF_EXISTS
+#define FILE_RENAME_FLAG_REPLACE_IF_EXISTS 0x00000001
+#endif
+#ifndef FILE_RENAME_FLAG_POSIX_SEMANTICS
+#define FILE_RENAME_FLAG_POSIX_SEMANTICS 0x00000002
+#endif
+#ifndef FILE_DISPOSITION_DELETE
+#define FILE_DISPOSITION_DELETE 0x00000001
+#endif
+#ifndef FILE_DISPOSITION_POSIX_SEMANTICS
+#define FILE_DISPOSITION_POSIX_SEMANTICS 0x00000002
+#endif
+/* Can't use macro tricks for FILE_INFO_BY_HANDLE_CLASS enumator names. */
+#define XXX_FileDispositionInfoEx 0x15
+#define XXX_FileRenameInfoEx 0x16
+
+/*
+ * A container for FILE_RENAME_INFO that adds trailing space for FileName.
+ */
+typedef struct FILE_RENAME_INFO_EXT
+{
+ XXX_FILE_RENAME_INFO fri;
+ wchar_t extra_space[MAXPGPATH];
+} FILE_RENAME_INFO_EXT;
+
+static int
+pgwin32_posix_rename(const char *from, const char *to)
+{
+ FILE_RENAME_INFO_EXT rename_info = {{.Flags = 0}};
+ HANDLE handle;
+
+ if (MultiByteToWideChar(CP_ACP, 0, to, -1, rename_info.fri.FileName, MAXPGPATH) == 0)
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+ rename_info.fri.Flags = FILE_RENAME_FLAG_POSIX_SEMANTICS | FILE_RENAME_FLAG_REPLACE_IF_EXISTS;
+ rename_info.fri.FileNameLength = wcslen(rename_info.fri.FileName);
+
+ handle = CreateFile(from,
+ GENERIC_READ | GENERIC_WRITE | DELETE,
+ FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+ NULL,
+ OPEN_EXISTING,
+ FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT,
+ NULL);
+ if (handle == INVALID_HANDLE_VALUE)
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+
+ if (!SetFileInformationByHandle(handle,
+ XXX_FileRenameInfoEx,
+ &rename_info,
+ sizeof(FILE_RENAME_INFO_EXT)))
+ {
+ DWORD error = GetLastError();
+
+ /*
+ * ReFS currently fails, so we'll try again without POSIX semantics.
+ * Likewise for SMB, except it helpfully fails with a different more
+ * general error.
+ */
+ if (error == ERROR_NOT_SUPPORTED || error == ERROR_INVALID_PARAMETER)
+ {
+ /* Try the older FileRenameInfo (no "Ex", no Flags). */
+ rename_info.fri.ReplaceIfExists = true;
+ if (!SetFileInformationByHandle(handle, FileRenameInfo, &rename_info,
+ sizeof(FILE_RENAME_INFO_EXT)))
+ {
+ _dosmaperr(GetLastError());
+ CloseHandle(handle);
+ return -1;
+ }
+ }
+ else
+ {
+ _dosmaperr(error);
+ CloseHandle(handle);
+ return -1;
+ }
+ }
+ CloseHandle(handle);
+ return 0;
+}
+
+#endif
+
/*
* pgrename
*/
@@ -64,7 +174,7 @@ pgrename(const char *from, const char *to)
* and blocking other backends.
*/
#if defined(WIN32) && !defined(__CYGWIN__)
- while (!MoveFileEx(from, to, MOVEFILE_REPLACE_EXISTING))
+ while (pgwin32_posix_rename(from, to) < 0)
#else
while (rename(from, to) < 0)
#endif
@@ -98,70 +208,75 @@ pgrename(const char *from, const char *to)
return 0;
}
-/*
- * Check if _pglstat64()'s reason for failure was STATUS_DELETE_PENDING.
- * This doesn't apply to Cygwin, which has its own lstat() that would report
- * the case as EACCES.
-*/
-static bool
-lstat_error_was_status_delete_pending(void)
-{
- if (errno != ENOENT)
- return false;
#if defined(WIN32) && !defined(__CYGWIN__)
- if (pg_RtlGetLastNtStatus() == STATUS_DELETE_PENDING)
- return true;
-#endif
- return false;
+
+static int
+pgwin32_posix_unlink(const char *path)
+{
+ BY_HANDLE_FILE_INFORMATION info;
+ HANDLE handle;
+ ULONG flags;
+
+ flags = FILE_DISPOSITION_DELETE | FILE_DISPOSITION_POSIX_SEMANTICS;
+ handle = CreateFile(path,
+ GENERIC_READ | GENERIC_WRITE | DELETE,
+ FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
+ NULL,
+ OPEN_EXISTING,
+ FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT,
+ NULL);
+ if (handle == INVALID_HANDLE_VALUE)
+ {
+ _dosmaperr(GetLastError());
+ return -1;
+ }
+ if (!GetFileInformationByHandle(handle, &info))
+ {
+ _dosmaperr(GetLastError());
+ CloseHandle(handle);
+ return -1;
+ }
+ /* Let junction points be unlinked this way, but not directories. */
+ if ((info.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) &&
+ !(info.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT))
+ {
+ CloseHandle(handle);
+ errno = EPERM;
+ return -1;
+ }
+ if (!SetFileInformationByHandle(handle,
+ XXX_FileDispositionInfoEx,
+ &flags,
+ sizeof(flags)))
+ {
+ _dosmaperr(GetLastError());
+
+ if (errno == EINVAL)
+ {
+ /*
+ * SMB filesystems fail like this. Fall back to (presumably)
+ * non-POSIX variant via C library.
+ */
+ CloseHandle(handle);
+ return unlink(path);
+ }
+
+ CloseHandle(handle);
+ return -1;
+ }
+ CloseHandle(handle);
+ return 0;
}
+#endif
+
/*
* pgunlink
*/
int
pgunlink(const char *path)
{
- bool is_lnk;
int loops = 0;
- struct stat st;
-
- /*
- * This function might be called for a regular file or for a junction
- * point (which we use to emulate symlinks). The latter must be unlinked
- * with rmdir() on Windows. Before we worry about any of that, let's see
- * if we can unlink directly, since that's expected to be the most common
- * case.
- */
- if (unlink(path) == 0)
- return 0;
- if (errno != EACCES)
- return -1;
-
- /*
- * EACCES is reported for many reasons including unlink() of a junction
- * point. Check if that's the case so we can redirect to rmdir().
- *
- * Note that by checking only once, we can't cope with a path that changes
- * from regular file to junction point underneath us while we're retrying
- * due to sharing violations, but that seems unlikely. We could perhaps
- * prevent that by holding a file handle ourselves across the lstat() and
- * the retry loop, but that seems like over-engineering for now.
- *
- * In the special case of a STATUS_DELETE_PENDING error (file already
- * unlinked, but someone still has it open), we don't want to report ENOENT
- * to the caller immediately, because rmdir(parent) would probably fail.
- * We want to wait until the file truly goes away so that simple recursive
- * directory unlink algorithms work.
- */
- if (lstat(path, &st) < 0)
- {
- if (lstat_error_was_status_delete_pending())
- is_lnk = false;
- else
- return -1;
- }
- else
- is_lnk = S_ISLNK(st.st_mode);
/*
* We need to loop because even though PostgreSQL uses flags that allow
@@ -170,7 +285,11 @@ pgunlink(const char *path)
* someone else to close the file, as the caller might be holding locks
* and blocking other backends.
*/
- while ((is_lnk ? rmdir(path) : unlink(path)) < 0)
+#ifdef WIN32
+ while (pgwin32_posix_unlink(path) < 0)
+#else
+ while (unlink(path) < 0)
+#endif
{
if (errno != EACCES)
return -1;
diff --git a/src/port/t/001_filesystem.c b/src/port/t/001_filesystem.c
index 99bfd6bdd5..fd842c55d1 100644
--- a/src/port/t/001_filesystem.c
+++ b/src/port/t/001_filesystem.c
@@ -3,7 +3,7 @@
* Windows.
*
* Currently, have_posix_unlink_semantics is expected to be true on all Unix
- * systems and some Windows 10-based operatings using NTFS, and false on other
+ * systems and all Windows 10-based operatings using NTFS, and false on other
* Windows ReFS and SMB filesystems.
*/
@@ -442,10 +442,10 @@ filesystem_metadata_tests(void)
* Linux), though AIX/JFS1 is rumored to succeed. However, our Windows
* emulation doesn't allow it, because we want to avoid surprises by
* behaving like nearly all Unix systems. So we check this on Windows
- * only, where it fails with non-standard EACCES.
+ * only, where our wrapper fails with EPERM.
*/
PG_EXPECT_SYS(unlink(path2) == -1, "Windows: can't unlink() a directory");
- PG_EXPECT_EQ(errno, EACCES);
+ PG_EXPECT_EQ(errno, EPERM);
#endif
#ifdef WIN32
@@ -539,22 +539,23 @@ filesystem_metadata_tests(void)
fd = open(path2, O_RDWR | PG_BINARY, 0777);
PG_EXPECT_SYS(fd >= 0, "open name2.txt");
make_path(path2, "name2.txt");
-#ifdef WIN32
- /*
- * Windows can't rename over an open non-unlinked file, even with
- * have_posix_unlink_semantics.
- */
- pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
- PG_EXPECT_SYS(rename(path, path2) == -1,
- "Windows: can't rename name1.txt -> name2.txt while name2.txt is open");
- PG_EXPECT_EQ(errno, EACCES);
- PG_EXPECT_SYS(unlink(path) == 0, "unlink name1.txt");
-#else
- PG_EXPECT_SYS(rename(path, path2) == 0,
- "POSIX: can rename name1.txt -> name2.txt while name2.txt is open");
+ if (!have_posix_unlink_semantics)
+ {
+#ifdef WIN32
+ pgwin32_dirmod_loops = 2; /* minimize looping to fail fast in testing */
#endif
- PG_EXPECT_SYS(close(fd) == 0);
+ PG_EXPECT_SYS(rename(path, path2) == -1,
+ "Windows non-POSIX: can't rename name1.txt -> name2.txt while name2.txt is open");
+ PG_EXPECT_EQ(errno, EACCES);
+ PG_EXPECT_SYS(unlink(path) == 0, "unlink name1.txt");
+ }
+ else
+ {
+ PG_EXPECT_SYS(rename(path, path2) == 0,
+ "POSIX: can rename name1.txt -> name2.txt while name2.txt is open");
+ }
+ PG_REQUIRE_SYS(fd < 0 || close(fd) == 0);
make_path(path, "name1.txt");
fd = open(path, O_CREAT | O_EXCL | O_RDWR | PG_BINARY, 0777);
--
2.35.1
2022年10月25日(火) 13:12 Thomas Munro <thomas.munro@gmail.com>:
Show quoted text
I pushed the bug fixes from this series, without their accompanying
tests. Here's a rebase of the test suite, with all those tests now
squashed into the main test patch, and also the
tell-Windows-to-be-more-like-Unix patch. Registered in the
commitfest.
2022年10月25日(火) 13:12 Thomas Munro <thomas.munro@gmail.com>:
I pushed the bug fixes from this series, without their accompanying
tests. Here's a rebase of the test suite, with all those tests now
squashed into the main test patch, and also the
tell-Windows-to-be-more-like-Unix patch. Registered in the
commitfest.
For reference: https://commitfest.postgresql.org/40/3951/
For my understanding, does this entry supersede the proposal in
https://commitfest.postgresql.org/40/3347/ ?
(Apologies for the previous mail with no additional content).
Regards
Ian Barwick
On Mon, Nov 28, 2022 at 8:58 PM Ian Lawrence Barwick <barwick@gmail.com> wrote:
For my understanding, does this entry supersede the proposal in
https://commitfest.postgresql.org/40/3347/ ?
I think so (Victor hasn't commented). Patch 0004 derives from
Victor's patch and has him as primary author still, but I made some
changes:
* remove obsolete version check code
* provide fallback code for systems where it doesn't work (after some
research to determine that there are such systems, and what they do)
* test that it's really more POSIX-like and demonstrate what that
means (building on 0003)
Patch 0003 is a set of file system semantics tests that work on Unix,
but also exercise those src/port/*.c wrappers on Windows and show
differences from Unix semantics. Some of these tests also verify
various bugfixes already committed, so they've been pretty useful to
me already even though they aren't in the tree yet.
Patches 0001 and 0002 are generic, unrelated to this Windows stuff,
and provide a simple way to write unit tests for small bits of C code
without a whole PostgreSQL server. That's something that has been
proposed in the abstract many times before by many people. Here I've
tried to be minimalist about it, just what I needed for the
higher-numbered patches, building on existing technologies (TAP).
On Tue, 25 Oct 2022 at 09:42, Thomas Munro <thomas.munro@gmail.com> wrote:
I pushed the bug fixes from this series, without their accompanying
tests. Here's a rebase of the test suite, with all those tests now
squashed into the main test patch, and also the
tell-Windows-to-be-more-like-Unix patch. Registered in the
commitfest.
The patch does not apply on top of HEAD as in [1]http://cfbot.cputube.org/patch_41_3951.log, please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
5212d447fa53518458cbe609092b347803a667c5 ===
=== applying patch
./v3-0002-meson-Add-infrastructure-for-TAP-tests-written-in.patch
patching file meson.build
Hunk #5 FAILED at 3000.
Hunk #6 FAILED at 3035.
2 out of 6 hunks FAILED -- saving rejects to file meson.build.rej
[1]: http://cfbot.cputube.org/patch_41_3951.log
Regards,
Vignesh
On Thu, Jan 5, 2023 at 1:06 AM vignesh C <vignesh21@gmail.com> wrote:
On Tue, 25 Oct 2022 at 09:42, Thomas Munro <thomas.munro@gmail.com> wrote:
I pushed the bug fixes from this series, without their accompanying
tests. Here's a rebase of the test suite, with all those tests now
squashed into the main test patch, and also the
tell-Windows-to-be-more-like-Unix patch. Registered in the
commitfest.The patch does not apply ...
I think this exercise was (if I say so myself) quite useful, to
understand the Windows file system landscape. Maybe the things we
figured out by testing are common knowledge to real Windows
programmers, I dunno, but they were certainly all news to me and not
documented anywhere I could find, and the knowledge and tests will
probably help in future battles against Windows. The most important
things discovered were:
1. If you're testing on a Windows VM or laptop running 10 or 11 *you
aren't seeing the same behaviour as Windows Server*. So the semantics
don't match real production PostgreSQL deployments.
2. If we decided to turn on the new POSIX unlink semantics
explicitly as originally proposed by Victor, we'd get the behaviour we
really want on NTFS on all known Windows versions. But that would
move the traditional behaviour into a blind spot that we have no
testing for: ReFS and SMB. Our tree would probably gain more stuff
that doesn't work on them, so that would be tantamount to dropping
support.
Therefore, with regret, I'm going to withdraw this for now. We'd need
to get CI testing for ReFS and/or SMB first, which could be arranged,
but even then, what is the point of POSIX semantics if you don't have
them everywhere? You can't even remove any code! Unless we could
reach consensus that "PostgreSQL is not supported on SMB or ReFS until
they gain POSIX semantics" [which may never happen for all we know],
and then commit this patch and forget about non-POSIX unlink semantics
forever. I don't see us doing that in a hurry. So there's not much
hope for this idea in this commitfest.
The little C TAP framework could definitely be useful as a starting
point for something else, and the FS semantics test will definitely
come in handy if this topic is reopened by some of those potential
actions or needed to debug existing behaviour, and then I might even
re-propose parts of it, but it's all here in the archives anyway.