pg_preadv() and pg_pwritev()

Started by Thomas Munroabout 5 years ago39 messages
#1Thomas Munro
thomas.munro@gmail.com
1 attachment(s)

Hello hackers,

I want to be able to do synchronous vectored file I/O, so I made
wrapper macros for preadv() and pwritev() with fallbacks for systems
that don't have them. Following the precedent of the pg_pread() and
pg_pwrite() macros, the "pg_" prefix reflects a subtle contract
change: the fallback paths might have the side effect of changing the
file position.

They're non-standard system calls, but the BSDs and Linux have had
them for a long time, and for other systems we can use POSIX
readv()/writev() with an additional lseek(). The worst case is
Windows (and maybe our favourite antique Unix build farm animal?)
which has none of those things, so there is a further fallback to a
loop. Windows does have ReadFileScatter() and WriteFileGather(), but
those only work for overlapped (= asynchronous), unbuffered, page
aligned access. They'll very likely be useful for native AIO+DIO
support in the future, but don't fit the bill here.

This is part of a project to consolidate and offload I/O (about which
more soon), but seemed isolated enough to post separately and I guess
it could be independently useful.

Attachments:

0007-Add-pg_preadv-and-pg_pwritev.patchapplication/x-patch; name=0007-Add-pg_preadv-and-pg_pwritev.patchDownload
From 4f4e26d0e2c023d62865365e3cb5ce1c89e63cb0 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Thu, 26 Nov 2020 15:48:31 +1300
Subject: [PATCH 07/11] Add pg_preadv() and pg_pwritev().

Provide synchronous scatter/gather I/O routines.  These map to preadv(),
pwritev() with various fallbacks for systems that don't have them.
---
 configure                  | 30 ++--------------------------
 configure.ac               |  9 +++++++--
 src/include/pg_config.h.in | 15 ++++++++++++++
 src/include/port.h         | 37 ++++++++++++++++++++++++++++++++++
 src/port/Makefile          |  2 ++
 src/port/pread.c           | 41 ++++++++++++++++++++++++++++++++++++--
 src/port/pwrite.c          | 41 ++++++++++++++++++++++++++++++++++++--
 src/tools/msvc/Solution.pm |  5 +++++
 8 files changed, 146 insertions(+), 34 deletions(-)

diff --git a/configure b/configure
index cc2089e596..d25b8f2900 100755
--- a/configure
+++ b/configure
@@ -13199,7 +13199,7 @@ $as_echo "#define HAVE_STDBOOL_H 1" >>confdefs.h
 fi
 
 
-for ac_header in atomic.h copyfile.h execinfo.h getopt.h ifaddrs.h langinfo.h mbarrier.h poll.h sys/epoll.h sys/event.h sys/ipc.h sys/prctl.h sys/procctl.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/sockio.h sys/tas.h sys/un.h termios.h ucred.h wctype.h
+for ac_header in atomic.h copyfile.h execinfo.h getopt.h ifaddrs.h langinfo.h mbarrier.h poll.h sys/epoll.h sys/event.h sys/ipc.h sys/prctl.h sys/procctl.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/sockio.h sys/tas.h sys/uio.h sys/un.h termios.h ucred.h wctype.h
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default"
@@ -15293,7 +15293,7 @@ fi
 LIBS_including_readline="$LIBS"
 LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
 
-for ac_func in backtrace_symbols clock_gettime copyfile fdatasync getifaddrs getpeerucred getrlimit kqueue mbstowcs_l memset_s poll posix_fallocate ppoll pstat pthread_is_threaded_np readlink setproctitle setproctitle_fast setsid shm_open strchrnul strsignal symlink sync_file_range uselocale wcstombs_l
+for ac_func in backtrace_symbols clock_gettime copyfile fdatasync getifaddrs getpeerucred getrlimit kqueue mbstowcs_l memset_s poll posix_fallocate ppoll pread preadv pstat pthread_is_threaded_np pwrite pwritev readlink readv setproctitle setproctitle_fast setsid shm_open strchrnul strsignal symlink sync_file_range uselocale wcstombs_l writev
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
@@ -15970,32 +15970,6 @@ esac
 
 fi
 
-ac_fn_c_check_func "$LINENO" "pread" "ac_cv_func_pread"
-if test "x$ac_cv_func_pread" = xyes; then :
-  $as_echo "#define HAVE_PREAD 1" >>confdefs.h
-
-else
-  case " $LIBOBJS " in
-  *" pread.$ac_objext "* ) ;;
-  *) LIBOBJS="$LIBOBJS pread.$ac_objext"
- ;;
-esac
-
-fi
-
-ac_fn_c_check_func "$LINENO" "pwrite" "ac_cv_func_pwrite"
-if test "x$ac_cv_func_pwrite" = xyes; then :
-  $as_echo "#define HAVE_PWRITE 1" >>confdefs.h
-
-else
-  case " $LIBOBJS " in
-  *" pwrite.$ac_objext "* ) ;;
-  *) LIBOBJS="$LIBOBJS pwrite.$ac_objext"
- ;;
-esac
-
-fi
-
 ac_fn_c_check_func "$LINENO" "random" "ac_cv_func_random"
 if test "x$ac_cv_func_random" = xyes; then :
   $as_echo "#define HAVE_RANDOM 1" >>confdefs.h
diff --git a/configure.ac b/configure.ac
index c819aeab26..9c21e46ec8 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1342,6 +1342,7 @@ AC_CHECK_HEADERS(m4_normalize([
 	sys/shm.h
 	sys/sockio.h
 	sys/tas.h
+	sys/uio.h
 	sys/un.h
 	termios.h
 	ucred.h
@@ -1671,9 +1672,14 @@ AC_CHECK_FUNCS(m4_normalize([
 	poll
 	posix_fallocate
 	ppoll
+	pread
+	preadv
 	pstat
 	pthread_is_threaded_np
+	pwrite
+	pwritev
 	readlink
+	readv
 	setproctitle
 	setproctitle_fast
 	setsid
@@ -1684,6 +1690,7 @@ AC_CHECK_FUNCS(m4_normalize([
 	sync_file_range
 	uselocale
 	wcstombs_l
+	writev
 ]))
 
 # These typically are compiler builtins, for which AC_CHECK_FUNCS fails.
@@ -1744,8 +1751,6 @@ AC_REPLACE_FUNCS(m4_normalize([
 	inet_aton
 	link
 	mkdtemp
-	pread
-	pwrite
 	random
 	srandom
 	strlcat
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 465df421b0..18a68c21ac 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -412,6 +412,9 @@
 /* Define to 1 if you have the `pread' function. */
 #undef HAVE_PREAD
 
+/* Define to 1 if you have the `preadv' function. */
+#undef HAVE_PREADV
+
 /* Define to 1 if you have the `pstat' function. */
 #undef HAVE_PSTAT
 
@@ -430,6 +433,9 @@
 /* Define to 1 if you have the `pwrite' function. */
 #undef HAVE_PWRITE
 
+/* Define to 1 if you have the `pwritev' function. */
+#undef HAVE_PWRITEV
+
 /* Define to 1 if you have the `random' function. */
 #undef HAVE_RANDOM
 
@@ -445,6 +451,9 @@
 /* Define to 1 if you have the `readlink' function. */
 #undef HAVE_READLINK
 
+/* Define to 1 if you have the `readv' function. */
+#undef HAVE_READV
+
 /* Define to 1 if you have the global variable
    'rl_completion_append_character'. */
 #undef HAVE_RL_COMPLETION_APPEND_CHARACTER
@@ -626,6 +635,9 @@
 /* Define to 1 if you have the <sys/ucred.h> header file. */
 #undef HAVE_SYS_UCRED_H
 
+/* Define to 1 if you have the <sys/uio.h> header file. */
+#undef HAVE_SYS_UIO_H
+
 /* Define to 1 if you have the <sys/un.h> header file. */
 #undef HAVE_SYS_UN_H
 
@@ -680,6 +692,9 @@
 /* Define to 1 if you have the <winldap.h> header file. */
 #undef HAVE_WINLDAP_H
 
+/* Define to 1 if you have the `writev' function. */
+#undef HAVE_WRITEV
+
 /* Define to 1 if you have the `X509_get_signature_nid' function. */
 #undef HAVE_X509_GET_SIGNATURE_NID
 
diff --git a/src/include/port.h b/src/include/port.h
index 5dfb00b07c..70cd1d5287 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -17,6 +17,10 @@
 #include <netdb.h>
 #include <pwd.h>
 
+#ifdef HAVE_SYS_UIO_H
+#include <sys/uio.h>
+#endif
+
 /*
  * Windows has enough specialized port stuff that we push most of it off
  * into another file.
@@ -431,6 +435,39 @@ extern ssize_t pg_pread(int fd, void *buf, size_t nbyte, off_t offset);
 extern ssize_t pg_pwrite(int fd, const void *buf, size_t nbyte, off_t offset);
 #endif
 
+/*
+ * Likewise for the non-standard but common extensions preadv(2) and
+ * pwritev(2).  If sys/uio.h is also missing, define a POSIX-compatible iovec
+ * struct here.
+ */
+#ifndef HAVE_SYS_UIO_H
+struct iovec
+{
+	void	   *iov_base;
+	size_t		iov_len;
+};
+#endif
+
+/*
+ * If <limits.h> didn't define IOV_MAX, we'll define our own for the purposes
+ * of our replacement implementations.
+ */
+#ifndef IOV_MAX
+#define IOV_MAX 16
+#endif
+
+#ifdef HAVE_PREADV
+#define pg_preadv preadv
+#else
+extern ssize_t pg_preadv(int fd, const struct iovec *iov, int iovcnt, off_t offset);
+#endif
+
+#ifdef HAVE_PWRITEV
+#define pg_pwritev pwritev
+#else
+extern ssize_t pg_pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset);
+#endif
+
 #if !HAVE_DECL_STRLCAT
 extern size_t strlcat(char *dst, const char *src, size_t siz);
 #endif
diff --git a/src/port/Makefile b/src/port/Makefile
index e41b005c4f..bc4923ce84 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -53,6 +53,8 @@ OBJS = \
 	pgstrcasecmp.o \
 	pgstrsignal.o \
 	pqsignal.o \
+	pread.o \
+	pwrite.o \
 	qsort.o \
 	qsort_arg.o \
 	quotes.o \
diff --git a/src/port/pread.c b/src/port/pread.c
index e7fecebe5f..4704ebbe7f 100644
--- a/src/port/pread.c
+++ b/src/port/pread.c
@@ -1,7 +1,7 @@
 /*-------------------------------------------------------------------------
  *
  * pread.c
- *	  Implementation of pread(2) for platforms that lack one.
+ *	  Implementation of pread[v](2) for platforms that lack one.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  *
@@ -9,7 +9,8 @@
  *	  src/port/pread.c
  *
  * Note that this implementation changes the current file position, unlike
- * the POSIX function, so we use the name pg_pread().
+ * the POSIX function, so we use the name pg_pread().  Likewise for the
+ * iovec version.
  *
  *-------------------------------------------------------------------------
  */
@@ -23,6 +24,7 @@
 #include <unistd.h>
 #endif
 
+#ifndef HAVE_PREAD
 ssize_t
 pg_pread(int fd, void *buf, size_t size, off_t offset)
 {
@@ -56,3 +58,38 @@ pg_pread(int fd, void *buf, size_t size, off_t offset)
 	return read(fd, buf, size);
 #endif
 }
+#endif
+
+#ifndef HAVE_PREADV
+ssize_t
+pg_preadv(int fd, const struct iovec *iov, int iovcnt, off_t offset)
+{
+#ifdef HAVE_READV
+	if (iovcnt == 1)
+		return pg_pread(fd, iov[0].iov_base, iov[0].iov_len, offset);
+	if (lseek(fd, offset, SEEK_SET) < 0)
+		return -1;
+	return readv(fd, iov, iovcnt);
+#else
+	ssize_t 	sum = 0;
+	ssize_t 	part;
+
+	for (int i = 0; i < iovcnt; ++i)
+	{
+		part = pg_pread(fd, iov[i].iov_base, iov[i].iov_len, offset);
+		if (part < 0)
+		{
+			if (i == 0)
+				return -1;
+			else
+				return sum;
+		}
+		sum += part;
+		offset += part;
+		if (part < iov[i].iov_len)
+			return sum;
+	}
+	return sum;
+#endif
+}
+#endif
diff --git a/src/port/pwrite.c b/src/port/pwrite.c
index 2e0c154462..ed9897e882 100644
--- a/src/port/pwrite.c
+++ b/src/port/pwrite.c
@@ -1,7 +1,7 @@
 /*-------------------------------------------------------------------------
  *
  * pwrite.c
- *	  Implementation of pwrite(2) for platforms that lack one.
+ *	  Implementation of pwrite[v](2) for platforms that lack one.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  *
@@ -9,7 +9,8 @@
  *	  src/port/pwrite.c
  *
  * Note that this implementation changes the current file position, unlike
- * the POSIX function, so we use the name pg_pwrite().
+ * the POSIX function, so we use the name pg_pwrite().  Likewise for the
+ * iovec version.
  *
  *-------------------------------------------------------------------------
  */
@@ -23,6 +24,7 @@
 #include <unistd.h>
 #endif
 
+#ifndef HAVE_PWRITE
 ssize_t
 pg_pwrite(int fd, const void *buf, size_t size, off_t offset)
 {
@@ -53,3 +55,38 @@ pg_pwrite(int fd, const void *buf, size_t size, off_t offset)
 	return write(fd, buf, size);
 #endif
 }
+#endif
+
+#ifndef HAVE_PWRITEV
+ssize_t
+pg_pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset)
+{
+#ifdef HAVE_WRITEV
+	if (iovcnt == 1)
+		return pg_pwrite(fd, iov[0].iov_base, iov[0].iov_len, offset);
+	if (lseek(fd, offset, SEEK_SET) < 0)
+		return -1;
+	return writev(fd, iov, iovcnt);
+#else
+	ssize_t 	sum = 0;
+	ssize_t 	part;
+
+	for (int i = 0; i < iovcnt; ++i)
+	{
+		part = pg_pwrite(fd, iov[i].iov_base, iov[i].iov_len, offset);
+		if (part < 0)
+		{
+			if (i == 0)
+				return -1;
+			else
+				return sum;
+		}
+		sum += part;
+		offset += part;
+		if (part < iov[i].iov_len)
+			return sum;
+	}
+	return sum;
+#endif
+}
+#endif
diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index 22d6abd367..832a363282 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -329,17 +329,20 @@ sub GenerateFiles
 		HAVE_PPC_LWARX_MUTEX_HINT   => undef,
 		HAVE_PPOLL                  => undef,
 		HAVE_PREAD                  => undef,
+		HAVE_PREADV                 => undef,
 		HAVE_PSTAT                  => undef,
 		HAVE_PS_STRINGS             => undef,
 		HAVE_PTHREAD                => undef,
 		HAVE_PTHREAD_IS_THREADED_NP => undef,
 		HAVE_PTHREAD_PRIO_INHERIT   => undef,
 		HAVE_PWRITE                 => undef,
+		HAVE_PWRITEV                => undef,
 		HAVE_RANDOM                 => undef,
 		HAVE_READLINE_H             => undef,
 		HAVE_READLINE_HISTORY_H     => undef,
 		HAVE_READLINE_READLINE_H    => undef,
 		HAVE_READLINK               => undef,
+		HAVE_READV                  => undef,
 		HAVE_RL_COMPLETION_APPEND_CHARACTER      => undef,
 		HAVE_RL_COMPLETION_MATCHES               => undef,
 		HAVE_RL_COMPLETION_SUPPRESS_QUOTE        => undef,
@@ -400,6 +403,7 @@ sub GenerateFiles
 		HAVE_SYS_TYPES_H                         => 1,
 		HAVE_SYS_UCRED_H                         => undef,
 		HAVE_SYS_UN_H                            => undef,
+		HAVE_SYS_UIO_H                           => undef,
 		HAVE_TERMIOS_H                           => undef,
 		HAVE_TYPEOF                              => undef,
 		HAVE_UCRED_H                             => undef,
@@ -417,6 +421,7 @@ sub GenerateFiles
 		HAVE_WINLDAP_H                           => undef,
 		HAVE_WCSTOMBS_L                          => 1,
 		HAVE_WCTYPE_H                            => 1,
+		HAVE_WRITEV                              => undef,
 		HAVE_X509_GET_SIGNATURE_NID              => 1,
 		HAVE_X86_64_POPCNTQ                      => undef,
 		HAVE__BOOL                               => undef,
-- 
2.20.1

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#1)
Re: pg_preadv() and pg_pwritev()

Thomas Munro <thomas.munro@gmail.com> writes:

I want to be able to do synchronous vectored file I/O, so I made
wrapper macros for preadv() and pwritev() with fallbacks for systems
that don't have them. Following the precedent of the pg_pread() and
pg_pwrite() macros, the "pg_" prefix reflects a subtle contract
change: the fallback paths might have the side effect of changing the
file position.

In a quick look, seems OK with some nits:

1. port.h cannot assume that <limits.h> has already been included;
nor do I want to fix that by including <limits.h> there. Do we
really need to define a fallback value of IOV_MAX? If so,
maybe the answer is to put the replacement struct iovec and
IOV_MAX in some new header.

2. I'm not really that happy about loading <sys/uio.h> into
every compilation we do, which would be another reason for a
new specialized header that either includes <sys/uio.h> or
provides fallback definitions.

3. The patch as given won't prove anything except that the code
compiles. Is it worth fixing at least one code path to make
use of pg_preadv and pg_pwritev, so we can make sure this code
is tested before there's layers of other new code on top?

regards, tom lane

#3Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#2)
2 attachment(s)
Re: pg_preadv() and pg_pwritev()

On Sun, Dec 20, 2020 at 12:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

I want to be able to do synchronous vectored file I/O, so I made
wrapper macros for preadv() and pwritev() with fallbacks for systems
that don't have them. Following the precedent of the pg_pread() and
pg_pwrite() macros, the "pg_" prefix reflects a subtle contract
change: the fallback paths might have the side effect of changing the
file position.

In a quick look, seems OK with some nits:

Thanks for looking!

1. port.h cannot assume that <limits.h> has already been included;
nor do I want to fix that by including <limits.h> there. Do we
really need to define a fallback value of IOV_MAX? If so,
maybe the answer is to put the replacement struct iovec and
IOV_MAX in some new header.

Ok, I moved all this stuff into port/pg_uio.h.

2. I'm not really that happy about loading <sys/uio.h> into
every compilation we do, which would be another reason for a
new specialized header that either includes <sys/uio.h> or
provides fallback definitions.

Ack.

3. The patch as given won't prove anything except that the code
compiles. Is it worth fixing at least one code path to make
use of pg_preadv and pg_pwritev, so we can make sure this code
is tested before there's layers of other new code on top?

OK, here's a patch to zero-fill fresh WAL segments with pwritev().

I'm drawing a blank on trivial candidate uses for preadv(), without
infrastructure from later patches.

Attachments:

v2-0001-Add-pg_preadv-and-pg_pwritev.patchapplication/x-patch; name=v2-0001-Add-pg_preadv-and-pg_pwritev.patchDownload
From d7178f296642e177bc57fabe93abec395cbaac5f Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Thu, 26 Nov 2020 15:48:31 +1300
Subject: [PATCH v2 1/2] Add pg_preadv() and pg_pwritev().

Provide synchronous scatter/gather I/O routines.  These map to preadv(),
pwritev() with various fallbacks for systems that don't have them.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGJA%2Bu-220VONeoREBXJ9P3S94Y7J%2BkqCnTYmahvZJwM%3Dg%40mail.gmail.com
---
 configure                  | 30 ++--------------------
 configure.ac               |  9 +++++--
 src/include/pg_config.h.in | 15 +++++++++++
 src/include/port.h         |  2 ++
 src/include/port/pg_uio.h  | 51 ++++++++++++++++++++++++++++++++++++++
 src/port/Makefile          |  2 ++
 src/port/pread.c           | 43 ++++++++++++++++++++++++++++++--
 src/port/pwrite.c          | 43 ++++++++++++++++++++++++++++++--
 src/tools/msvc/Solution.pm |  5 ++++
 9 files changed, 166 insertions(+), 34 deletions(-)
 create mode 100644 src/include/port/pg_uio.h

diff --git a/configure b/configure
index 11a4284e5b..5887c712cc 100755
--- a/configure
+++ b/configure
@@ -13061,7 +13061,7 @@ $as_echo "#define HAVE_STDBOOL_H 1" >>confdefs.h
 fi
 
 
-for ac_header in atomic.h copyfile.h execinfo.h getopt.h ifaddrs.h langinfo.h mbarrier.h poll.h sys/epoll.h sys/event.h sys/ipc.h sys/prctl.h sys/procctl.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/sockio.h sys/tas.h sys/un.h termios.h ucred.h wctype.h
+for ac_header in atomic.h copyfile.h execinfo.h getopt.h ifaddrs.h langinfo.h mbarrier.h poll.h sys/epoll.h sys/event.h sys/ipc.h sys/prctl.h sys/procctl.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/sockio.h sys/tas.h sys/uio.h sys/un.h termios.h ucred.h wctype.h
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default"
@@ -15155,7 +15155,7 @@ fi
 LIBS_including_readline="$LIBS"
 LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
 
-for ac_func in backtrace_symbols clock_gettime copyfile fdatasync getifaddrs getpeerucred getrlimit kqueue mbstowcs_l memset_s poll posix_fallocate ppoll pstat pthread_is_threaded_np readlink setproctitle setproctitle_fast setsid shm_open strchrnul strsignal symlink sync_file_range uselocale wcstombs_l
+for ac_func in backtrace_symbols clock_gettime copyfile fdatasync getifaddrs getpeerucred getrlimit kqueue mbstowcs_l memset_s poll posix_fallocate ppoll pread preadv pstat pthread_is_threaded_np pwrite pwritev readlink readv setproctitle setproctitle_fast setsid shm_open strchrnul strsignal symlink sync_file_range uselocale wcstombs_l writev
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
@@ -15832,32 +15832,6 @@ esac
 
 fi
 
-ac_fn_c_check_func "$LINENO" "pread" "ac_cv_func_pread"
-if test "x$ac_cv_func_pread" = xyes; then :
-  $as_echo "#define HAVE_PREAD 1" >>confdefs.h
-
-else
-  case " $LIBOBJS " in
-  *" pread.$ac_objext "* ) ;;
-  *) LIBOBJS="$LIBOBJS pread.$ac_objext"
- ;;
-esac
-
-fi
-
-ac_fn_c_check_func "$LINENO" "pwrite" "ac_cv_func_pwrite"
-if test "x$ac_cv_func_pwrite" = xyes; then :
-  $as_echo "#define HAVE_PWRITE 1" >>confdefs.h
-
-else
-  case " $LIBOBJS " in
-  *" pwrite.$ac_objext "* ) ;;
-  *) LIBOBJS="$LIBOBJS pwrite.$ac_objext"
- ;;
-esac
-
-fi
-
 ac_fn_c_check_func "$LINENO" "random" "ac_cv_func_random"
 if test "x$ac_cv_func_random" = xyes; then :
   $as_echo "#define HAVE_RANDOM 1" >>confdefs.h
diff --git a/configure.ac b/configure.ac
index fc523c6aeb..c930463225 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1331,6 +1331,7 @@ AC_CHECK_HEADERS(m4_normalize([
 	sys/shm.h
 	sys/sockio.h
 	sys/tas.h
+	sys/uio.h
 	sys/un.h
 	termios.h
 	ucred.h
@@ -1660,9 +1661,14 @@ AC_CHECK_FUNCS(m4_normalize([
 	poll
 	posix_fallocate
 	ppoll
+	pread
+	preadv
 	pstat
 	pthread_is_threaded_np
+	pwrite
+	pwritev
 	readlink
+	readv
 	setproctitle
 	setproctitle_fast
 	setsid
@@ -1673,6 +1679,7 @@ AC_CHECK_FUNCS(m4_normalize([
 	sync_file_range
 	uselocale
 	wcstombs_l
+	writev
 ]))
 
 # These typically are compiler builtins, for which AC_CHECK_FUNCS fails.
@@ -1733,8 +1740,6 @@ AC_REPLACE_FUNCS(m4_normalize([
 	inet_aton
 	link
 	mkdtemp
-	pread
-	pwrite
 	random
 	srandom
 	strlcat
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index de8f838e53..582cbfa012 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -412,6 +412,9 @@
 /* Define to 1 if you have the `pread' function. */
 #undef HAVE_PREAD
 
+/* Define to 1 if you have the `preadv' function. */
+#undef HAVE_PREADV
+
 /* Define to 1 if you have the `pstat' function. */
 #undef HAVE_PSTAT
 
@@ -430,6 +433,9 @@
 /* Define to 1 if you have the `pwrite' function. */
 #undef HAVE_PWRITE
 
+/* Define to 1 if you have the `pwritev' function. */
+#undef HAVE_PWRITEV
+
 /* Define to 1 if you have the `random' function. */
 #undef HAVE_RANDOM
 
@@ -445,6 +451,9 @@
 /* Define to 1 if you have the `readlink' function. */
 #undef HAVE_READLINK
 
+/* Define to 1 if you have the `readv' function. */
+#undef HAVE_READV
+
 /* Define to 1 if you have the global variable
    'rl_completion_append_character'. */
 #undef HAVE_RL_COMPLETION_APPEND_CHARACTER
@@ -626,6 +635,9 @@
 /* Define to 1 if you have the <sys/ucred.h> header file. */
 #undef HAVE_SYS_UCRED_H
 
+/* Define to 1 if you have the <sys/uio.h> header file. */
+#undef HAVE_SYS_UIO_H
+
 /* Define to 1 if you have the <sys/un.h> header file. */
 #undef HAVE_SYS_UN_H
 
@@ -680,6 +692,9 @@
 /* Define to 1 if you have the <winldap.h> header file. */
 #undef HAVE_WINLDAP_H
 
+/* Define to 1 if you have the `writev' function. */
+#undef HAVE_WRITEV
+
 /* Define to 1 if you have the `X509_get_signature_nid' function. */
 #undef HAVE_X509_GET_SIGNATURE_NID
 
diff --git a/src/include/port.h b/src/include/port.h
index 5dfb00b07c..87b20518c3 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -431,6 +431,8 @@ extern ssize_t pg_pread(int fd, void *buf, size_t nbyte, off_t offset);
 extern ssize_t pg_pwrite(int fd, const void *buf, size_t nbyte, off_t offset);
 #endif
 
+/* For pg_pwritev() and pg_preadv(), see port/pg_uio.h. */
+
 #if !HAVE_DECL_STRLCAT
 extern size_t strlcat(char *dst, const char *src, size_t siz);
 #endif
diff --git a/src/include/port/pg_uio.h b/src/include/port/pg_uio.h
new file mode 100644
index 0000000000..2f355c9fc2
--- /dev/null
+++ b/src/include/port/pg_uio.h
@@ -0,0 +1,51 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_uio.h
+ *	  Header for src/port/p{read,write}v.c.
+ *
+ * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/port/pg_uio.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_UIO_H
+#define PG_UIO_H
+
+#include <limits.h>
+
+#ifdef HAVE_SYS_UIO_H
+#include <sys/uio.h>
+#endif
+
+/* If <sys/uio.h> is missing, define our own POSIX-compatible iovec struct. */
+#ifndef HAVE_SYS_UIO_H
+struct iovec
+{
+	void	   *iov_base;
+	size_t		iov_len;
+};
+#endif
+
+/*
+ * If <limits.h> didn't define IOV_MAX, define our own.  POSIX requires at
+ * least 16.
+ */
+#ifndef IOV_MAX
+#define IOV_MAX 16
+#endif
+
+#ifdef HAVE_PREADV
+#define pg_preadv preadv
+#else
+extern ssize_t pg_preadv(int fd, const struct iovec *iov, int iovcnt, off_t offset);
+#endif
+
+#ifdef HAVE_PWRITEV
+#define pg_pwritev pwritev
+#else
+extern ssize_t pg_pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset);
+#endif
+
+#endif							/* PG_UIO_H */
diff --git a/src/port/Makefile b/src/port/Makefile
index e41b005c4f..bc4923ce84 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -53,6 +53,8 @@ OBJS = \
 	pgstrcasecmp.o \
 	pgstrsignal.o \
 	pqsignal.o \
+	pread.o \
+	pwrite.o \
 	qsort.o \
 	qsort_arg.o \
 	quotes.o \
diff --git a/src/port/pread.c b/src/port/pread.c
index e7fecebe5f..6693c487f3 100644
--- a/src/port/pread.c
+++ b/src/port/pread.c
@@ -1,7 +1,7 @@
 /*-------------------------------------------------------------------------
  *
  * pread.c
- *	  Implementation of pread(2) for platforms that lack one.
+ *	  Implementation of pread[v](2) for platforms that lack one.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  *
@@ -9,7 +9,8 @@
  *	  src/port/pread.c
  *
  * Note that this implementation changes the current file position, unlike
- * the POSIX function, so we use the name pg_pread().
+ * the POSIX function, so we use the name pg_pread().  Likewise for the
+ * iovec version.
  *
  *-------------------------------------------------------------------------
  */
@@ -23,6 +24,9 @@
 #include <unistd.h>
 #endif
 
+#include "port/pg_uio.h"
+
+#ifndef HAVE_PREAD
 ssize_t
 pg_pread(int fd, void *buf, size_t size, off_t offset)
 {
@@ -56,3 +60,38 @@ pg_pread(int fd, void *buf, size_t size, off_t offset)
 	return read(fd, buf, size);
 #endif
 }
+#endif
+
+#ifndef HAVE_PREADV
+ssize_t
+pg_preadv(int fd, const struct iovec *iov, int iovcnt, off_t offset)
+{
+#ifdef HAVE_READV
+	if (iovcnt == 1)
+		return pg_pread(fd, iov[0].iov_base, iov[0].iov_len, offset);
+	if (lseek(fd, offset, SEEK_SET) < 0)
+		return -1;
+	return readv(fd, iov, iovcnt);
+#else
+	ssize_t 	sum = 0;
+	ssize_t 	part;
+
+	for (int i = 0; i < iovcnt; ++i)
+	{
+		part = pg_pread(fd, iov[i].iov_base, iov[i].iov_len, offset);
+		if (part < 0)
+		{
+			if (i == 0)
+				return -1;
+			else
+				return sum;
+		}
+		sum += part;
+		offset += part;
+		if (part < iov[i].iov_len)
+			return sum;
+	}
+	return sum;
+#endif
+}
+#endif
diff --git a/src/port/pwrite.c b/src/port/pwrite.c
index 2e0c154462..5240641e13 100644
--- a/src/port/pwrite.c
+++ b/src/port/pwrite.c
@@ -1,7 +1,7 @@
 /*-------------------------------------------------------------------------
  *
  * pwrite.c
- *	  Implementation of pwrite(2) for platforms that lack one.
+ *	  Implementation of pwrite[v](2) for platforms that lack one.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  *
@@ -9,7 +9,8 @@
  *	  src/port/pwrite.c
  *
  * Note that this implementation changes the current file position, unlike
- * the POSIX function, so we use the name pg_pwrite().
+ * the POSIX function, so we use the name pg_pwrite().  Likewise for the
+ * iovec version.
  *
  *-------------------------------------------------------------------------
  */
@@ -23,6 +24,9 @@
 #include <unistd.h>
 #endif
 
+#include "port/pg_uio.h"
+
+#ifndef HAVE_PWRITE
 ssize_t
 pg_pwrite(int fd, const void *buf, size_t size, off_t offset)
 {
@@ -53,3 +57,38 @@ pg_pwrite(int fd, const void *buf, size_t size, off_t offset)
 	return write(fd, buf, size);
 #endif
 }
+#endif
+
+#ifndef HAVE_PWRITEV
+ssize_t
+pg_pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset)
+{
+#ifdef HAVE_WRITEV
+	if (iovcnt == 1)
+		return pg_pwrite(fd, iov[0].iov_base, iov[0].iov_len, offset);
+	if (lseek(fd, offset, SEEK_SET) < 0)
+		return -1;
+	return writev(fd, iov, iovcnt);
+#else
+	ssize_t 	sum = 0;
+	ssize_t 	part;
+
+	for (int i = 0; i < iovcnt; ++i)
+	{
+		part = pg_pwrite(fd, iov[i].iov_base, iov[i].iov_len, offset);
+		if (part < 0)
+		{
+			if (i == 0)
+				return -1;
+			else
+				return sum;
+		}
+		sum += part;
+		offset += part;
+		if (part < iov[i].iov_len)
+			return sum;
+	}
+	return sum;
+#endif
+}
+#endif
diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index 22d6abd367..832a363282 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -329,17 +329,20 @@ sub GenerateFiles
 		HAVE_PPC_LWARX_MUTEX_HINT   => undef,
 		HAVE_PPOLL                  => undef,
 		HAVE_PREAD                  => undef,
+		HAVE_PREADV                 => undef,
 		HAVE_PSTAT                  => undef,
 		HAVE_PS_STRINGS             => undef,
 		HAVE_PTHREAD                => undef,
 		HAVE_PTHREAD_IS_THREADED_NP => undef,
 		HAVE_PTHREAD_PRIO_INHERIT   => undef,
 		HAVE_PWRITE                 => undef,
+		HAVE_PWRITEV                => undef,
 		HAVE_RANDOM                 => undef,
 		HAVE_READLINE_H             => undef,
 		HAVE_READLINE_HISTORY_H     => undef,
 		HAVE_READLINE_READLINE_H    => undef,
 		HAVE_READLINK               => undef,
+		HAVE_READV                  => undef,
 		HAVE_RL_COMPLETION_APPEND_CHARACTER      => undef,
 		HAVE_RL_COMPLETION_MATCHES               => undef,
 		HAVE_RL_COMPLETION_SUPPRESS_QUOTE        => undef,
@@ -400,6 +403,7 @@ sub GenerateFiles
 		HAVE_SYS_TYPES_H                         => 1,
 		HAVE_SYS_UCRED_H                         => undef,
 		HAVE_SYS_UN_H                            => undef,
+		HAVE_SYS_UIO_H                           => undef,
 		HAVE_TERMIOS_H                           => undef,
 		HAVE_TYPEOF                              => undef,
 		HAVE_UCRED_H                             => undef,
@@ -417,6 +421,7 @@ sub GenerateFiles
 		HAVE_WINLDAP_H                           => undef,
 		HAVE_WCSTOMBS_L                          => 1,
 		HAVE_WCTYPE_H                            => 1,
+		HAVE_WRITEV                              => undef,
 		HAVE_X509_GET_SIGNATURE_NID              => 1,
 		HAVE_X86_64_POPCNTQ                      => undef,
 		HAVE__BOOL                               => undef,
-- 
2.20.1

v2-0002-Use-vectored-I-O-to-zero-WAL-segments.patchapplication/x-patch; name=v2-0002-Use-vectored-I-O-to-zero-WAL-segments.patchDownload
From c407ea9d7352a108088d3319202a8488b2b9ba7b Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sun, 20 Dec 2020 13:16:21 +1300
Subject: [PATCH v2 2/2] Use vectored I/O to zero WAL segments.

Instead of making 2048 block-sized write() calls to fill a 16MB WAL file
with zeroes, make a much smaller number of pwritev() calls.  The actual
number depends on the OS's IOV_MAX, but it works out as 2 pwritev()
calls on Linux and FreeBSD.

Discussion: https://postgr.es/m/CA%2BhUKGJA%2Bu-220VONeoREBXJ9P3S94Y7J%2BkqCnTYmahvZJwM%3Dg%40mail.gmail.com
---
 src/backend/access/transam/xlog.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b1e5d2dbff..e27ccddec1 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -48,6 +48,7 @@
 #include "pg_trace.h"
 #include "pgstat.h"
 #include "port/atomics.h"
+#include "port/pg_uio.h"
 #include "postmaster/bgwriter.h"
 #include "postmaster/startup.h"
 #include "postmaster/walwriter.h"
@@ -3270,7 +3271,6 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
 	XLogSegNo	installed_segno;
 	XLogSegNo	max_segno;
 	int			fd;
-	int			nbytes;
 	int			save_errno;
 
 	XLogFilePath(path, ThisTimeLineID, logsegno, wal_segment_size);
@@ -3317,6 +3317,9 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
 	save_errno = 0;
 	if (wal_init_zero)
 	{
+		struct iovec iov[Min(IOV_MAX, 1024)];	/* cap stack space */
+		int			blocks;
+
 		/*
 		 * Zero-fill the file.  With this setting, we do this the hard way to
 		 * ensure that all the file space has really been allocated.  On
@@ -3326,15 +3329,31 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
 		 * indirect blocks are down on disk.  Therefore, fdatasync(2) or
 		 * O_DSYNC will be sufficient to sync future writes to the log file.
 		 */
-		for (nbytes = 0; nbytes < wal_segment_size; nbytes += XLOG_BLCKSZ)
+
+		/* Prepare to write out a lot of copies of our zero buffer at once. */
+		for (int i = 0; i < lengthof(iov); ++i)
+		{
+			iov[i].iov_base = zbuffer.data;
+			iov[i].iov_len = XLOG_BLCKSZ;
+		}
+
+		/* Loop, writing as many blocks as we can for each system call. */
+		blocks = wal_segment_size / XLOG_BLCKSZ;
+		for (int i = 0; i < blocks;)
 		{
+			int 		iovcnt = Min(blocks - i, lengthof(iov));
+			size_t		size = iovcnt * XLOG_BLCKSZ;
+			off_t		offset = i * XLOG_BLCKSZ;
+
 			errno = 0;
-			if (write(fd, zbuffer.data, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+			if (pg_pwritev(fd, iov, iovcnt, offset) != size)
 			{
 				/* if write didn't set errno, assume no disk space */
 				save_errno = errno ? errno : ENOSPC;
 				break;
 			}
+
+			i += iovcnt;
 		}
 	}
 	else
-- 
2.20.1

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#3)
Re: pg_preadv() and pg_pwritev()

Thomas Munro <thomas.munro@gmail.com> writes:

OK, here's a patch to zero-fill fresh WAL segments with pwritev().
I'm drawing a blank on trivial candidate uses for preadv(), without
infrastructure from later patches.

This looks OK to me. I tried it on prairiedog (has writev and
pwrite but not pwritev) as well as gaur (has only writev).
They seem happy.

One minor thought is that in

+ struct iovec iov[Min(IOV_MAX, 1024)]; /* cap stack space */

it seems like pretty much every use of IOV_MAX would want some
similar cap. Should we centralize that idea with, say,

#define PG_IOV_MAX Min(IOV_MAX, 1024)

? Or will the plausible cap vary across uses?

regards, tom lane

#5Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#4)
Re: pg_preadv() and pg_pwritev()

On Sun, Dec 20, 2020 at 8:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

One minor thought is that in

+ struct iovec iov[Min(IOV_MAX, 1024)]; /* cap stack space */

it seems like pretty much every use of IOV_MAX would want some
similar cap. Should we centralize that idea with, say,

#define PG_IOV_MAX Min(IOV_MAX, 1024)

? Or will the plausible cap vary across uses?

Hmm. For the real intended user of this, namely worker processes that
simulate AIO when native AIO isn't available, higher level code will
limit the iov count to much smaller numbers anyway. It wants to try
to stay under typical device limits for vectored I/O, because split
requests would confound attempts to model and limit queue depth and
control latency. In Andres's AIO prototype he currently has a macro
PGAIO_MAX_COMBINE set to 16 (meaning approximately 16 data block or
wal reads/writes = 128KB worth of scatter/gather per I/O request); I
guess it should really be Min(IOV_MAX, <something>), but I don't
currently have an opinion on the <something>, except that it should
surely be closer to 16 than 1024 (for example
/sys/block/nvme0n1/queue/max_segments is 33 here). I mention all this
to explain that I don't think the code in patch 0002 is going to turn
out to be very typical: it's trying to minimise system calls by
staying under an API limit (though I cap it for allocation sanity),
whereas more typical code probably wants to stay under a device limit,
so I don't immediately have another use for eg PG_IOV_MAX.

#6Andres Freund
andres@anarazel.de
In reply to: Thomas Munro (#3)
Re: pg_preadv() and pg_pwritev()

Hi,

On 2020-12-20 16:26:42 +1300, Thomas Munro wrote:

1. port.h cannot assume that <limits.h> has already been included;
nor do I want to fix that by including <limits.h> there. Do we
really need to define a fallback value of IOV_MAX? If so,
maybe the answer is to put the replacement struct iovec and
IOV_MAX in some new header.

Ok, I moved all this stuff into port/pg_uio.h.

Can we come up with a better name than 'uio'? I find that a not exactly
meaningful name.

Or perhaps we could just leave the functions in port/port.h, but extract
the value of the define at configure time? That way only pread.c /
pwrite.c would need it.

3. The patch as given won't prove anything except that the code
compiles. Is it worth fixing at least one code path to make
use of pg_preadv and pg_pwritev, so we can make sure this code
is tested before there's layers of other new code on top?

OK, here's a patch to zero-fill fresh WAL segments with pwritev().

I think that's a good idea. However, I see two small issues: 1) If we
write larger amounts at once, we need to handle partial writes. That's a
large enough amount of IO that it's much more likely to hit a memory
shortage or such in the kernel - we had to do that a while a go for WAL
writes (which can also be larger), if memory serves.

Perhaps we should have pg_pwritev/readv unconditionally go through
pwrite.c/pread.c and add support for "continuing" partial writes/reads
in one central place?

I'm drawing a blank on trivial candidate uses for preadv(), without
infrastructure from later patches.

Can't immediately think of something either.

Greetings,

Andres Freund

#7Jakub Wartak
Jakub.Wartak@tomtom.com
In reply to: Andres Freund (#6)
RE: pg_preadv() and pg_pwritev()

I'm drawing a blank on trivial candidate uses for preadv(), without
infrastructure from later patches.

Can't immediately think of something either.

This might be not that trivial , but maybe acquire_sample_rows() from analyze.c ?

Please note however there's patch /messages/by-id/20201109180644.GJ16415@tamriel.snowman.net ( https://commitfest.postgresql.org/30/2799/ ) for adding fadvise, but maybe those two could be even combined so you would be doing e.g. 16x fadvise() and then grab 8 pages in one preadv() call ? I'm find unlikely however that preadv give any additional performance benefit there after having fadvise, but clearly a potential place to test.

-J.

#8Thomas Munro
thomas.munro@gmail.com
In reply to: Jakub Wartak (#7)
Re: pg_preadv() and pg_pwritev()

On Mon, Dec 21, 2020 at 8:25 PM Jakub Wartak <Jakub.Wartak@tomtom.com> wrote:

I'm drawing a blank on trivial candidate uses for preadv(), without
infrastructure from later patches.

Can't immediately think of something either.

This might be not that trivial , but maybe acquire_sample_rows() from analyze.c ?

Please note however there's patch /messages/by-id/20201109180644.GJ16415@tamriel.snowman.net ( https://commitfest.postgresql.org/30/2799/ ) for adding fadvise, but maybe those two could be even combined so you would be doing e.g. 16x fadvise() and then grab 8 pages in one preadv() call ? I'm find unlikely however that preadv give any additional performance benefit there after having fadvise, but clearly a potential place to test.

Oh, interesting, that looks like another test case to study with the
AIO patch set, but I don't think it's worth trying to do a
simpler/half-baked version in the meantime. (Since that ANALYZE patch
uses PrefetchBuffer() it should automatically benefit: the
posix_fadvise() calls will be replaced with consolidated preadv()
calls in a worker process or native AIO equivalent so that system
calls are mostly gone from the initiating process, and by the time you
try to access the buffer it'll hopefully see that it's finished
without any further system calls. Refinements are possible though,
like making use of recent_buffer to avoid double-lookup, and
tuning/optimisation for how often IOs should be consolidated and
submitted.)

#9Thomas Munro
thomas.munro@gmail.com
In reply to: Andres Freund (#6)
2 attachment(s)
Re: pg_preadv() and pg_pwritev()

On Mon, Dec 21, 2020 at 11:40 AM Andres Freund <andres@anarazel.de> wrote:

On 2020-12-20 16:26:42 +1300, Thomas Munro wrote:

1. port.h cannot assume that <limits.h> has already been included;
nor do I want to fix that by including <limits.h> there. Do we
really need to define a fallback value of IOV_MAX? If so,
maybe the answer is to put the replacement struct iovec and
IOV_MAX in some new header.

Ok, I moved all this stuff into port/pg_uio.h.

Can we come up with a better name than 'uio'? I find that a not exactly
meaningful name.

Ok, let's try port/pg_iovec.h.

Or perhaps we could just leave the functions in port/port.h, but extract
the value of the define at configure time? That way only pread.c /
pwrite.c would need it.

That won't work for the struct definition, so client code would need
to remember to do:

#ifdef HAVE_SYS_UIO_H
#include <sys/uio.h>
#endif

... which is a little tedious, or port.h would need to do that and
incur an overhead in every translation unit, which Tom objected to.
That's why I liked the separate header idea.

3. The patch as given won't prove anything except that the code
compiles. Is it worth fixing at least one code path to make
use of pg_preadv and pg_pwritev, so we can make sure this code
is tested before there's layers of other new code on top?

OK, here's a patch to zero-fill fresh WAL segments with pwritev().

I think that's a good idea. However, I see two small issues: 1) If we
write larger amounts at once, we need to handle partial writes. That's a
large enough amount of IO that it's much more likely to hit a memory
shortage or such in the kernel - we had to do that a while a go for WAL
writes (which can also be larger), if memory serves.

Perhaps we should have pg_pwritev/readv unconditionally go through
pwrite.c/pread.c and add support for "continuing" partial writes/reads
in one central place?

Ok, here's a new version with the following changes:

1. Define PG_IOV_MAX, a reasonable size for local variables, not
larger than IOV_MAX.
2 Use 32 rather than 1024, based on off-list complaint about 1024
potentially swamping the IO system unfairly.
3. Add a wrapper pg_pwritev_retry() to retry automatically on short
writes. (I didn't write pg_preadv_retry(), because I don't currently
need it for anything so I didn't want to have to think about EOF vs
short-reads-for-implementation-reasons.)
4. I considered whether pg_pwrite() should have built-in retry
instead of a separate wrapper, but I thought of an argument against
hiding the "raw" version: the AIO patch set already understands short
reads/writes and knows how to retry at a higher level, as that's
needed for native AIO too, so I think it makes sense to be able to
keep things the same and not solve the same problem twice. A counter
argument would be that you could get the retry underway sooner with a
tight loop, but I'm not expecting this to be common.

I'm drawing a blank on trivial candidate uses for preadv(), without
infrastructure from later patches.

Can't immediately think of something either.

One idea I had for the future is for xlogreader.c to read the WAL into
a large multi-page circular buffer, which could wrap around using a
pair of iovecs, but that requires lots more work .

Attachments:

v3-0001-Add-pg_preadv-and-pg_pwritev.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Add-pg_preadv-and-pg_pwritev.patchDownload
From 3da0b25b4ce0804355f912bd026ef9c9eee146f3 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Thu, 26 Nov 2020 15:48:31 +1300
Subject: [PATCH v3 1/2] Add pg_preadv() and pg_pwritev().

Provide synchronous scatter/gather I/O routines.  These map to preadv(),
pwritev() with various fallbacks for systems that don't have them.  Also
provide a wrapper pg_pwritev_retry() which automatically retries on
short write.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKGJA%2Bu-220VONeoREBXJ9P3S94Y7J%2BkqCnTYmahvZJwM%3Dg%40mail.gmail.com
---
 configure                   |  30 +---------
 configure.ac                |   9 ++-
 src/include/pg_config.h.in  |  15 +++++
 src/include/port.h          |   2 +
 src/include/port/pg_iovec.h |  57 +++++++++++++++++++
 src/port/Makefile           |   2 +
 src/port/pread.c            |  43 +++++++++++++-
 src/port/pwrite.c           | 109 +++++++++++++++++++++++++++++++++++-
 src/tools/msvc/Solution.pm  |   5 ++
 9 files changed, 238 insertions(+), 34 deletions(-)
 create mode 100644 src/include/port/pg_iovec.h

diff --git a/configure b/configure
index 11a4284e5b..5887c712cc 100755
--- a/configure
+++ b/configure
@@ -13061,7 +13061,7 @@ $as_echo "#define HAVE_STDBOOL_H 1" >>confdefs.h
 fi
 
 
-for ac_header in atomic.h copyfile.h execinfo.h getopt.h ifaddrs.h langinfo.h mbarrier.h poll.h sys/epoll.h sys/event.h sys/ipc.h sys/prctl.h sys/procctl.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/sockio.h sys/tas.h sys/un.h termios.h ucred.h wctype.h
+for ac_header in atomic.h copyfile.h execinfo.h getopt.h ifaddrs.h langinfo.h mbarrier.h poll.h sys/epoll.h sys/event.h sys/ipc.h sys/prctl.h sys/procctl.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/sockio.h sys/tas.h sys/uio.h sys/un.h termios.h ucred.h wctype.h
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default"
@@ -15155,7 +15155,7 @@ fi
 LIBS_including_readline="$LIBS"
 LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
 
-for ac_func in backtrace_symbols clock_gettime copyfile fdatasync getifaddrs getpeerucred getrlimit kqueue mbstowcs_l memset_s poll posix_fallocate ppoll pstat pthread_is_threaded_np readlink setproctitle setproctitle_fast setsid shm_open strchrnul strsignal symlink sync_file_range uselocale wcstombs_l
+for ac_func in backtrace_symbols clock_gettime copyfile fdatasync getifaddrs getpeerucred getrlimit kqueue mbstowcs_l memset_s poll posix_fallocate ppoll pread preadv pstat pthread_is_threaded_np pwrite pwritev readlink readv setproctitle setproctitle_fast setsid shm_open strchrnul strsignal symlink sync_file_range uselocale wcstombs_l writev
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
@@ -15832,32 +15832,6 @@ esac
 
 fi
 
-ac_fn_c_check_func "$LINENO" "pread" "ac_cv_func_pread"
-if test "x$ac_cv_func_pread" = xyes; then :
-  $as_echo "#define HAVE_PREAD 1" >>confdefs.h
-
-else
-  case " $LIBOBJS " in
-  *" pread.$ac_objext "* ) ;;
-  *) LIBOBJS="$LIBOBJS pread.$ac_objext"
- ;;
-esac
-
-fi
-
-ac_fn_c_check_func "$LINENO" "pwrite" "ac_cv_func_pwrite"
-if test "x$ac_cv_func_pwrite" = xyes; then :
-  $as_echo "#define HAVE_PWRITE 1" >>confdefs.h
-
-else
-  case " $LIBOBJS " in
-  *" pwrite.$ac_objext "* ) ;;
-  *) LIBOBJS="$LIBOBJS pwrite.$ac_objext"
- ;;
-esac
-
-fi
-
 ac_fn_c_check_func "$LINENO" "random" "ac_cv_func_random"
 if test "x$ac_cv_func_random" = xyes; then :
   $as_echo "#define HAVE_RANDOM 1" >>confdefs.h
diff --git a/configure.ac b/configure.ac
index fc523c6aeb..c930463225 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1331,6 +1331,7 @@ AC_CHECK_HEADERS(m4_normalize([
 	sys/shm.h
 	sys/sockio.h
 	sys/tas.h
+	sys/uio.h
 	sys/un.h
 	termios.h
 	ucred.h
@@ -1660,9 +1661,14 @@ AC_CHECK_FUNCS(m4_normalize([
 	poll
 	posix_fallocate
 	ppoll
+	pread
+	preadv
 	pstat
 	pthread_is_threaded_np
+	pwrite
+	pwritev
 	readlink
+	readv
 	setproctitle
 	setproctitle_fast
 	setsid
@@ -1673,6 +1679,7 @@ AC_CHECK_FUNCS(m4_normalize([
 	sync_file_range
 	uselocale
 	wcstombs_l
+	writev
 ]))
 
 # These typically are compiler builtins, for which AC_CHECK_FUNCS fails.
@@ -1733,8 +1740,6 @@ AC_REPLACE_FUNCS(m4_normalize([
 	inet_aton
 	link
 	mkdtemp
-	pread
-	pwrite
 	random
 	srandom
 	strlcat
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index de8f838e53..582cbfa012 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -412,6 +412,9 @@
 /* Define to 1 if you have the `pread' function. */
 #undef HAVE_PREAD
 
+/* Define to 1 if you have the `preadv' function. */
+#undef HAVE_PREADV
+
 /* Define to 1 if you have the `pstat' function. */
 #undef HAVE_PSTAT
 
@@ -430,6 +433,9 @@
 /* Define to 1 if you have the `pwrite' function. */
 #undef HAVE_PWRITE
 
+/* Define to 1 if you have the `pwritev' function. */
+#undef HAVE_PWRITEV
+
 /* Define to 1 if you have the `random' function. */
 #undef HAVE_RANDOM
 
@@ -445,6 +451,9 @@
 /* Define to 1 if you have the `readlink' function. */
 #undef HAVE_READLINK
 
+/* Define to 1 if you have the `readv' function. */
+#undef HAVE_READV
+
 /* Define to 1 if you have the global variable
    'rl_completion_append_character'. */
 #undef HAVE_RL_COMPLETION_APPEND_CHARACTER
@@ -626,6 +635,9 @@
 /* Define to 1 if you have the <sys/ucred.h> header file. */
 #undef HAVE_SYS_UCRED_H
 
+/* Define to 1 if you have the <sys/uio.h> header file. */
+#undef HAVE_SYS_UIO_H
+
 /* Define to 1 if you have the <sys/un.h> header file. */
 #undef HAVE_SYS_UN_H
 
@@ -680,6 +692,9 @@
 /* Define to 1 if you have the <winldap.h> header file. */
 #undef HAVE_WINLDAP_H
 
+/* Define to 1 if you have the `writev' function. */
+#undef HAVE_WRITEV
+
 /* Define to 1 if you have the `X509_get_signature_nid' function. */
 #undef HAVE_X509_GET_SIGNATURE_NID
 
diff --git a/src/include/port.h b/src/include/port.h
index 5dfb00b07c..87b20518c3 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -431,6 +431,8 @@ extern ssize_t pg_pread(int fd, void *buf, size_t nbyte, off_t offset);
 extern ssize_t pg_pwrite(int fd, const void *buf, size_t nbyte, off_t offset);
 #endif
 
+/* For pg_pwritev() and pg_preadv(), see port/pg_uio.h. */
+
 #if !HAVE_DECL_STRLCAT
 extern size_t strlcat(char *dst, const char *src, size_t siz);
 #endif
diff --git a/src/include/port/pg_iovec.h b/src/include/port/pg_iovec.h
new file mode 100644
index 0000000000..035d54f840
--- /dev/null
+++ b/src/include/port/pg_iovec.h
@@ -0,0 +1,57 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_iovec.h
+ *	  Header for the vectored I/O functions in src/port/p{read,write}.c.
+ *
+ * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/port/pg_iovec.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_IOVEC_H
+#define PG_IOVEC_H
+
+#include <limits.h>
+
+#ifdef HAVE_SYS_UIO_H
+#include <sys/uio.h>
+#endif
+
+/* If <sys/uio.h> is missing, define our own POSIX-compatible iovec struct. */
+#ifndef HAVE_SYS_UIO_H
+struct iovec
+{
+	void	   *iov_base;
+	size_t		iov_len;
+};
+#endif
+
+/*
+ * If <limits.h> didn't define IOV_MAX, define our own.  POSIX requires at
+ * least 16.
+ */
+#ifndef IOV_MAX
+#define IOV_MAX 16
+#endif
+
+/* Define a reasonable maximum that is safe to use on the stack. */
+#define PG_IOV_MAX Min(IOV_MAX, 32)
+
+#ifdef HAVE_PREADV
+#define pg_preadv preadv
+#else
+extern ssize_t pg_preadv(int fd, const struct iovec *iov, int iovcnt, off_t offset);
+#endif
+
+#ifdef HAVE_PWRITEV
+#define pg_pwritev pwritev
+#else
+extern ssize_t pg_pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset);
+#endif
+
+extern ssize_t pg_pwritev_retry(int fd, const struct iovec *iov, int iovcnt,
+								off_t offset);
+
+#endif							/* PG_IOVEC_H */
diff --git a/src/port/Makefile b/src/port/Makefile
index e41b005c4f..bc4923ce84 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -53,6 +53,8 @@ OBJS = \
 	pgstrcasecmp.o \
 	pgstrsignal.o \
 	pqsignal.o \
+	pread.o \
+	pwrite.o \
 	qsort.o \
 	qsort_arg.o \
 	quotes.o \
diff --git a/src/port/pread.c b/src/port/pread.c
index e7fecebe5f..fe0c44434d 100644
--- a/src/port/pread.c
+++ b/src/port/pread.c
@@ -1,7 +1,7 @@
 /*-------------------------------------------------------------------------
  *
  * pread.c
- *	  Implementation of pread(2) for platforms that lack one.
+ *	  Implementation of pread[v](2) for platforms that lack one.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  *
@@ -9,7 +9,8 @@
  *	  src/port/pread.c
  *
  * Note that this implementation changes the current file position, unlike
- * the POSIX function, so we use the name pg_pread().
+ * the POSIX function, so we use the name pg_pread().  Likewise for the
+ * iovec version.
  *
  *-------------------------------------------------------------------------
  */
@@ -23,6 +24,9 @@
 #include <unistd.h>
 #endif
 
+#include "port/pg_iovec.h"
+
+#ifndef HAVE_PREAD
 ssize_t
 pg_pread(int fd, void *buf, size_t size, off_t offset)
 {
@@ -56,3 +60,38 @@ pg_pread(int fd, void *buf, size_t size, off_t offset)
 	return read(fd, buf, size);
 #endif
 }
+#endif
+
+#ifndef HAVE_PREADV
+ssize_t
+pg_preadv(int fd, const struct iovec *iov, int iovcnt, off_t offset)
+{
+#ifdef HAVE_READV
+	if (iovcnt == 1)
+		return pg_pread(fd, iov[0].iov_base, iov[0].iov_len, offset);
+	if (lseek(fd, offset, SEEK_SET) < 0)
+		return -1;
+	return readv(fd, iov, iovcnt);
+#else
+	ssize_t 	sum = 0;
+	ssize_t 	part;
+
+	for (int i = 0; i < iovcnt; ++i)
+	{
+		part = pg_pread(fd, iov[i].iov_base, iov[i].iov_len, offset);
+		if (part < 0)
+		{
+			if (i == 0)
+				return -1;
+			else
+				return sum;
+		}
+		sum += part;
+		offset += part;
+		if (part < iov[i].iov_len)
+			return sum;
+	}
+	return sum;
+#endif
+}
+#endif
diff --git a/src/port/pwrite.c b/src/port/pwrite.c
index 2e0c154462..d7ecbf362c 100644
--- a/src/port/pwrite.c
+++ b/src/port/pwrite.c
@@ -1,7 +1,7 @@
 /*-------------------------------------------------------------------------
  *
  * pwrite.c
- *	  Implementation of pwrite(2) for platforms that lack one.
+ *	  Implementation of pwrite[v](2) for platforms that lack one.
  *
  * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
  *
@@ -9,7 +9,8 @@
  *	  src/port/pwrite.c
  *
  * Note that this implementation changes the current file position, unlike
- * the POSIX function, so we use the name pg_pwrite().
+ * the POSIX function, so we use the name pg_pwrite().  Likewise for the
+ * iovec version.
  *
  *-------------------------------------------------------------------------
  */
@@ -23,6 +24,9 @@
 #include <unistd.h>
 #endif
 
+#include "port/pg_iovec.h"
+
+#ifndef HAVE_PWRITE
 ssize_t
 pg_pwrite(int fd, const void *buf, size_t size, off_t offset)
 {
@@ -53,3 +57,104 @@ pg_pwrite(int fd, const void *buf, size_t size, off_t offset)
 	return write(fd, buf, size);
 #endif
 }
+#endif
+
+#ifndef HAVE_PWRITEV
+ssize_t
+pg_pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset)
+{
+#ifdef HAVE_WRITEV
+	if (iovcnt == 1)
+		return pg_pwrite(fd, iov[0].iov_base, iov[0].iov_len, offset);
+	if (lseek(fd, offset, SEEK_SET) < 0)
+		return -1;
+	return writev(fd, iov, iovcnt);
+#else
+	ssize_t 	sum = 0;
+	ssize_t 	part;
+
+	for (int i = 0; i < iovcnt; ++i)
+	{
+		part = pg_pwrite(fd, iov[i].iov_base, iov[i].iov_len, offset);
+		if (part < 0)
+		{
+			if (i == 0)
+				return -1;
+			else
+				return sum;
+		}
+		sum += part;
+		offset += part;
+		if (part < iov[i].iov_len)
+			return sum;
+	}
+	return sum;
+#endif
+}
+#endif
+
+/*
+ * A wrapper for pg_pwritev() that retries on partial write.
+ */
+ssize_t
+pg_pwritev_retry(int fd, const struct iovec *iov, int iovcnt, off_t offset)
+{
+	struct iovec iov_copy[PG_IOV_MAX];
+	ssize_t		goal = 0;
+	ssize_t		sum = 0;
+	ssize_t		part;
+
+	/* We'd better have space to make a copy, in case we need to retry. */
+	if (iovcnt > PG_IOV_MAX)
+	{
+		errno = EINVAL;
+		return -1;
+	}
+
+	/* How much are we trying to write? */
+	for (int i = 0; i < iovcnt; ++i)
+		goal += iov[i].iov_len;
+
+	for (;;)
+	{
+		/* Write as much as we can. */
+		part = pg_pwritev(fd, iov, iovcnt, offset);
+		if (part < 0)
+			return -1;
+
+#ifdef SIMULATE_SHORT_WRITE
+		part = Min(part, 4096);
+#endif
+
+		/* Entirely done yet? */
+		sum += part;
+		if (sum == goal)
+			break;
+
+		/* Step over the part of the file that is done. */
+		Assert(sum < goal);
+		Assert(iovcnt > 0);
+		offset += part;
+
+		/* Step over iovecs that are done. */
+		while (iov->iov_len <= part)
+		{
+			part -= iov->iov_len;
+			++iov;
+			--iovcnt;
+			Assert(iovcnt > 0);
+		}
+
+		/* We need a temporary copy to scribble on. */
+		memmove(iov_copy, iov, sizeof(*iov) * iovcnt);
+		iov = iov_copy;
+
+		/* The first remaining iovec might be partially done.  Adjust it. */
+		Assert(iovcnt > 0);
+		Assert(iov_copy[0].iov_len > part);
+		iov_copy[0].iov_base = (char *) iov_copy[0].iov_base + part;
+		iov_copy[0].iov_len -= part;
+	}
+
+	return sum;
+}
diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index 22d6abd367..832a363282 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -329,17 +329,20 @@ sub GenerateFiles
 		HAVE_PPC_LWARX_MUTEX_HINT   => undef,
 		HAVE_PPOLL                  => undef,
 		HAVE_PREAD                  => undef,
+		HAVE_PREADV                 => undef,
 		HAVE_PSTAT                  => undef,
 		HAVE_PS_STRINGS             => undef,
 		HAVE_PTHREAD                => undef,
 		HAVE_PTHREAD_IS_THREADED_NP => undef,
 		HAVE_PTHREAD_PRIO_INHERIT   => undef,
 		HAVE_PWRITE                 => undef,
+		HAVE_PWRITEV                => undef,
 		HAVE_RANDOM                 => undef,
 		HAVE_READLINE_H             => undef,
 		HAVE_READLINE_HISTORY_H     => undef,
 		HAVE_READLINE_READLINE_H    => undef,
 		HAVE_READLINK               => undef,
+		HAVE_READV                  => undef,
 		HAVE_RL_COMPLETION_APPEND_CHARACTER      => undef,
 		HAVE_RL_COMPLETION_MATCHES               => undef,
 		HAVE_RL_COMPLETION_SUPPRESS_QUOTE        => undef,
@@ -400,6 +403,7 @@ sub GenerateFiles
 		HAVE_SYS_TYPES_H                         => 1,
 		HAVE_SYS_UCRED_H                         => undef,
 		HAVE_SYS_UN_H                            => undef,
+		HAVE_SYS_UIO_H                           => undef,
 		HAVE_TERMIOS_H                           => undef,
 		HAVE_TYPEOF                              => undef,
 		HAVE_UCRED_H                             => undef,
@@ -417,6 +421,7 @@ sub GenerateFiles
 		HAVE_WINLDAP_H                           => undef,
 		HAVE_WCSTOMBS_L                          => 1,
 		HAVE_WCTYPE_H                            => 1,
+		HAVE_WRITEV                              => undef,
 		HAVE_X509_GET_SIGNATURE_NID              => 1,
 		HAVE_X86_64_POPCNTQ                      => undef,
 		HAVE__BOOL                               => undef,
-- 
2.20.1

v3-0002-Use-vectored-I-O-to-zero-WAL-segments.patchtext/x-patch; charset=US-ASCII; name=v3-0002-Use-vectored-I-O-to-zero-WAL-segments.patchDownload
From 01e2b824a2092d2bcbf5e53178ba08ef0abfe354 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sun, 20 Dec 2020 13:16:21 +1300
Subject: [PATCH v3 2/2] Use vectored I/O to zero WAL segments.

Instead of making many block-sized write() calls to fill a 16MB WAL file
with zeroes, make a smaller number of pwritev() calls if available.  The
actual number depends on the OS's IOV_MAX, but we cap it at 32 so we
write 256kB per call on typical systems.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKGJA%2Bu-220VONeoREBXJ9P3S94Y7J%2BkqCnTYmahvZJwM%3Dg%40mail.gmail.com
---
 src/backend/access/transam/xlog.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b1e5d2dbff..d7b3d6d5bd 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -48,6 +48,7 @@
 #include "pg_trace.h"
 #include "pgstat.h"
 #include "port/atomics.h"
+#include "port/pg_iovec.h"
 #include "postmaster/bgwriter.h"
 #include "postmaster/startup.h"
 #include "postmaster/walwriter.h"
@@ -3270,7 +3271,6 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
 	XLogSegNo	installed_segno;
 	XLogSegNo	max_segno;
 	int			fd;
-	int			nbytes;
 	int			save_errno;
 
 	XLogFilePath(path, ThisTimeLineID, logsegno, wal_segment_size);
@@ -3317,6 +3317,9 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
 	save_errno = 0;
 	if (wal_init_zero)
 	{
+		struct iovec iov[PG_IOV_MAX];
+		int			blocks;
+
 		/*
 		 * Zero-fill the file.  With this setting, we do this the hard way to
 		 * ensure that all the file space has really been allocated.  On
@@ -3326,15 +3329,28 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
 		 * indirect blocks are down on disk.  Therefore, fdatasync(2) or
 		 * O_DSYNC will be sufficient to sync future writes to the log file.
 		 */
-		for (nbytes = 0; nbytes < wal_segment_size; nbytes += XLOG_BLCKSZ)
+
+		/* Prepare to write out a lot of copies of our zero buffer at once. */
+		for (int i = 0; i < lengthof(iov); ++i)
 		{
-			errno = 0;
-			if (write(fd, zbuffer.data, XLOG_BLCKSZ) != XLOG_BLCKSZ)
+			iov[i].iov_base = zbuffer.data;
+			iov[i].iov_len = XLOG_BLCKSZ;
+		}
+
+		/* Loop, writing as many blocks as we can for each system call. */
+		blocks = wal_segment_size / XLOG_BLCKSZ;
+		for (int i = 0; i < blocks;)
+		{
+			int 		iovcnt = Min(blocks - i, lengthof(iov));
+			off_t		offset = i * XLOG_BLCKSZ;
+
+			if (pg_pwritev_retry(fd, iov, iovcnt, offset) < 0)
 			{
-				/* if write didn't set errno, assume no disk space */
-				save_errno = errno ? errno : ENOSPC;
+				save_errno = errno;
 				break;
 			}
+
+			i += iovcnt;
 		}
 	}
 	else
-- 
2.20.1

#10Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#9)
Re: pg_preadv() and pg_pwritev()

On Wed, Dec 23, 2020 at 12:06 AM Thomas Munro <thomas.munro@gmail.com> wrote:

On Mon, Dec 21, 2020 at 11:40 AM Andres Freund <andres@anarazel.de> wrote:

Can we come up with a better name than 'uio'? I find that a not exactly
meaningful name.

Ok, let's try port/pg_iovec.h.

I pushed it with that name, and a couple more cosmetic changes. I'll
keep an eye on the build farm.

#11Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#10)
Re: pg_preadv() and pg_pwritev()

On Mon, Jan 11, 2021 at 3:34 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Wed, Dec 23, 2020 at 12:06 AM Thomas Munro <thomas.munro@gmail.com> wrote:

On Mon, Dec 21, 2020 at 11:40 AM Andres Freund <andres@anarazel.de> wrote:

Can we come up with a better name than 'uio'? I find that a not exactly
meaningful name.

Ok, let's try port/pg_iovec.h.

I pushed it with that name, and a couple more cosmetic changes. I'll
keep an eye on the build farm.

Since only sifaka has managed to return a result so far (nice CPU), I
had plenty of time to notice that macOS Big Sur has introduced
preadv/pwritev. They were missing on Catalina[1]https://cirrus-ci.com/task/6002157537198080.

[1]: https://cirrus-ci.com/task/6002157537198080

#12Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#11)
Re: pg_preadv() and pg_pwritev()

On Mon, Jan 11, 2021 at 3:59 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Mon, Jan 11, 2021 at 3:34 PM Thomas Munro <thomas.munro@gmail.com> wrote:

I pushed it with that name, and a couple more cosmetic changes. I'll
keep an eye on the build farm.

Since only sifaka has managed to return a result so far (nice CPU), I
had plenty of time to notice that macOS Big Sur has introduced
preadv/pwritev. They were missing on Catalina[1].

The rest of buildfarm was OK with it too, but I learned of a small
problem through CI testing of another patch: it's not OK for
src/port/pwrite.c to do this:

if (part > 0)
elog(ERROR, "unexpectedly wrote more than requested");

... because now when I try to use pg_pwrite() in pg_test_fsync,
Windows fails to link:

libpgport.lib(pwrite.obj) : error LNK2019: unresolved external symbol
errstart referenced in function pg_pwritev_with_retry
[C:\projects\postgresql\pg_test_fsync.vcxproj]

I'll go and replace that with an assertion.

#13Sergey Shinderuk
s.shinderuk@postgrespro.ru
In reply to: Thomas Munro (#11)
Re: pg_preadv() and pg_pwritev()

On 11.01.2021 05:59, Thomas Munro wrote:

Since only sifaka has managed to return a result so far (nice CPU), I
had plenty of time to notice that macOS Big Sur has introduced
preadv/pwritev. They were missing on Catalina[1].

[1] https://cirrus-ci.com/task/6002157537198080

Hi, Thomas!

Indeed, pwritev is not available on macOS Catalina. So I get compiler
warnings about that:

/Users/shinderuk/src/pgwork/devel/build/../src/port/pwrite.c:117:10:
warning: 'pwritev' is only available on macOS 11.0 or newer
[-Wunguarded-availability-new]
part = pg_pwritev(fd, iov, iovcnt, offset);
^~~~~~~~~~
/Users/shinderuk/src/pgwork/devel/build/../src/include/port/pg_iovec.h:49:20:
note: expanded from macro 'pg_pwritev'
#define pg_pwritev pwritev
^~~~~~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk/usr/include/sys/uio.h:104:9:
note: 'pwritev' has been marked as being introduced in macOS 11.0 here,
but the deployment target is macOS
10.15.0
ssize_t pwritev(int, const struct iovec *, int, off_t)
__DARWIN_NOCANCEL(pwritev) __API_AVAILABLE(macos(11.0), ios(14.0),
watchos(7.0), tvos(14.0));
^
/Users/shinderuk/src/pgwork/devel/build/../src/port/pwrite.c:117:10:
note: enclose 'pwritev' in a __builtin_available check to silence this
warning
part = pg_pwritev(fd, iov, iovcnt, offset);
^~~~~~~~~~
/Users/shinderuk/src/pgwork/devel/build/../src/include/port/pg_iovec.h:49:20:
note: expanded from macro 'pg_pwritev'
#define pg_pwritev pwritev
^~~~~~~
1 warning generated.
(... several more warnings ...)

And initdb fails:

running bootstrap script ... dyld: lazy symbol binding failed: Symbol
not found: _pwritev
Referenced from: /Users/shinderuk/src/pgwork/devel/install/bin/postgres
Expected in: /usr/lib/libSystem.B.dylib

dyld: Symbol not found: _pwritev
Referenced from: /Users/shinderuk/src/pgwork/devel/install/bin/postgres
Expected in: /usr/lib/libSystem.B.dylib

Regards.

--
Sergey Shinderuk
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#14Thomas Munro
thomas.munro@gmail.com
In reply to: Sergey Shinderuk (#13)
Re: pg_preadv() and pg_pwritev()

On Wed, Jan 13, 2021 at 10:40 PM Sergey Shinderuk
<s.shinderuk@postgrespro.ru> wrote:

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk/usr/include/sys/uio.h:104:9:
note: 'pwritev' has been marked as being introduced in macOS 11.0 here,
but the deployment target is macOS
10.15.0
ssize_t pwritev(int, const struct iovec *, int, off_t)
__DARWIN_NOCANCEL(pwritev) __API_AVAILABLE(macos(11.0), ios(14.0),
watchos(7.0), tvos(14.0));
^

Hrm... So why did "configure" think you have pwritev, then? It seems
like you must have been using different compilers or options at
configure time and compile time, no?

#15Sergey Shinderuk
s.shinderuk@postgrespro.ru
In reply to: Thomas Munro (#14)
Re: pg_preadv() and pg_pwritev()

On 13.01.2021 12:56, Thomas Munro wrote:

On Wed, Jan 13, 2021 at 10:40 PM Sergey Shinderuk
<s.shinderuk@postgrespro.ru> wrote:

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk/usr/include/sys/uio.h:104:9:
note: 'pwritev' has been marked as being introduced in macOS 11.0 here,
but the deployment target is macOS
10.15.0
ssize_t pwritev(int, const struct iovec *, int, off_t)
__DARWIN_NOCANCEL(pwritev) __API_AVAILABLE(macos(11.0), ios(14.0),
watchos(7.0), tvos(14.0));
^

Hrm... So why did "configure" think you have pwritev, then? It seems
like you must have been using different compilers or options at
configure time and compile time, no?

No, i've just rerun configure from clean checkout without any options.
It does think that pwritev is available. I'll try to figure this out
later and come back to you. Thanks.

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sergey Shinderuk (#15)
Re: pg_preadv() and pg_pwritev()

Sergey Shinderuk <s.shinderuk@postgrespro.ru> writes:

On 13.01.2021 12:56, Thomas Munro wrote:

On Wed, Jan 13, 2021 at 10:40 PM Sergey Shinderuk
<s.shinderuk@postgrespro.ru> wrote:

note: 'pwritev' has been marked as being introduced in macOS 11.0 here,
but the deployment target is macOS 10.15.0

Hrm... So why did "configure" think you have pwritev, then? It seems
like you must have been using different compilers or options at
configure time and compile time, no?

No, i've just rerun configure from clean checkout without any options.
It does think that pwritev is available. I'll try to figure this out
later and come back to you. Thanks.

The symptoms sound consistent with using bleeding-edge Xcode on a
Catalina machine ... please report exact OS and Xcode versions.

I have a different complaint, using Big Sur and Xcode 12.3:

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: libpgport.a(pread.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: libpgport_shlib.a(pread_shlib.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: libpgport_srv.a(pread_srv.o) has no symbols

Looks like we need to be more careful about not including pread.c
in the build unless it actually has code to contribute.

regards, tom lane

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#16)
Re: pg_preadv() and pg_pwritev()

I wrote:

Sergey Shinderuk <s.shinderuk@postgrespro.ru> writes:

note: 'pwritev' has been marked as being introduced in macOS 11.0 here,
but the deployment target is macOS 10.15.0

The symptoms sound consistent with using bleeding-edge Xcode on a
Catalina machine ... please report exact OS and Xcode versions.

I can reproduce these warnings on Big Sur + Xcode 12.3 by doing

export MACOSX_DEPLOYMENT_TARGET=10.15

before building; however the executable runs anyway, which I guess
is unsurprising. AFAICS from config.log, configure has no idea
that anything is wrong.

(BTW, at least the rather-old version of ccache I'm using does not
seem to realize that that environment variable is significant;
I had to clear ~/.ccache to get consistent results.)

We've had issues before with weird results from Xcode versions
newer than the underlying OS. In the past we've been able to
work around that, but I'm not sure that I see a way here.

regards, tom lane

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#17)
Re: pg_preadv() and pg_pwritev()

Hmmm ... I can further report that on Catalina + Xcode 12.0,
everything seems fine. configure correctly detects that preadv
and pwritev aren't there:

configure:15161: checking for preadv
configure:15161: ccache gcc -o conftest -Wall -Wmissing-prototypes -Wpointer-ar\
ith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-a\
ttribute -Wformat-security -fno-strict-aliasing -fwrapv -Wno-unused-command-lin\
e-argument -g -O2 -isysroot /Applications/Xcode.app/Contents/Developer/Platform\
s/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -isysroot /Applications/Xcod\
e.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.s\
dk conftest.c -lz -lm >&5
Undefined symbols for architecture x86_64:
"_preadv", referenced from:
_main in conftest-fca7e9.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:15161: $? = 1

So I'm a little confused as to why this test is failing to fail
with (I assume) newer Xcode. Can we see the relevant part of
config.log on your machine?

regards, tom lane

#19Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#16)
1 attachment(s)
Re: pg_preadv() and pg_pwritev()

On Thu, Jan 14, 2021 at 5:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I have a different complaint, using Big Sur and Xcode 12.3:

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: libpgport.a(pread.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: libpgport_shlib.a(pread_shlib.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: libpgport_srv.a(pread_srv.o) has no symbols

Looks like we need to be more careful about not including pread.c
in the build unless it actually has code to contribute.

I did it that way because it made it easy to test different
combinations of the replacements on computers that do actually have
pwrite and pwritev, just by tweaking pg_config.h. Here's an attempt
to do it with AC_REPLACE_FUNCS, which avoids creating empty .o files.
It means that to test the replacements on modern systems you have to
tweak pg_config.h and also add the relevant .o files to LIBOBJS in
src/Makefile.global, but that seems OK.

Attachments:

0001-Move-our-p-read-write-v-replacements-into-their-own-.patchtext/x-patch; charset=US-ASCII; name=0001-Move-our-p-read-write-v-replacements-into-their-own-.patchDownload
From 431634523af21b30f4e08e627bb80acb243d9e56 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Thu, 14 Jan 2021 08:19:18 +1300
Subject: [PATCH] Move our p{read,write}v replacements into their own files.

macOS's ranlib issued a warning about an empty pread.o file with the previous
arrangement, on systems new enough to require no replacement functions.  Let's
go back to using configure's "replacement" system for including the .o in the
library only if it's needed, which requires splitting out the *v() functions to
their own files with appropriate names.  We'll also have to move the
_with_retry() wrapper somewhere else, because we always want that.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
---
 configure                     |  54 ++++++++++++++++-
 configure.ac                  |   8 +--
 src/backend/storage/file/fd.c |  65 +++++++++++++++++++++
 src/include/storage/fd.h      |   5 ++
 src/port/pread.c              |  43 +-------------
 src/port/preadv.c             |  58 ++++++++++++++++++
 src/port/pwrite.c             | 107 +---------------------------------
 src/port/pwritev.c            |  58 ++++++++++++++++++
 src/tools/msvc/Mkvcbuild.pm   |   2 +-
 9 files changed, 248 insertions(+), 152 deletions(-)
 create mode 100644 src/port/preadv.c
 create mode 100644 src/port/pwritev.c

diff --git a/configure b/configure
index b917a2a1c9..8af4b99021 100755
--- a/configure
+++ b/configure
@@ -15155,7 +15155,7 @@ fi
 LIBS_including_readline="$LIBS"
 LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
 
-for ac_func in backtrace_symbols clock_gettime copyfile fdatasync getifaddrs getpeerucred getrlimit kqueue mbstowcs_l memset_s poll posix_fallocate ppoll pread preadv pstat pthread_is_threaded_np pwrite pwritev readlink readv setproctitle setproctitle_fast setsid shm_open strchrnul strsignal symlink sync_file_range uselocale wcstombs_l writev
+for ac_func in backtrace_symbols clock_gettime copyfile fdatasync getifaddrs getpeerucred getrlimit kqueue mbstowcs_l memset_s poll posix_fallocate ppoll pstat pthread_is_threaded_np readlink readv setproctitle setproctitle_fast setsid shm_open strchrnul strsignal symlink sync_file_range uselocale wcstombs_l writev
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
@@ -15832,6 +15832,58 @@ esac
 
 fi
 
+ac_fn_c_check_func "$LINENO" "pread" "ac_cv_func_pread"
+if test "x$ac_cv_func_pread" = xyes; then :
+  $as_echo "#define HAVE_PREAD 1" >>confdefs.h
+
+else
+  case " $LIBOBJS " in
+  *" pread.$ac_objext "* ) ;;
+  *) LIBOBJS="$LIBOBJS pread.$ac_objext"
+ ;;
+esac
+
+fi
+
+ac_fn_c_check_func "$LINENO" "preadv" "ac_cv_func_preadv"
+if test "x$ac_cv_func_preadv" = xyes; then :
+  $as_echo "#define HAVE_PREADV 1" >>confdefs.h
+
+else
+  case " $LIBOBJS " in
+  *" preadv.$ac_objext "* ) ;;
+  *) LIBOBJS="$LIBOBJS preadv.$ac_objext"
+ ;;
+esac
+
+fi
+
+ac_fn_c_check_func "$LINENO" "pwrite" "ac_cv_func_pwrite"
+if test "x$ac_cv_func_pwrite" = xyes; then :
+  $as_echo "#define HAVE_PWRITE 1" >>confdefs.h
+
+else
+  case " $LIBOBJS " in
+  *" pwrite.$ac_objext "* ) ;;
+  *) LIBOBJS="$LIBOBJS pwrite.$ac_objext"
+ ;;
+esac
+
+fi
+
+ac_fn_c_check_func "$LINENO" "pwritev" "ac_cv_func_pwritev"
+if test "x$ac_cv_func_pwritev" = xyes; then :
+  $as_echo "#define HAVE_PWRITEV 1" >>confdefs.h
+
+else
+  case " $LIBOBJS " in
+  *" pwritev.$ac_objext "* ) ;;
+  *) LIBOBJS="$LIBOBJS pwritev.$ac_objext"
+ ;;
+esac
+
+fi
+
 ac_fn_c_check_func "$LINENO" "random" "ac_cv_func_random"
 if test "x$ac_cv_func_random" = xyes; then :
   $as_echo "#define HAVE_RANDOM 1" >>confdefs.h
diff --git a/configure.ac b/configure.ac
index 838d47dc22..868a94c9ba 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1661,12 +1661,8 @@ AC_CHECK_FUNCS(m4_normalize([
 	poll
 	posix_fallocate
 	ppoll
-	pread
-	preadv
 	pstat
 	pthread_is_threaded_np
-	pwrite
-	pwritev
 	readlink
 	readv
 	setproctitle
@@ -1740,6 +1736,10 @@ AC_REPLACE_FUNCS(m4_normalize([
 	inet_aton
 	link
 	mkdtemp
+	pread
+	preadv
+	pwrite
+	pwritev
 	random
 	srandom
 	strlcat
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 931ed67930..b58502837a 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -92,6 +92,7 @@
 #include "common/file_utils.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "port/pg_iovec.h"
 #include "portability/mem.h"
 #include "storage/fd.h"
 #include "storage/ipc.h"
@@ -3635,3 +3636,67 @@ data_sync_elevel(int elevel)
 {
 	return data_sync_retry ? elevel : PANIC;
 }
+
+/*
+ * A convenience wrapper for pg_pwritev() that retries on partial write.  If an
+ * error is returned, it is unspecified how much has been written.
+ */
+ssize_t
+pg_pwritev_with_retry(int fd, const struct iovec *iov, int iovcnt, off_t offset)
+{
+	struct iovec iov_copy[PG_IOV_MAX];
+	ssize_t		sum = 0;
+	ssize_t		part;
+
+	/* We'd better have space to make a copy, in case we need to retry. */
+	if (iovcnt > PG_IOV_MAX)
+	{
+		errno = EINVAL;
+		return -1;
+	}
+
+	for (;;)
+	{
+		/* Write as much as we can. */
+		part = pg_pwritev(fd, iov, iovcnt, offset);
+		if (part < 0)
+			return -1;
+
+#ifdef SIMULATE_SHORT_WRITE
+		part = Min(part, 4096);
+#endif
+
+		/* Count our progress. */
+		sum += part;
+		offset += part;
+
+		/* Step over iovecs that are done. */
+		while (iovcnt > 0 && iov->iov_len <= part)
+		{
+			part -= iov->iov_len;
+			++iov;
+			--iovcnt;
+		}
+
+		/* Are they all done? */
+		if (iovcnt == 0)
+		{
+			/* We don't expect the kernel to write more than requested. */
+			Assert(part == 0);
+			break;
+		}
+
+		/*
+		 * Move whatever's left to the front of our mutable copy and adjust
+		 * the leading iovec.
+		 */
+		Assert(iovcnt > 0);
+		memmove(iov_copy, iov, sizeof(*iov) * iovcnt);
+		Assert(iov->iov_len > part);
+		iov_copy[0].iov_base = (char *) iov_copy[0].iov_base + part;
+		iov_copy[0].iov_len -= part;
+		iov = iov_copy;
+	}
+
+	return sum;
+}
diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h
index f2662a96fd..2654ad109f 100644
--- a/src/include/storage/fd.h
+++ b/src/include/storage/fd.h
@@ -46,6 +46,7 @@
 #include <dirent.h>
 
 
+struct iovec;
 typedef int File;
 
 
@@ -161,6 +162,10 @@ extern int	durable_unlink(const char *fname, int loglevel);
 extern int	durable_rename_excl(const char *oldfile, const char *newfile, int loglevel);
 extern void SyncDataDirectory(void);
 extern int	data_sync_elevel(int elevel);
+extern ssize_t pg_pwritev_with_retry(int fd,
+									 const struct iovec *iov,
+									 int iovcnt,
+									 off_t offset);
 
 /* Filename components */
 #define PG_TEMP_FILES_DIR "pgsql_tmp"
diff --git a/src/port/pread.c b/src/port/pread.c
index a5ae2759fa..486f07a7df 100644
--- a/src/port/pread.c
+++ b/src/port/pread.c
@@ -1,7 +1,7 @@
 /*-------------------------------------------------------------------------
  *
  * pread.c
- *	  Implementation of pread[v](2) for platforms that lack one.
+ *	  Implementation of pread(2) for platforms that lack one.
  *
  * Portions Copyright (c) 1996-2021, PostgreSQL Global Development Group
  *
@@ -9,8 +9,7 @@
  *	  src/port/pread.c
  *
  * Note that this implementation changes the current file position, unlike
- * the POSIX function, so we use the name pg_pread().  Likewise for the
- * iovec version.
+ * the POSIX function, so we use the name pg_pread().
  *
  *-------------------------------------------------------------------------
  */
@@ -24,9 +23,6 @@
 #include <unistd.h>
 #endif
 
-#include "port/pg_iovec.h"
-
-#ifndef HAVE_PREAD
 ssize_t
 pg_pread(int fd, void *buf, size_t size, off_t offset)
 {
@@ -60,38 +56,3 @@ pg_pread(int fd, void *buf, size_t size, off_t offset)
 	return read(fd, buf, size);
 #endif
 }
-#endif
-
-#ifndef HAVE_PREADV
-ssize_t
-pg_preadv(int fd, const struct iovec *iov, int iovcnt, off_t offset)
-{
-#ifdef HAVE_READV
-	if (iovcnt == 1)
-		return pg_pread(fd, iov[0].iov_base, iov[0].iov_len, offset);
-	if (lseek(fd, offset, SEEK_SET) < 0)
-		return -1;
-	return readv(fd, iov, iovcnt);
-#else
-	ssize_t 	sum = 0;
-	ssize_t 	part;
-
-	for (int i = 0; i < iovcnt; ++i)
-	{
-		part = pg_pread(fd, iov[i].iov_base, iov[i].iov_len, offset);
-		if (part < 0)
-		{
-			if (i == 0)
-				return -1;
-			else
-				return sum;
-		}
-		sum += part;
-		offset += part;
-		if (part < iov[i].iov_len)
-			return sum;
-	}
-	return sum;
-#endif
-}
-#endif
diff --git a/src/port/preadv.c b/src/port/preadv.c
new file mode 100644
index 0000000000..29c808cd0c
--- /dev/null
+++ b/src/port/preadv.c
@@ -0,0 +1,58 @@
+/*-------------------------------------------------------------------------
+ *
+ * preadv.c
+ *	  Implementation of preadv(2) for platforms that lack one.
+ *
+ * Portions Copyright (c) 1996-2021, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/port/preadv.c
+ *
+ * Note that this implementation changes the current file position, unlike
+ * the POSIX-like function, so we use the name pg_preadv().
+ *
+ *-------------------------------------------------------------------------
+ */
+
+
+#include "postgres.h"
+
+#ifdef WIN32
+#include <windows.h>
+#else
+#include <unistd.h>
+#endif
+
+#include "port/pg_iovec.h"
+
+ssize_t
+pg_preadv(int fd, const struct iovec *iov, int iovcnt, off_t offset)
+{
+#ifdef HAVE_READV
+	if (iovcnt == 1)
+		return pg_pread(fd, iov[0].iov_base, iov[0].iov_len, offset);
+	if (lseek(fd, offset, SEEK_SET) < 0)
+		return -1;
+	return readv(fd, iov, iovcnt);
+#else
+	ssize_t 	sum = 0;
+	ssize_t 	part;
+
+	for (int i = 0; i < iovcnt; ++i)
+	{
+		part = pg_pread(fd, iov[i].iov_base, iov[i].iov_len, offset);
+		if (part < 0)
+		{
+			if (i == 0)
+				return -1;
+			else
+				return sum;
+		}
+		sum += part;
+		offset += part;
+		if (part < iov[i].iov_len)
+			return sum;
+	}
+	return sum;
+#endif
+}
diff --git a/src/port/pwrite.c b/src/port/pwrite.c
index a98343ec05..282b27115e 100644
--- a/src/port/pwrite.c
+++ b/src/port/pwrite.c
@@ -1,7 +1,7 @@
 /*-------------------------------------------------------------------------
  *
  * pwrite.c
- *	  Implementation of pwrite[v](2) for platforms that lack one.
+ *	  Implementation of pwrite(2) for platforms that lack one.
  *
  * Portions Copyright (c) 1996-2021, PostgreSQL Global Development Group
  *
@@ -9,8 +9,7 @@
  *	  src/port/pwrite.c
  *
  * Note that this implementation changes the current file position, unlike
- * the POSIX function, so we use the name pg_pwrite().  Likewise for the
- * iovec version.
+ * the POSIX function, so we use the name pg_pwrite().
  *
  *-------------------------------------------------------------------------
  */
@@ -24,9 +23,6 @@
 #include <unistd.h>
 #endif
 
-#include "port/pg_iovec.h"
-
-#ifndef HAVE_PWRITE
 ssize_t
 pg_pwrite(int fd, const void *buf, size_t size, off_t offset)
 {
@@ -57,102 +53,3 @@ pg_pwrite(int fd, const void *buf, size_t size, off_t offset)
 	return write(fd, buf, size);
 #endif
 }
-#endif
-
-#ifndef HAVE_PWRITEV
-ssize_t
-pg_pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset)
-{
-#ifdef HAVE_WRITEV
-	if (iovcnt == 1)
-		return pg_pwrite(fd, iov[0].iov_base, iov[0].iov_len, offset);
-	if (lseek(fd, offset, SEEK_SET) < 0)
-		return -1;
-	return writev(fd, iov, iovcnt);
-#else
-	ssize_t		sum = 0;
-	ssize_t		part;
-
-	for (int i = 0; i < iovcnt; ++i)
-	{
-		part = pg_pwrite(fd, iov[i].iov_base, iov[i].iov_len, offset);
-		if (part < 0)
-		{
-			if (i == 0)
-				return -1;
-			else
-				return sum;
-		}
-		sum += part;
-		offset += part;
-		if (part < iov[i].iov_len)
-			return sum;
-	}
-	return sum;
-#endif
-}
-#endif
-
-/*
- * A convenience wrapper for pg_pwritev() that retries on partial write.  If an
- * error is returned, it is unspecified how much has been written.
- */
-ssize_t
-pg_pwritev_with_retry(int fd, const struct iovec *iov, int iovcnt, off_t offset)
-{
-	struct iovec iov_copy[PG_IOV_MAX];
-	ssize_t		sum = 0;
-	ssize_t		part;
-
-	/* We'd better have space to make a copy, in case we need to retry. */
-	if (iovcnt > PG_IOV_MAX)
-	{
-		errno = EINVAL;
-		return -1;
-	}
-
-	for (;;)
-	{
-		/* Write as much as we can. */
-		part = pg_pwritev(fd, iov, iovcnt, offset);
-		if (part < 0)
-			return -1;
-
-#ifdef SIMULATE_SHORT_WRITE
-		part = Min(part, 4096);
-#endif
-
-		/* Count our progress. */
-		sum += part;
-		offset += part;
-
-		/* Step over iovecs that are done. */
-		while (iovcnt > 0 && iov->iov_len <= part)
-		{
-			part -= iov->iov_len;
-			++iov;
-			--iovcnt;
-		}
-
-		/* Are they all done? */
-		if (iovcnt == 0)
-		{
-			/* We don't expect the kernel to write more than requested. */
-			Assert(part == 0);
-			break;
-		}
-
-		/*
-		 * Move whatever's left to the front of our mutable copy and adjust
-		 * the leading iovec.
-		 */
-		Assert(iovcnt > 0);
-		memmove(iov_copy, iov, sizeof(*iov) * iovcnt);
-		Assert(iov->iov_len > part);
-		iov_copy[0].iov_base = (char *) iov_copy[0].iov_base + part;
-		iov_copy[0].iov_len -= part;
-		iov = iov_copy;
-	}
-
-	return sum;
-}
diff --git a/src/port/pwritev.c b/src/port/pwritev.c
new file mode 100644
index 0000000000..2e8ef7e378
--- /dev/null
+++ b/src/port/pwritev.c
@@ -0,0 +1,58 @@
+/*-------------------------------------------------------------------------
+ *
+ * pwritev.c
+ *	  Implementation of pwritev(2) for platforms that lack one.
+ *
+ * Portions Copyright (c) 1996-2021, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/port/pwritev.c
+ *
+ * Note that this implementation changes the current file position, unlike
+ * the POSIX-like function, so we use the name pg_pwritev().
+ *
+ *-------------------------------------------------------------------------
+ */
+
+
+#include "postgres.h"
+
+#ifdef WIN32
+#include <windows.h>
+#else
+#include <unistd.h>
+#endif
+
+#include "port/pg_iovec.h"
+
+ssize_t
+pg_pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset)
+{
+#ifdef HAVE_WRITEV
+	if (iovcnt == 1)
+		return pg_pwrite(fd, iov[0].iov_base, iov[0].iov_len, offset);
+	if (lseek(fd, offset, SEEK_SET) < 0)
+		return -1;
+	return writev(fd, iov, iovcnt);
+#else
+	ssize_t		sum = 0;
+	ssize_t		part;
+
+	for (int i = 0; i < iovcnt; ++i)
+	{
+		part = pg_pwrite(fd, iov[i].iov_base, iov[i].iov_len, offset);
+		if (part < 0)
+		{
+			if (i == 0)
+				return -1;
+			else
+				return sum;
+		}
+		sum += part;
+		offset += part;
+		if (part < iov[i].iov_len)
+			return sum;
+	}
+	return sum;
+#endif
+}
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 7f014a12c9..535b67e668 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -99,7 +99,7 @@ sub mkvcbuild
 	  srandom.c getaddrinfo.c gettimeofday.c inet_net_ntop.c kill.c open.c
 	  erand48.c snprintf.c strlcat.c strlcpy.c dirmod.c noblock.c path.c
 	  dirent.c dlopen.c getopt.c getopt_long.c link.c
-	  pread.c pwrite.c pg_bitutils.c
+	  pread.c preadv.c pwrite.c pwritev.c pg_bitutils.c
 	  pg_strong_random.c pgcheckdir.c pgmkdirp.c pgsleep.c pgstrcasecmp.c
 	  pqsignal.c mkdtemp.c qsort.c qsort_arg.c quotes.c system.c
 	  strerror.c tar.c thread.c
-- 
2.20.1

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#19)
Re: pg_preadv() and pg_pwritev()

Thomas Munro <thomas.munro@gmail.com> writes:

On Thu, Jan 14, 2021 at 5:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Looks like we need to be more careful about not including pread.c
in the build unless it actually has code to contribute.

I did it that way because it made it easy to test different
combinations of the replacements on computers that do actually have
pwrite and pwritev, just by tweaking pg_config.h. Here's an attempt
to do it with AC_REPLACE_FUNCS, which avoids creating empty .o files.
It means that to test the replacements on modern systems you have to
tweak pg_config.h and also add the relevant .o files to LIBOBJS in
src/Makefile.global, but that seems OK.

Yeah, this looks better. Two gripes, one major and one minor:

* You need to remove pread.o and pwrite.o from the hard-wired
part of the list in src/port/Makefile, else they get built
whether needed or not.

* I don't much like this in fd.h:

@@ -46,6 +46,7 @@
#include <dirent.h>

+struct iovec;
typedef int File;

because it makes it look like iovec and File are of similar
status, which they hardly are. Perhaps more like

 #include <dirent.h>
+ 
+struct iovec;			/* avoid including sys/uio.h here */

typedef int File;

I confirm clean builds on Big Sur and Catalina with this.

regards, tom lane

#21Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#20)
Re: pg_preadv() and pg_pwritev()

On Thu, Jan 14, 2021 at 9:26 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

* You need to remove pread.o and pwrite.o from the hard-wired
part of the list in src/port/Makefile, else they get built
whether needed or not.

Right, done.

* I don't much like this in fd.h:

@@ -46,6 +46,7 @@
#include <dirent.h>

+struct iovec;
typedef int File;

because it makes it look like iovec and File are of similar
status, which they hardly are. Perhaps more like

#include <dirent.h>
+
+struct iovec;                  /* avoid including sys/uio.h here */

Done, except I wrote port/pg_iovec.h.

I confirm clean builds on Big Sur and Catalina with this.

Thanks for checking. I also checked on Windows via CI. Pushed.

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#18)
1 attachment(s)
Re: pg_preadv() and pg_pwritev()

I wrote:

So I'm a little confused as to why this test is failing to fail
with (I assume) newer Xcode. Can we see the relevant part of
config.log on your machine?

After further digging I believe I understand what's happening,
and it's a bit surprising we've not been bit by it before.
If the compiler believes (thanks to __API_AVAILABLE macros in
Apple's system headers) that a given library symbol might not
exist in the lowest macOS version it is compiling for, it will
still emit a normal call to that function ... but it also emits

.weak_reference _preadv

marking the call as a weak reference. If the linker then fails
to link that call, it doesn't throw an error, it just replaces
the call instruction with a NOP :-(. This is why configure's
test appears to succeed, since it only checks whether you can
link not whether the call would work at runtime. Apple's
assumption evidently is that you'll guard the call with a run-time
check to see if the function exists before you use it, and you
don't want your link to fail if it doesn't.

The solution to this, according to "man ld", is

-no_weak_imports
Error if any symbols are weak imports (i.e. allowed to be
unresolved (NULL) at runtime). Useful for config based
projects that assume they are built and run on the same OS
version.

I don't particularly care that Apple is looking down their nose
at people who don't want to make their builds run on multiple OS
versions, so I think we should just use this and call it good.

Attached is an untested quick hack to make that happen --- Sergey,
can you verify that this fixes configure's results on your setup?

(This is not quite committable as-is, it needs something to avoid
adding -Wl,-no_weak_imports on ancient macOS versions. But it
will do to see if the fix works on modern versions.)

regards, tom lane

Attachments:

forbid-weak-linking-on-macos.patchtext/x-diff; charset=us-ascii; name=forbid-weak-linking-on-macos.patchDownload
diff --git a/src/template/darwin b/src/template/darwin
index 32414d21a9..22d04e7ba7 100644
--- a/src/template/darwin
+++ b/src/template/darwin
@@ -17,6 +17,9 @@ if test x"$PG_SYSROOT" != x"" ; then
   fi
 fi
 
+# Disable weak linking, else configure's checks will misbehave
+LDFLAGS="-Wl,-no_weak_imports $LDFLAGS"
+
 # Extra CFLAGS for code that will go into a shared library
 CFLAGS_SL=""
 
#23Sergey Shinderuk
s.shinderuk@postgrespro.ru
In reply to: Tom Lane (#22)
2 attachment(s)
Re: pg_preadv() and pg_pwritev()

Tom Lane wrote:

The symptoms sound consistent with using bleeding-edge Xcode on a
Catalina machine ... please report exact OS and Xcode versions.

macOS 10.15.7 (19H2)
Xcode 12.3 (12C33)
macOS SDK 11.1 (20C63)

Attached is an untested quick hack to make that happen --- Sergey,
can you verify that this fixes configure's results on your setup?

"-no_weak_imports" doesn't help.

configure:15161: checking for pwritev
configure:15161: gcc -o conftest -Wall -Wmissing-prototypes
-Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -Wno-unused-command-line-argument -O2 -isysroot
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
-Wl,-no_weak_imports -isysroot
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
conftest.c -lz -lm >&5
configure:15161: $? = 0
configure:15161: result: yes

Then I get the same compiler warnings about pwritev and an unrelated
link error:

ld: weak import of symbol '___darwin_check_fd_set_overflow' not
supported because of option: -no_weak_imports for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see
invocation)
make[2]: *** [postgres] Error 1
make[1]: *** [all-backend-recurse] Error 2
make: *** [all-src-recurse] Error 2

Please see the logs attached.

Attachments:

config.logtext/plain; charset=UTF-8; name=config.log; x-mac-creator=0; x-mac-type=0Download
make.logtext/plain; charset=UTF-8; name=make.log; x-mac-creator=0; x-mac-type=0Download
#24Sergey Shinderuk
s.shinderuk@postgrespro.ru
In reply to: Sergey Shinderuk (#23)
Re: pg_preadv() and pg_pwritev()

The symptoms sound consistent with using bleeding-edge Xcode on a
Catalina machine ... please report exact OS and Xcode versions.

macOS 10.15.7 (19H2)
Xcode 12.3 (12C33)
macOS SDK 11.1 (20C63)

Everything is fine if I run "configure" with
PG_SYSROOT=/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk

I noticed that "cc" invoked from command line uses:
-isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk

But "xcodebuild -version -sdk macosx Path" invoked by "configure" when
PG_SYSROOT is not provided gives:
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk

Now I'm confused about different SDK versions and locations used by
Xcode and CommandLineTools :)

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sergey Shinderuk (#24)
Re: pg_preadv() and pg_pwritev()

Sergey Shinderuk <s.shinderuk@postgrespro.ru> writes:

I noticed that "cc" invoked from command line uses:
-isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk

Hm, how did you determine that exactly?

regards, tom lane

#26Sergey Shinderuk
s.shinderuk@postgrespro.ru
In reply to: Tom Lane (#25)
Re: pg_preadv() and pg_pwritev()

On 14.01.2021 18:42, Tom Lane wrote:

I noticed that "cc" invoked from command line uses:
-isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk

Hm, how did you determine that exactly?

% echo 'int main(void){}' >tmp.c
% cc -v tmp.c
Apple clang version 12.0.0 (clang-1200.0.32.28)
Target: x86_64-apple-darwin19.6.0
Thread model: posix
InstalledDir:
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

"/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang"
-cc1 -triple x86_64-apple-macosx10.15.0 -Wdeprecated-objc-isa-usage
-Werror=deprecated-objc-isa-usage -Werror=implicit-function-declaration
-emit-obj -mrelax-all -disable-free -disable-llvm-verifier
-discard-value-names -main-file-name tmp.c -mrelocation-model pic
-pic-level 2 -mthread-model posix -mframe-pointer=all -fno-strict-return
-masm-verbose -munwind-tables -target-sdk-version=10.15.6
-fcompatibility-qualified-id-block-type-checking -target-cpu penryn
-dwarf-column-info -debugger-tuning=lldb -target-linker-version 609.8 -v
-resource-dir
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/12.0.0
-isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk
-I/usr/local/include -internal-isystem
/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/local/include
-internal-isystem
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/12.0.0/include
-internal-externc-isystem
/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include
-internal-externc-isystem
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include
-Wno-reorder-init-list -Wno-implicit-int-float-conversion
-Wno-c99-designator -Wno-final-dtor-non-final-class -Wno-extra-semi-stmt
-Wno-misleading-indentation -Wno-quoted-include-in-framework-header
-Wno-implicit-fallthrough -Wno-enum-enum-conversion
-Wno-enum-float-conversion -fdebug-compilation-dir /Users/shinderuk
-ferror-limit 19 -fmessage-length 238 -stack-protector 1 -fstack-check
-mdarwin-stkchk-strong-link -fblocks -fencode-extended-block-signature
-fregister-global-dtors-with-atexit -fgnuc-version=4.2.1
-fobjc-runtime=macosx-10.15.0 -fmax-type-align=16
-fdiagnostics-show-option -fcolor-diagnostics -o
/var/folders/8x/jvqv7hyd5h98m7tz2zm9r0yc0000gn/T/tmp-91fb5e.o -x c tmp.c
clang -cc1 version 12.0.0 (clang-1200.0.32.28) default target
x86_64-apple-darwin19.6.0
ignoring nonexistent directory
"/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/local/include"
ignoring nonexistent directory
"/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/Library/Frameworks"
#include "..." search starts here:
#include <...> search starts here:
/usr/local/include

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/12.0.0/include
/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include

/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks
(framework directory)
End of search list.

"/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld"
-demangle -lto_library
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/libLTO.dylib
-no_deduplicate -dynamic -arch x86_64 -platform_version macos 10.15.0
10.15.6 -syslibroot
/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -o a.out
-L/usr/local/lib
/var/folders/8x/jvqv7hyd5h98m7tz2zm9r0yc0000gn/T/tmp-91fb5e.o -lSystem
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/12.0.0/lib/darwin/libclang_rt.osx.a

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sergey Shinderuk (#26)
Re: pg_preadv() and pg_pwritev()

Sergey Shinderuk <s.shinderuk@postgrespro.ru> writes:

On 14.01.2021 18:42, Tom Lane wrote:

I noticed that "cc" invoked from command line uses:
-isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk

Hm, how did you determine that exactly?

% cc -v tmp.c
...
-isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk

Okay, interesting. On my Catalina machine, I see

-isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk

which is also a 10.15 SDK, since I haven't upgraded Xcode past 12.0.
I wonder if that would change if I did upgrade (but I don't plan to
risk it, since this is my only remaining Catalina install).

After considerable playing around, I'm guessing that the reason
-no_weak_imports doesn't help is that it rejects calls that are
marked as weak references on the *calling* side. Since AC_CHECK_FUNCS
doesn't bother to #include the relevant header file, the compiler
doesn't know that preadv() ought to be marked as a weak reference.
Then, when the test program gets linked against the stub libc that's
provided by the SDK, there is a version of preadv() there so no link
failure occurs. (There are way more moving parts in this weak-reference
thing than I'd realized.)

It seems like the more productive approach would be to try to identify
the right sysroot to use. I wonder if there is some less messy way
to find out the compiler's default sysroot than to scrape it out of
-v output.

Another thing I've been realizing while poking at this is that we
might not need to set -isysroot explicitly at all, which would then
lead to the compiler using its default sysroot automatically.
In some experimentation, it seems like what we need PG_SYSROOT for
is just for configure to be able to find tclConfig.sh and the Perl
header files. So at this point I'm tempted to try ripping that
out altogether. If you remove the lines in src/template/darwin
that inject PG_SYSROOT into CPPFLAGS and LDFLAGS, do things
work for you?

regards, tom lane

#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#27)
Re: pg_preadv() and pg_pwritev()

I wrote:

It seems like the more productive approach would be to try to identify
the right sysroot to use. I wonder if there is some less messy way
to find out the compiler's default sysroot than to scrape it out of
-v output.

This is, of course, not terribly well documented by Apple. But
Mr. Google suggests that "xcrun --show-sdk-path" might serve.
What does that print for you?

regards, tom lane

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#28)
Re: pg_preadv() and pg_pwritev()

I borrowed my wife's Mac, which is still on Catalina and up to now
never had Xcode on it, and found some very interesting things.

Step 1: download/install Xcode 12.3, open it, agree to license,
wait for it to finish "installing components".

At this point, /Library/Developer/CommandLineTools doesn't exist,
and we have the following outputs from various probe commands:

% xcrun --show-sdk-path
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk
% xcrun --sdk macosx --show-sdk-path
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
% xcodebuild -version -sdk macosx Path
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk

Also, cc -v reports
-isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk

Unsurprisingly, Xcode 12.3 itself only contains

% ls -l /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs
total 0
drwxr-xr-x 5 root wheel 160 Nov 30 07:27 DriverKit20.2.sdk
drwxr-xr-x 7 root wheel 224 Nov 30 07:27 MacOSX.sdk
lrwxr-xr-x 1 root wheel 10 Jan 14 15:57 MacOSX11.1.sdk -> MacOSX.sdk

Step 2: install command line tools (I used "xcode-select --install"
to fire this off, rather than the Xcode menu item).

Now I have

% ls -l /Library/Developer/CommandLineTools/SDKs
total 0
lrwxr-xr-x 1 root wheel 14 Jan 14 16:42 MacOSX.sdk -> MacOSX11.1.sdk
drwxr-xr-x 8 root wheel 256 Jul 9 2020 MacOSX10.15.sdk
drwxr-xr-x 7 root wheel 224 Nov 30 07:33 MacOSX11.1.sdk

which is pretty interesting in itself, because the same directory on
my recently-updated-to-Big-Sur Macs does NOT have the 11.1 SDK.
I wonder what determines which versions get installed here.

More interesting yet:

% xcrun --show-sdk-path
/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk
% xcrun --sdk macosx --show-sdk-path
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
% xcodebuild -version -sdk macosx Path
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk

and cc -v reports
-isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk

So apparently, "xcrun --show-sdk-path" (without any -sdk option)
is the most authoritative guide to the compiler's default sysroot.

However, googling turns up various people reporting that "xcrun
--show-sdk-path" returns an empty string for them, and our last
major investigation into this [1]/messages/by-id/20840.1537850987@sss.pgh.pa.us found that there are some system
states where the compiler appears to have no default sysroot,
which I bet is the same thing. I do not at this point have a recipe
to reproduce such a state, but we'd be fools to imagine it's no
longer possible. My guess about it is that Apple's processes for
updating the default sysroot during system updates are just plain
buggy, with various corner cases that have ill-understood causes.

Also, after re-reading [1]/messages/by-id/20840.1537850987@sss.pgh.pa.us I am not at all excited about trying to
remove the -isysroot switches from our *FLAGS. What I propose to do
is keep that, but improve our mechanism for choosing a default value
for PG_SYSROOT. It looks like first trying "xcrun --show-sdk-path",
and falling back to "xcodebuild -version -sdk macosx Path" if that
doesn't yield a valid path, is more likely to give a working build
than relying entirely on xcodebuild. Maybe there's a case for trying
"xcrun --sdk macosx --show-sdk-path" in between; in my tests that
seemed noticeably faster than invoking xcodebuild, and I've not yet
seen a case where it gave a different answer.

Thoughts?

regards, tom lane

[1]: /messages/by-id/20840.1537850987@sss.pgh.pa.us

#30Sergey Shinderuk
s.shinderuk@postgrespro.ru
In reply to: Tom Lane (#29)
Re: pg_preadv() and pg_pwritev()

On 15.01.2021 01:13, Tom Lane wrote:

I borrowed my wife's Mac, which is still on Catalina and up to now
never had Xcode on it, and found some very interesting things.

Step 1: download/install Xcode 12.3, open it, agree to license,
wait for it to finish "installing components".

At this point, /Library/Developer/CommandLineTools doesn't exist,
and we have the following outputs from various probe commands:

% xcrun --show-sdk-path
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk
% xcrun --sdk macosx --show-sdk-path
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
% xcodebuild -version -sdk macosx Path
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk

Also, cc -v reports
-isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk

Unsurprisingly, Xcode 12.3 itself only contains

% ls -l /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs
total 0
drwxr-xr-x 5 root wheel 160 Nov 30 07:27 DriverKit20.2.sdk
drwxr-xr-x 7 root wheel 224 Nov 30 07:27 MacOSX.sdk
lrwxr-xr-x 1 root wheel 10 Jan 14 15:57 MacOSX11.1.sdk -> MacOSX.sdk

Step 2: install command line tools (I used "xcode-select --install"
to fire this off, rather than the Xcode menu item).

Now I have

% ls -l /Library/Developer/CommandLineTools/SDKs
total 0
lrwxr-xr-x 1 root wheel 14 Jan 14 16:42 MacOSX.sdk -> MacOSX11.1.sdk
drwxr-xr-x 8 root wheel 256 Jul 9 2020 MacOSX10.15.sdk
drwxr-xr-x 7 root wheel 224 Nov 30 07:33 MacOSX11.1.sdk

which is pretty interesting in itself, because the same directory on
my recently-updated-to-Big-Sur Macs does NOT have the 11.1 SDK.
I wonder what determines which versions get installed here.

More interesting yet:

% xcrun --show-sdk-path
/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk
% xcrun --sdk macosx --show-sdk-path
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
% xcodebuild -version -sdk macosx Path
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk

and cc -v reports
-isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk

So apparently, "xcrun --show-sdk-path" (without any -sdk option)
is the most authoritative guide to the compiler's default sysroot.

However, googling turns up various people reporting that "xcrun
--show-sdk-path" returns an empty string for them, and our last
major investigation into this [1] found that there are some system
states where the compiler appears to have no default sysroot,
which I bet is the same thing. I do not at this point have a recipe
to reproduce such a state, but we'd be fools to imagine it's no
longer possible. My guess about it is that Apple's processes for
updating the default sysroot during system updates are just plain
buggy, with various corner cases that have ill-understood causes.

Also, after re-reading [1] I am not at all excited about trying to
remove the -isysroot switches from our *FLAGS. What I propose to do
is keep that, but improve our mechanism for choosing a default value
for PG_SYSROOT. It looks like first trying "xcrun --show-sdk-path",
and falling back to "xcodebuild -version -sdk macosx Path" if that
doesn't yield a valid path, is more likely to give a working build
than relying entirely on xcodebuild. Maybe there's a case for trying
"xcrun --sdk macosx --show-sdk-path" in between; in my tests that
seemed noticeably faster than invoking xcodebuild, and I've not yet
seen a case where it gave a different answer.

Thoughts?

regards, tom lane

[1] /messages/by-id/20840.1537850987@sss.pgh.pa.us

Thanks for thorough investigation and sorry for the late reply.

I spent quite some time trying to understand / reverse engineer the
logic behind xcrun's default SDK selection. Apparently, "man xcrun" is
not accurate saying:

The SDK which will be searched defaults to the most recent
available...

I didn't find anything really useful or helpful.
"/Library/Developer/CommandLineTools" is hardcoded into
"libxcrun.dylib". On my machine xcrun scans

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs

and

/Library/Developer/CommandLineTools/SDKs

in that order, and loads "SDKSettings.plist" from each subdirectory. I
looked into plists, but couldn't find anything special about
"MacOSX10.15.sdk".

Okay, here is what I have:

% ls -l
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs
total 0
drwxr-xr-x 5 root wheel 160 Nov 30 15:27 DriverKit20.2.sdk
drwxr-xr-x 7 root wheel 224 Nov 30 15:27 MacOSX.sdk
lrwxr-xr-x 1 root wheel 10 Dec 17 14:25 MacOSX11.1.sdk -> MacOSX.sdk

% ls -l /Library/Developer/CommandLineTools/SDKs
total 0
lrwxr-xr-x 1 root wheel 14 Nov 17 02:21 MacOSX.sdk -> MacOSX11.0.sdk
drwxr-xr-x 8 root wheel 256 Nov 17 02:22 MacOSX10.15.sdk
drwxr-xr-x 7 root wheel 224 Oct 19 23:39 MacOSX11.0.sdk

% xcrun --show-sdk-path
/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk

% xcrun --sdk macosx --show-sdk-path
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk

Oh, that's weird! Nevertheless I like you suggestion to call "xcrun"
from "configure".

Adding "--verbose" doesn't really explain anything, but just in case.

% xcrun --verbose --no-cache --find cc
xcrun: note: PATH =
'/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin'
xcrun: note: SDKROOT =
'/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk'
xcrun: note: TOOLCHAINS = ''
xcrun: note: DEVELOPER_DIR = '/Applications/Xcode.app/Contents/Developer'
xcrun: note: XCODE_DEVELOPER_USR_PATH = ''
xcrun: note: xcrun_db =
'/var/folders/8x/jvqv7hyd5h98m7tz2zm9r0yc0000gn/T/xcrun_db'
xcrun: note: xcrun via cc (xcrun)
xcrun: note: database key is:
cc|/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk||/Applications/Xcode.app/Contents/Developer|
xcrun: note: looking up with
'/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk
macosx -find cc 2> /dev/null'
xcrun: note: lookup resolved with 'xcodebuild -find' to
'/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc'
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc

% xcrun --verbose --no-cache --sdk macosx --find cc
xcrun: note: looking up SDK with
'/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk
macosx -version Path'
xcrun: note: PATH =
'/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin'
xcrun: note: SDKROOT = 'macosx'
xcrun: note: TOOLCHAINS = ''
xcrun: note: DEVELOPER_DIR = '/Applications/Xcode.app/Contents/Developer'
xcrun: note: XCODE_DEVELOPER_USR_PATH = ''
xcrun: note: xcrun_db =
'/var/folders/8x/jvqv7hyd5h98m7tz2zm9r0yc0000gn/T/xcrun_db'
xcrun: note: lookup resolved to:
'/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk'
xcrun: note: looking up SDK with
'/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
-version PlatformPath'
xcrun: note: PATH =
'/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin'
xcrun: note: SDKROOT =
'/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk'
xcrun: note: TOOLCHAINS = ''
xcrun: note: DEVELOPER_DIR = '/Applications/Xcode.app/Contents/Developer'
xcrun: note: XCODE_DEVELOPER_USR_PATH = ''
xcrun: note: xcrun_db =
'/var/folders/8x/jvqv7hyd5h98m7tz2zm9r0yc0000gn/T/xcrun_db'
xcrun: note: lookup resolved to:
'/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform'
xcrun: note: PATH =
'/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin'
xcrun: note: SDKROOT =
'/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk'
xcrun: note: TOOLCHAINS = ''
xcrun: note: DEVELOPER_DIR = '/Applications/Xcode.app/Contents/Developer'
xcrun: note: XCODE_DEVELOPER_USR_PATH = ''
xcrun: note: xcrun_db =
'/var/folders/8x/jvqv7hyd5h98m7tz2zm9r0yc0000gn/T/xcrun_db'
xcrun: note: xcrun via cc (xcrun)
xcrun: note: database key is:
cc|/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk||/Applications/Xcode.app/Contents/Developer|
xcrun: note: looking up with
'/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
-find cc 2> /dev/null'
xcrun: note: lookup resolved with 'xcodebuild -find' to
'/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc'
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc

#31Sergey Shinderuk
s.shinderuk@postgrespro.ru
In reply to: Tom Lane (#27)
Re: pg_preadv() and pg_pwritev()

On 14.01.2021 21:05, Tom Lane wrote:

After considerable playing around, I'm guessing that the reason
-no_weak_imports doesn't help is that it rejects calls that are
marked as weak references on the *calling* side. Since AC_CHECK_FUNCS
doesn't bother to #include the relevant header file, the compiler
doesn't know that preadv() ought to be marked as a weak reference.
Then, when the test program gets linked against the stub libc that's
provided by the SDK, there is a version of preadv() there so no link
failure occurs. (There are way more moving parts in this weak-reference
thing than I'd realized.)

Oh, that's interesting. I've just played with it a bit and it looks
exactly as you say.

Another thing I've been realizing while poking at this is that we
might not need to set -isysroot explicitly at all, which would then
lead to the compiler using its default sysroot automatically.
In some experimentation, it seems like what we need PG_SYSROOT for
is just for configure to be able to find tclConfig.sh and the Perl
header files. So at this point I'm tempted to try ripping that
out altogether. If you remove the lines in src/template/darwin
that inject PG_SYSROOT into CPPFLAGS and LDFLAGS, do things
work for you?

Yes, it works fine.

#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sergey Shinderuk (#30)
1 attachment(s)
Re: pg_preadv() and pg_pwritev()

Sergey Shinderuk <s.shinderuk@postgrespro.ru> writes:

On 15.01.2021 01:13, Tom Lane wrote:

Also, after re-reading [1] I am not at all excited about trying to
remove the -isysroot switches from our *FLAGS. What I propose to do
is keep that, but improve our mechanism for choosing a default value
for PG_SYSROOT. It looks like first trying "xcrun --show-sdk-path",
and falling back to "xcodebuild -version -sdk macosx Path" if that
doesn't yield a valid path, is more likely to give a working build
than relying entirely on xcodebuild. Maybe there's a case for trying
"xcrun --sdk macosx --show-sdk-path" in between; in my tests that
seemed noticeably faster than invoking xcodebuild, and I've not yet
seen a case where it gave a different answer.

I spent quite some time trying to understand / reverse engineer the
logic behind xcrun's default SDK selection.

Yeah, I wasted a fair amount of time on that too, going so far as
to ktrace xcrun (as I gather you did too). I'm not any more
enlightened than you are about exactly how it's making the choice.

Oh, that's weird! Nevertheless I like you suggestion to call "xcrun"
from "configure".

Anyway, after re-reading the previous thread, something I like about
the current behavior is that it tends to produce a version-numbered
sysroot path, ie something ending in "MacOSX11.1.sdk" or whatever.
One of the hazards we're trying to avoid is some parts of a PG
installation being built against one SDK version while other parts are
built against another. The typical behavior of "xcrun --show-sdk-path"
seems to be to produce a path ending in "MacOSX.sdk", which defeats that.
So I think we should accept the path only if it contains a version number,
and otherwise move on to the other probe commands.

Hence, I propose the attached. This works as far as I can tell
to fix the problem you're seeing.

regards, tom lane

Attachments:

fix-sysroot-selection.patchtext/x-diff; charset=us-ascii; name=fix-sysroot-selection.patchDownload
diff --git a/src/template/darwin b/src/template/darwin
index 32414d21a9..0c890482fe 100644
--- a/src/template/darwin
+++ b/src/template/darwin
@@ -3,11 +3,26 @@
 # Note: Darwin is the original code name for macOS, also known as OS X.
 # We still use "darwin" as the port name, partly because config.guess does.
 
-# Select where system include files should be sought.
+# Select where system include files should be sought, if user didn't say.
 if test x"$PG_SYSROOT" = x"" ; then
-  PG_SYSROOT=`xcodebuild -version -sdk macosx Path 2>/dev/null`
+  # This is far more complicated than it ought to be.  We first ask
+  # "xcrun --show-sdk-path", which seems to match the default -isysroot
+  # setting of Apple's compilers.  However, that may produce no result or
+  # a result that is not version-specific (i.e., just ".../SDKs/MacOSX.sdk").
+  # Having a version-specific sysroot seems desirable, so if there are not
+  # digits in the result string, try "xcrun --sdk macosx --show-sdk-path";
+  # and if that still doesn't work, fall back to asking xcodebuild.
+  PG_SYSROOT=`xcrun --show-sdk-path 2>/dev/null`
+  if expr x"$PG_SYSROOT" : '.*[0-9]\.[0-9]' >/dev/null ; then : okay
+  else
+    PG_SYSROOT=`xcrun --sdk macosx --show-sdk-path 2>/dev/null`
+    if expr x"$PG_SYSROOT" : '.*[0-9]\.[0-9]' >/dev/null ; then : okay
+    else
+      PG_SYSROOT=`xcodebuild -version -sdk macosx Path 2>/dev/null`
+    fi
+  fi
 fi
-# Old xcodebuild versions may produce garbage, so validate the result.
+# Validate the result: if it doesn't point at a directory, ignore it.
 if test x"$PG_SYSROOT" != x"" ; then
   if test -d "$PG_SYSROOT" ; then
     CPPFLAGS="-isysroot $PG_SYSROOT $CPPFLAGS"
#33Sergey Shinderuk
s.shinderuk@postgrespro.ru
In reply to: Tom Lane (#29)
Re: pg_preadv() and pg_pwritev()

On 15.01.2021 01:13, Tom Lane wrote:

than relying entirely on xcodebuild. Maybe there's a case for trying
"xcrun --sdk macosx --show-sdk-path" in between; in my tests that
seemed noticeably faster than invoking xcodebuild, and I've not yet
seen a case where it gave a different answer.

I see that "xcrun --sdk macosx --show-sdk-path" really calls
"/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk
macosx -version Path" under the hood.

% lldb -- xcrun --no-cache --sdk macosx --show-sdk-path
(lldb) target create "xcrun"
Current executable set to 'xcrun' (x86_64).
(lldb) settings set -- target.run-args "--no-cache" "--sdk" "macosx"
"--show-sdk-path"
(lldb) settings set target.unset-env-vars SDKROOT
(lldb) b popen
Breakpoint 1: where = libsystem_c.dylib`popen, address = 0x00007fff67265607
(lldb) r
Process 26857 launched: '/usr/bin/xcrun' (x86_64)
Process 26857 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x00007fff6e313607 libsystem_c.dylib`popen
libsystem_c.dylib`popen:
-> 0x7fff6e313607 <+0>: pushq %rbp
0x7fff6e313608 <+1>: movq %rsp, %rbp
0x7fff6e31360b <+4>: pushq %r15
0x7fff6e31360d <+6>: pushq %r14
Target 0: (xcrun) stopped.
(lldb) p (char *)$arg1
(char *) $1 = 0x0000000100406960
"/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk
macosx -version Path"

% sudo dtrace -n 'pid$target::popen:entry { trace(copyinstr(arg0)) }' -c
'xcrun --sdk macosx --show-sdk-path'
dtrace: description 'pid$target::popen:entry ' matched 1 probe
CPU ID FUNCTION:NAME
0 413269 popen:entry
/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk
macosx -version Path
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
dtrace: pid 26905 has exited

#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sergey Shinderuk (#33)
Re: pg_preadv() and pg_pwritev()

Sergey Shinderuk <s.shinderuk@postgrespro.ru> writes:

I see that "xcrun --sdk macosx --show-sdk-path" really calls
"/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk
macosx -version Path" under the hood.

Hmm. I found something odd on my wife's Mac: although on my other
machines, I get something like

$ xcrun --verbose --no-cache --show-sdk-path
xcrun: note: looking up SDK with '/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -version PlatformPath'
xcrun: note: PATH = '/Users/tgl/testversion/bin:/usr/local/autoconf-2.69/bin:/Users/tgl/bin:/usr/local/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/X11/bin:/Library/Apple/usr/bin:/Library/Tcl/bin:/opt/X11/bin'
xcrun: note: SDKROOT = '/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk'
xcrun: note: TOOLCHAINS = ''
xcrun: note: DEVELOPER_DIR = '/Applications/Xcode.app/Contents/Developer'
xcrun: note: XCODE_DEVELOPER_USR_PATH = ''
xcrun: note: xcrun_db = '/var/folders/3p/2bnrmypd17jcqbtzw79t9blw0000gn/T/xcrun_db'
xcrun: note: lookup resolved to: '/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform'
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk

on her machine there's no detail at all:

% xcrun --verbose --no-cache --show-sdk-path
/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk

So I'm not sure what to make of that. But I'm hesitant to assume
that xcrun is just a wrapper around xcodebuild.

regards, tom lane

#35Sergey Shinderuk
s.shinderuk@postgrespro.ru
In reply to: Sergey Shinderuk (#33)
Re: pg_preadv() and pg_pwritev()

On 15.01.2021 04:53, Sergey Shinderuk wrote:

I see that "xcrun --sdk macosx --show-sdk-path" really calls
"/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk
macosx -version Path" under the hood.

... and caches the result. xcodebuild not called without --no-cache.
So it still make sense to fall back on xcodebuild.

% sudo dtrace -n 'pid$target::popen:entry { trace(copyinstr(arg0)) }' -c
'xcrun --sdk macosx --show-sdk-path'
dtrace: description 'pid$target::popen:entry ' matched 1 probe
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
dtrace: pid 26981 has exited

#36Sergey Shinderuk
s.shinderuk@postgrespro.ru
In reply to: Tom Lane (#34)
Re: pg_preadv() and pg_pwritev()

On 15.01.2021 05:04, Tom Lane wrote:

on her machine there's no detail at all:

% xcrun --verbose --no-cache --show-sdk-path
/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk

The same on my machine. I get details for --find, but not for
--show-sdk-path.

So I'm not sure what to make of that. But I'm hesitant to assume
that xcrun is just a wrapper around xcodebuild.

I agree.

#37Sergey Shinderuk
s.shinderuk@postgrespro.ru
In reply to: Tom Lane (#32)
Re: pg_preadv() and pg_pwritev()

On 15.01.2021 04:45, Tom Lane wrote:

Hence, I propose the attached. This works as far as I can tell
to fix the problem you're seeing.

Yes, it works fine. Thank you very much.

#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sergey Shinderuk (#37)
Re: pg_preadv() and pg_pwritev()

Sergey Shinderuk <s.shinderuk@postgrespro.ru> writes:

On 15.01.2021 04:45, Tom Lane wrote:

Hence, I propose the attached. This works as far as I can tell
to fix the problem you're seeing.

Yes, it works fine. Thank you very much.

Great. Pushed with a little more polishing.

regards, tom lane

#39James Hilliard
james.hilliard1@gmail.com
In reply to: Tom Lane (#32)
Re: pg_preadv() and pg_pwritev()

On Thu, Jan 14, 2021 at 6:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Sergey Shinderuk <s.shinderuk@postgrespro.ru> writes:

On 15.01.2021 01:13, Tom Lane wrote:

Also, after re-reading [1] I am not at all excited about trying to
remove the -isysroot switches from our *FLAGS. What I propose to do
is keep that, but improve our mechanism for choosing a default value
for PG_SYSROOT. It looks like first trying "xcrun --show-sdk-path",
and falling back to "xcodebuild -version -sdk macosx Path" if that
doesn't yield a valid path, is more likely to give a working build
than relying entirely on xcodebuild. Maybe there's a case for trying
"xcrun --sdk macosx --show-sdk-path" in between; in my tests that
seemed noticeably faster than invoking xcodebuild, and I've not yet
seen a case where it gave a different answer.

I spent quite some time trying to understand / reverse engineer the
logic behind xcrun's default SDK selection.

Yeah, I wasted a fair amount of time on that too, going so far as
to ktrace xcrun (as I gather you did too). I'm not any more
enlightened than you are about exactly how it's making the choice.

Oh, that's weird! Nevertheless I like you suggestion to call "xcrun"
from "configure".

Anyway, after re-reading the previous thread, something I like about
the current behavior is that it tends to produce a version-numbered
sysroot path, ie something ending in "MacOSX11.1.sdk" or whatever.
One of the hazards we're trying to avoid is some parts of a PG
installation being built against one SDK version while other parts are
built against another. The typical behavior of "xcrun --show-sdk-path"
seems to be to produce a path ending in "MacOSX.sdk", which defeats that.
So I think we should accept the path only if it contains a version number,
and otherwise move on to the other probe commands.

I don't think enforcing a specific naming scheme makes sense, the minimum
OSX runtime version is effectively entirely separate from the SDK version.

The pwritev issue just seems to be caused by a broken configure check,
I've fixed that here:
/messages/by-id/20210119111625.20435-1-james.hilliard1@gmail.com

Show quoted text

Hence, I propose the attached. This works as far as I can tell
to fix the problem you're seeing.

regards, tom lane