disallow big-endian on aarch64
Some recent work involving SIMD instructions on AArch64 made me wonder
whether we support $SUBJECT. For reference, AArch64 is bi-endian, but
AFAICT all current AArch64 buildfarm machines are on macOS, Linux, or
FreeBSD, which appear to require little-endian [0] [1] [2]. I know there
are efforts to support Windows on AArch64, but that requires
little-endian, too [3]. Given the apparent convergence on little-endian,
IMHO we should require it for Postgres, too. The attached patch adds some
configure-time checks for this.
[0]: https://developer.apple.com/documentation/apple-silicon/porting-your-macos-apps-to-apple-silicon#Address-Architectural-Differences
[1]: https://github.com/torvalds/linux/commit/1cf89b6b
[2]: https://github.com/freebsd/freebsd-src/commit/ad0a6546
[3]: https://learn.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=msvc-170#endianness
--
nathan
Attachments:
v1-0001-disallow-big-endian-on-aarch64.patch (text/plain; charset=us-ascii)
From 0bccd2d7e777ab5bd758b5e07e1917872876660d Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Thu, 2 Oct 2025 15:00:56 -0500
Subject: [PATCH v1 1/1] disallow big endian on aarch64
---
configure | 7 +++++++
configure.ac | 7 +++++++
meson.build | 4 ++++
3 files changed, 18 insertions(+)
diff --git a/configure b/configure
index 22cd866147b..249474ac891 100755
--- a/configure
+++ b/configure
@@ -14999,6 +14999,13 @@ _ACEOF
fi
+# AArch64 is bi-endian, but we require little
+if test x"$host_cpu" = x"aarch64"; then
+ if test x"$ac_cv_c_bigendian" = x"yes"; then
+ as_fn_error $? "big endian is not supported for aarch64" "$LINENO" 5
+ fi
+fi
+
# MSVC doesn't cope well with defining restrict to __restrict, the
# spelling it understands, because it conflicts with
# __declspec(restrict). Therefore we define pg_restrict to the
diff --git a/configure.ac b/configure.ac
index e44943aa6fe..5b95a0adf5c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1685,6 +1685,13 @@ PGAC_UNION_SEMUN
AC_CHECK_TYPES(socklen_t, [], [], [#include <sys/socket.h>])
PGAC_STRUCT_SOCKADDR_SA_LEN
+# AArch64 is bi-endian, but we require little
+if test x"$host_cpu" = x"aarch64"; then
+ if test x"$ac_cv_c_bigendian" = x"yes"; then
+ AC_MSG_ERROR([big endian is not supported for aarch64])
+ fi
+fi
+
# MSVC doesn't cope well with defining restrict to __restrict, the
# spelling it understands, because it conflicts with
# __declspec(restrict). Therefore we define pg_restrict to the
diff --git a/meson.build b/meson.build
index 395416a6060..35f2580d8ef 100644
--- a/meson.build
+++ b/meson.build
@@ -1731,6 +1731,10 @@ endif
###############################################################
if host_machine.endian() == 'big'
+ # AArch64 is bi-endian, but we require little
+ if host_cpu == 'aarch64'
+ error('big endian is not supported for aarch64')
+ endif
cdata.set('WORDS_BIGENDIAN', 1)
endif
--
2.39.5 (Apple Git-154)
On 10/2/25 22:16, Nathan Bossart wrote:
Some recent work involving SIMD instructions on AArch64 made me wonder
whether we support $SUBJECT. For reference, AArch64 is bi-endian, but
AFAICT all current AArch64 buildfarm machines are on macOS, Linux, or
FreeBSD, which appear to require little-endian [0] [1] [2]. I know there
are efforts to support Windows on AArch64, but that requires
little-endian, too [3]. Given the apparent convergence on little-endian,
IMHO we should require it for Postgres, too. The attached patch adds some
configure-time checks for this.
I don't follow the reasoning. If there are no aarch64 platforms running
in big-endian mode (at least not supported ones), then how would you
even build Postgres in such an environment?
Also, what's the benefit of disabling it? Is it about disabling
something we can't meaningfully test (even though we still support other
big-endian platforms, right?). Or does it affect the SIMD stuff somehow?
regards
--
Tomas Vondra
On Thu, Oct 02, 2025 at 10:29:39PM +0200, Tomas Vondra wrote:
I don't follow the reasoning. If there are no aarch64 platforms running
in big-endian mode (at least not supported ones), then how would you
even build Postgres in such an environment?
Also, what's the benefit of disabling it? Is it about disabling
something we can't meaningfully test (even though we still support other
big-endian platforms, right?). Or does it affect the SIMD stuff somehow?
The benefit is that we can safely assume little-endian in AArch64-specific
code, and on the off-chance that someone tries to build Postgres in an
AArch64/big-endian environment, we aren't pretending to support it. I'd
expect this to affect almost nobody in practice, which is why I'm proposing
that we just disallow it completely. As you noted, we can't meaningfully
test it, anyway.
I'm not proposing that we remove big-endian support from any other
platforms besides AArch64.
--
nathan
Nathan Bossart <nathandbossart@gmail.com> writes:
On Thu, Oct 02, 2025 at 10:29:39PM +0200, Tomas Vondra wrote:
I don't follow the reasoning. If there are no aarch64 platforms running
in big-endian mode (at least not supported ones), then how would you
even build Postgres in such an environment?
Also, what's the benefit of disabling it? Is it about disabling
something we can't meaningfully test (even though we still support other
big-endian platforms, right?). Or does it affect the SIMD stuff somehow?
The benefit is that we can safely assume little-endian in AArch64-specific
code, and on the off-chance that someone tries to build Postgres in an
AArch64/big-endian environment, we aren't pretending to support it.
Is that actually a meaningful benefit, ie can we remove any code
anywhere?
Given that we don't believe any OS support exists for this
combination, I'm not sure why we should expend configure cycles
on rejecting it. If anyone ever builds such a platform and tries
to run Postgres on it, either it will work or they'll have to start
writing patches. But that applies to a whole lot of not-actively-
tested configurations. I don't see why we should go out of our
way to reject this one.
regards, tom lane
On Thu, Oct 02, 2025 at 05:08:42PM -0400, Tom Lane wrote:
Nathan Bossart <nathandbossart@gmail.com> writes:
The benefit is that we can safely assume little-endian in AArch64-specific
code, and on the off-chance that someone tries to build Postgres in an
AArch64/big-endian environment, we aren't pretending to support it.
Is that actually a meaningful benefit, ie can we remove any code
anywhere?
I'm not aware of existing code that is broken, but some of the stuff I'm
intending to commit soon to optimize hex_{encode,decode} might be sensitive
to endianness. In any case, I don't think it's outside the realm of
possibilities for architecture-specific code.
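To illustrate how such code can become sensitive to byte order, here is a minimal C sketch. It is not the pending hex_{encode,decode} work; the helper names example_byte_via_shift and example_load_le64 are hypothetical and exist only for this illustration. The first helper returns the expected byte only on little-endian hosts, while the second assembles the value explicitly and so behaves the same regardless of host byte order.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fetch byte i (0..7) of an 8-byte chunk by copying the
 * chunk into an integer and shifting.  This matches src[i] only on a
 * little-endian host; on a big-endian host it would return src[7 - i].
 */
static inline unsigned char
example_byte_via_shift(const unsigned char *src, int i)
{
	uint64_t	chunk;

	memcpy(&chunk, src, sizeof(chunk));
	return (unsigned char) (chunk >> (8 * i));
}

/*
 * Hypothetical helper: build the same integer byte by byte, treating src[0]
 * as the least significant byte.  The result does not depend on the host's
 * byte order.
 */
static inline uint64_t
example_load_le64(const unsigned char *src)
{
	uint64_t	chunk = 0;

	for (int i = 0; i < 8; i++)
		chunk |= (uint64_t) src[i] << (8 * i);
	return chunk;
}
```

Code shaped like the first helper is the kind that would silently misbehave on an AArch64/big-endian build unless it is guarded or rewritten along the lines of the second.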
Given that we don't believe any OS support exists for this
combination, I'm not sure why we should expend configure cycles
on rejecting it. If anyone ever builds such a platform and tries
to run Postgres on it, either it will work or they'll have to start
writing patches. But that applies to a whole lot of not-actively-
tested configurations. I don't see why we should go out of our
way to reject this one.
Okay.
--
nathan
On 02.10.25 23:31, Nathan Bossart wrote:
On Thu, Oct 02, 2025 at 05:08:42PM -0400, Tom Lane wrote:
Nathan Bossart <nathandbossart@gmail.com> writes:
The benefit is that we can safely assume little-endian in AArch64-specific
code, and on the off-chance that someone tries to build Postgres in an
AArch64/big-endian environment, we aren't pretending to support it.
Is that actually a meaningful benefit, ie can we remove any code
anywhere?
I'm not aware of existing code that is broken, but some of the stuff I'm
intending to commit soon to optimize hex_{encode,decode} might be sensitive
to endianness. In any case, I don't think it's outside the realm of
possibilities for architecture-specific code.
As a future direction, in cases like this I think it would be better to
write a static assertion. Then the check is self-contained in the C
code, perhaps next to the code it's trying to protect, perhaps with a
comment, and doesn't have to be maintained in some far-away configure
shell script.
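A minimal sketch of what such a static assertion could look like, assuming a C11-capable compiler. EXAMPLE_BIG_ENDIAN is a hypothetical stand-in derived from the GCC/Clang predefined macro __BYTE_ORDER__; inside PostgreSQL the build-provided WORDS_BIGENDIAN symbol and the StaticAssertDecl macro would presumably be used instead. The runtime probe example_host_is_little_endian is likewise hypothetical and only there to cross-check the compile-time value.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical macro for this sketch: 1 when the compiler reports a
 * big-endian target, 0 otherwise (including when the compiler does not
 * define __BYTE_ORDER__ at all, which is an assumption of this example).
 */
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
	__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define EXAMPLE_BIG_ENDIAN 1
#else
#define EXAMPLE_BIG_ENDIAN 0
#endif

/*
 * The check sits next to the AArch64-specific code it protects.  A
 * big-endian AArch64 build fails at compile time with this message.
 */
#if defined(__aarch64__)
_Static_assert(!EXAMPLE_BIG_ENDIAN,
			   "this AArch64-specific code assumes little-endian byte order");
#endif

/* Runtime probe: returns 1 if the first byte in memory is the low-order one. */
static inline int
example_host_is_little_endian(void)
{
	uint32_t	value = 1;
	unsigned char first;

	memcpy(&first, &value, 1);
	return first == 1;
}
```

Compared with the configure and meson checks in the patch, this keeps the assumption and its enforcement in one place, and it only triggers for builds that actually compile the protected file.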
On Fri, Oct 03, 2025 at 08:37:02AM +0200, Peter Eisentraut wrote:
As a future direction, in cases like this I think it would be better to
write a static assertion. Then the check is self-contained in the C code,
perhaps next to the code it's trying to protect, perhaps with a comment, and
doesn't have to be maintained in some far-away configure shell script.
Makes sense, thanks.
--
nathan