Improve CRC32C performance on SSE4.2
This patch improves the performance of the SSE4.2 CRC32C algorithm. The current SSE4.2 implementation of CRC32C relies on the native crc32 instruction and processes 8 bytes at a time in a loop. The new technique uses the pclmulqdq instruction and processes 64 bytes at a time. The algorithm is based on the SSE4.2 version of the crc32 computation from Chromium's copy of zlib, with modified constants for crc32c computation. See:
https://chromium.googlesource.com/chromium/src/+/refs/heads/main/third_party/zlib/crc32_simd.c
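For checking either SIMD path against something independent, a bit-at-a-time reference can be handy. The sketch below is not PostgreSQL code: `crc32c_bitwise` is a hypothetical name, and it assumes the reflected Castagnoli polynomial 0x82F63B78 with the usual all-ones initial value and final inversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reflected form of the CRC-32C (Castagnoli) polynomial. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/*
 * Hypothetical reference routine: one bit per inner iteration, no tables,
 * no intrinsics. Much slower than the crc32 instruction; meant only for
 * cross-checking faster implementations.
 */
uint32_t
crc32c_bitwise(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t	crc = 0xFFFFFFFFu;	/* standard initial value */

	assert(p != NULL || len == 0);

	while (len-- > 0)
	{
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
		{
			/* mask is all ones when the low bit is set, else zero */
			uint32_t	mask = 0u - (crc & 1u);

			crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & mask);
		}
	}
	return crc ^ 0xFFFFFFFFu;	/* standard final inversion */
}
```

With these conventions the standard check input "123456789" gives 0xE3069283, and an empty input gives 0.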
Microbenchmarks (generated with google benchmark using a standalone version of the same algorithms):
Comparing scalar_crc32c to sse42_crc32c (for various buffer sizes: 64, 128, 256, 512, 1024, 2048 bytes)
Benchmark                                    Time       CPU   Time Old   Time New   CPU Old   CPU New
-----------------------------------------------------------------------------------------------------
[scalar_crc32c vs. sse42_crc32c]/64       -0.8147   -0.8148         33          6        33         6
[scalar_crc32c vs. sse42_crc32c]/128      -0.8962   -0.8962         88          9        88         9
[scalar_crc32c vs. sse42_crc32c]/256      -0.9200   -0.9200        211         17       211        17
[scalar_crc32c vs. sse42_crc32c]/512      -0.9389   -0.9389        486         30       486        30
[scalar_crc32c vs. sse42_crc32c]/1024     -0.9452   -0.9452       1037         57      1037        57
[scalar_crc32c vs. sse42_crc32c]/2048     -0.9456   -0.9456       2140        116      2140       116
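The Time and CPU columns read as relative changes, (new - old) / old, so a value near -0.95 corresponds to a speedup of roughly 18x. A small sketch of that arithmetic follows, assuming the columns follow the usual Google Benchmark comparison convention; the function names are hypothetical.

```c
#include <assert.h>

/* Relative change as shown in the comparison: negative means faster. */
double
relative_change(double old_ns, double new_ns)
{
	assert(old_ns > 0.0);
	return (new_ns - old_ns) / old_ns;
}

/* The same comparison expressed as a speedup factor. */
double
speedup(double old_ns, double new_ns)
{
	assert(new_ns > 0.0);
	return old_ns / new_ns;
}
```

With the rounded 2048-byte figures (2140 and 116), relative_change gives about -0.946 and speedup about 18.4; the small difference from the listed -0.9456 presumably comes from the unrounded timings.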
Raghuveer
Attachments:
v1-0001-Add-more-test-coverage-for-crc32c.patch (application/octet-stream)
From 2aa9c6e28bbc956e0d4cc7aea5e1d1871b876a6d Mon Sep 17 00:00:00 2001
From: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Date: Tue, 4 Feb 2025 12:56:00 -0800
Subject: [PATCH v1 1/2] Add more test coverage for crc32c
---
src/test/regress/expected/crc32c.out | 42 ++++++++++++++++++++++++++++
src/test/regress/parallel_schedule | 2 ++
src/test/regress/sql/crc32c.sql | 12 ++++++++
3 files changed, 56 insertions(+)
create mode 100644 src/test/regress/expected/crc32c.out
create mode 100644 src/test/regress/sql/crc32c.sql
diff --git a/src/test/regress/expected/crc32c.out b/src/test/regress/expected/crc32c.out
new file mode 100644
index 0000000000..f25965df4a
--- /dev/null
+++ b/src/test/regress/expected/crc32c.out
@@ -0,0 +1,42 @@
+--
+-- CRC32C
+-- Testing CRC32C SSE4.2 algorithm.
+-- The new algorithm has various code paths that needs test coverage.
+-- We achieve that by computing CRC32C of text of various sizes: 15, 64, 128, 144, 159 and 256 bytes.
+--
+SELECT crc32c('');
+ crc32c
+--------
+ 0
+(1 row)
+
+SELECT crc32c('Hello 15 bytes!');
+ crc32c
+------------
+ 3405757121
+(1 row)
+
+SELECT crc32c('This is a 64 byte piece of text to run through the main loop ...');
+ crc32c
+-----------
+ 721494841
+(1 row)
+
+SELECT crc32c('This is a carefully constructed text that needs to be exactly 128 bytes long for testing purposes. Let me add more words to ....');
+ crc32c
+------------
+ 1602016964
+(1 row)
+
+SELECT crc32c('This is a text that needs to be exactly 144 bytes long for testing purposes. I will add more words to reach that specific length. Now we are ...');
+ crc32c
+------------
+ 1912862944
+(1 row)
+
+SELECT crc32c('This is a precisely crafted message that needs to be exactly 159 bytes in length for testing purposes. I will continue adding more text until we reach that ...');
+ crc32c
+------------
+ 1245879782
+(1 row)
+
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 1edd9e45eb..7c9dbf65db 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -56,6 +56,8 @@ test: create_aggregate create_function_sql create_cast constraints triggers sele
# ----------
test: sanity_check
+test: crc32c
+
# ----------
# Another group of parallel tests
# aggregates depends on create_aggregate
diff --git a/src/test/regress/sql/crc32c.sql b/src/test/regress/sql/crc32c.sql
new file mode 100644
index 0000000000..5e481eab6f
--- /dev/null
+++ b/src/test/regress/sql/crc32c.sql
@@ -0,0 +1,12 @@
+--
+-- CRC32C
+-- Testing CRC32C SSE4.2 algorithm.
+-- The new algorithm has various code paths that needs test coverage.
+-- We achieve that by computing CRC32C of text of various sizes: 15, 64, 128, 144, 159 and 256 bytes.
+--
+SELECT crc32c('');
+SELECT crc32c('Hello 15 bytes!');
+SELECT crc32c('This is a 64 byte piece of text to run through the main loop ...');
+SELECT crc32c('This is a carefully constructed text that needs to be exactly 128 bytes long for testing purposes. Let me add more words to ....');
+SELECT crc32c('This is a text that needs to be exactly 144 bytes long for testing purposes. I will add more words to reach that specific length. Now we are ...');
+SELECT crc32c('This is a precisely crafted message that needs to be exactly 159 bytes in length for testing purposes. I will continue adding more text until we reach that ...');
--
2.43.0
v1-0002-Improve-CRC32C-performance-on-SSE4.2.patch (application/octet-stream)
From 661d6a7ae5ba02f963561513bea4a003d789cdda Mon Sep 17 00:00:00 2001
From: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Date: Tue, 4 Feb 2025 15:20:13 -0800
Subject: [PATCH v1 2/2] Improve CRC32C performance on SSE4.2
The current SSE4.2 implementation of CRC32C relies on the native crc32
instruction and processes 8 bytes at a time in a loop. The technique in
the paper cited below uses the pclmulqdq instruction and processes 64
bytes at a time.
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction"
V. Gopal, E. Ozturk, et al., 2009
The algorithm is based on crc32_sse42_simd from Chromium's copy of zlib. See:
https://chromium.googlesource.com/chromium/src/+/refs/heads/main/third_party/zlib/crc32_simd.c
Microbenchmarks: (generated with google benchmark using a standalone
version of the same algorithms).
Comparing scalar_crc32c (current version) to sse42_crc32c (proposed new
version):
|----------------------------------+---------------------+---------------+---------------|
| Benchmark | Buffer size (bytes) | Time Old (ns) | Time New (ns) |
|----------------------------------+---------------------+---------------+---------------|
| [scalar_crc32c vs. sse42_crc32c] | 64 | 33 | 6 |
| [scalar_crc32c vs. sse42_crc32c] | 128 | 88 | 9 |
| [scalar_crc32c vs. sse42_crc32c] | 256 | 211 | 17 |
| [scalar_crc32c vs. sse42_crc32c] | 512 | 486 | 30 |
| [scalar_crc32c vs. sse42_crc32c] | 1024 | 1037 | 57 |
| [scalar_crc32c vs. sse42_crc32c] | 2048 | 2140 | 116 |
|----------------------------------+---------------------+---------------+---------------|
---
config/c-compiler.m4 | 7 +-
configure | 7 +-
meson.build | 7 +-
src/port/pg_crc32c_sse42.c | 127 ++++++++++++++++++++++++++++++++++++-
4 files changed, 141 insertions(+), 7 deletions(-)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 8534cc54c1..8b255b5cc8 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -557,14 +557,19 @@ AC_DEFUN([PGAC_SSE42_CRC32_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_sse42_crc32_intrinsics])])dnl
AC_CACHE_CHECK([for _mm_crc32_u8 and _mm_crc32_u32], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <nmmintrin.h>
+ #include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
- __attribute__((target("sse4.2")))
+ __attribute__((target("sse4.2,pclmul")))
#endif
static int crc32_sse42_test(void)
+
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00); // pclmul
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}],
diff --git a/configure b/configure
index ceeef9b091..f457e3d3bc 100755
--- a/configure
+++ b/configure
@@ -17178,14 +17178,19 @@ else
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h. */
#include <nmmintrin.h>
+ #include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
- __attribute__((target("sse4.2")))
+ __attribute__((target("sse4.2,pclmul")))
#endif
static int crc32_sse42_test(void)
+
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00);
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}
diff --git a/meson.build b/meson.build
index 8e128f4982..070f51a440 100644
--- a/meson.build
+++ b/meson.build
@@ -2227,15 +2227,18 @@ if host_cpu == 'x86' or host_cpu == 'x86_64'
prog = '''
#include <nmmintrin.h>
-
+#include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
-__attribute__((target("sse4.2")))
+__attribute__((target("sse4.2,pclmul")))
#endif
int main(void)
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00); // pclmul
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df3..69f8917c7d 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,13 +15,13 @@
#include "c.h"
#include <nmmintrin.h>
-
+#include <wmmintrin.h>
#include "port/pg_crc32c.h"
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
-pg_crc32c
-pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
+static pg_crc32c
+pg_comp_crc32c_sse42_tail(pg_crc32c crc, const void *data, size_t len)
{
const unsigned char *p = data;
const unsigned char *pend = p + len;
@@ -68,3 +68,124 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
return crc;
}
+
+/*
+ * Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
+ * Instruction" V. Gopal, E. Ozturk, et al., 2009
+ *
+ * The algorithm is based on crc32_sse42_simd from Chromium's copy of zlib.
+ * See:
+ * https://chromium.googlesource.com/chromium/src/+/refs/heads/main/third_party/zlib/crc32_simd.c
+ */
+
+pg_attribute_no_sanitize_alignment()
+pg_attribute_target("sse4.2,pclmul")
+pg_crc32c
+pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t length)
+{
+ ssize_t len = (ssize_t) length;
+ const unsigned char *buf = data;
+ /*
+ * Definitions of the bit-reflected domain constants k1,k2,k3, etc and
+ * the CRC32+Barrett polynomials given at the end of the paper.
+ */
+ static const uint64_t pg_attribute_aligned(16) k1k2[] = { 0x740eef02, 0x9e4addf8 };
+ static const uint64_t pg_attribute_aligned(16) k3k4[] = { 0xf20c0dfe, 0x14cd00bd6 };
+ static const uint64_t pg_attribute_aligned(16) k5k0[] = { 0xdd45aab8, 0x000000000 };
+ static const uint64_t pg_attribute_aligned(16) poly[] = { 0x105ec76f1, 0xdea713f1 };
+ if (len >= 64) {
+ __m128i x0, x1, x2, x3, x4, x5, x6, x7, x8, y5, y6, y7, y8;
+ /*
+ * There's at least one block of 64.
+ */
+ x1 = _mm_loadu_si128((__m128i *)(buf + 0x00));
+ x2 = _mm_loadu_si128((__m128i *)(buf + 0x10));
+ x3 = _mm_loadu_si128((__m128i *)(buf + 0x20));
+ x4 = _mm_loadu_si128((__m128i *)(buf + 0x30));
+ x1 = _mm_xor_si128(x1, _mm_cvtsi32_si128(crc));
+ x0 = _mm_load_si128((__m128i *)k1k2);
+ buf += 64;
+ len -= 64;
+ /*
+ * Parallel fold blocks of 64, if any.
+ */
+ while (len >= 64)
+ {
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x6 = _mm_clmulepi64_si128(x2, x0, 0x00);
+ x7 = _mm_clmulepi64_si128(x3, x0, 0x00);
+ x8 = _mm_clmulepi64_si128(x4, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x2 = _mm_clmulepi64_si128(x2, x0, 0x11);
+ x3 = _mm_clmulepi64_si128(x3, x0, 0x11);
+ x4 = _mm_clmulepi64_si128(x4, x0, 0x11);
+ y5 = _mm_loadu_si128((__m128i *)(buf + 0x00));
+ y6 = _mm_loadu_si128((__m128i *)(buf + 0x10));
+ y7 = _mm_loadu_si128((__m128i *)(buf + 0x20));
+ y8 = _mm_loadu_si128((__m128i *)(buf + 0x30));
+ x1 = _mm_xor_si128(x1, x5);
+ x2 = _mm_xor_si128(x2, x6);
+ x3 = _mm_xor_si128(x3, x7);
+ x4 = _mm_xor_si128(x4, x8);
+ x1 = _mm_xor_si128(x1, y5);
+ x2 = _mm_xor_si128(x2, y6);
+ x3 = _mm_xor_si128(x3, y7);
+ x4 = _mm_xor_si128(x4, y8);
+ buf += 64;
+ len -= 64;
+ }
+ /*
+ * Fold into 128-bits.
+ */
+ x0 = _mm_load_si128((__m128i *)k3k4);
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x1 = _mm_xor_si128(x1, x2);
+ x1 = _mm_xor_si128(x1, x5);
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x1 = _mm_xor_si128(x1, x3);
+ x1 = _mm_xor_si128(x1, x5);
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x1 = _mm_xor_si128(x1, x4);
+ x1 = _mm_xor_si128(x1, x5);
+ /*
+ * Single fold blocks of 16, if any.
+ */
+ while (len >= 16)
+ {
+ x2 = _mm_loadu_si128((__m128i *)buf);
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x1 = _mm_xor_si128(x1, x2);
+ x1 = _mm_xor_si128(x1, x5);
+ buf += 16;
+ len -= 16;
+ }
+ /*
+ * Fold 128-bits to 64-bits.
+ */
+ x2 = _mm_clmulepi64_si128(x1, x0, 0x10);
+ x3 = _mm_setr_epi32(~0, 0, ~0, 0);
+ x1 = _mm_srli_si128(x1, 8);
+ x1 = _mm_xor_si128(x1, x2);
+ x0 = _mm_loadl_epi64((__m128i*)k5k0);
+ x2 = _mm_srli_si128(x1, 4);
+ x1 = _mm_and_si128(x1, x3);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_xor_si128(x1, x2);
+ /*
+ * Barrett reduce to 32-bits.
+ */
+ x0 = _mm_load_si128((__m128i*)poly);
+ x2 = _mm_and_si128(x1, x3);
+ x2 = _mm_clmulepi64_si128(x2, x0, 0x10);
+ x2 = _mm_and_si128(x2, x3);
+ x2 = _mm_clmulepi64_si128(x2, x0, 0x00);
+ x1 = _mm_xor_si128(x1, x2);
+ crc = _mm_extract_epi32(x1, 1);
+ }
+
+ return pg_comp_crc32c_sse42_tail(crc, buf, len);
+}
--
2.43.0
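The control flow of pg_comp_crc32c_sse42 in 0002 splits an input into 64-byte blocks, then 16-byte blocks, then a tail handled by the crc32-instruction loop; inputs shorter than 64 bytes go to the tail entirely. The sketch below only models that split to show which paths a given test length reaches. It is an illustration derived from reading the patch, and `crc_path_split` and `split_length` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of how one input length is split across code paths. */
typedef struct
{
	size_t		blocks64;	/* 64-byte blocks: first load plus parallel-fold loop */
	size_t		blocks16;	/* iterations of the single-fold 16-byte loop */
	size_t		tail;		/* bytes handed to the crc32-instruction tail */
} crc_path_split;

crc_path_split
split_length(size_t len)
{
	crc_path_split s = {0, 0, 0};
	size_t		remaining = len;

	/* The folding path is only entered when a full 64-byte block exists. */
	if (remaining >= 64)
	{
		s.blocks64 = remaining / 64;
		remaining %= 64;
		s.blocks16 = remaining / 16;
		remaining %= 16;
	}
	s.tail = remaining;

	/* Every input byte is accounted for exactly once. */
	assert(s.blocks64 * 64 + s.blocks16 * 16 + s.tail == len);
	return s;
}
```

Under that reading, the 159-byte test string exercises all three paths (two 64-byte blocks, one 16-byte fold, 15 tail bytes), while the 15-byte string only reaches the tail.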
On Thu, Feb 6, 2025 at 3:49 AM Devulapalli, Raghuveer
<raghuveer.devulapalli@intel.com> wrote:
This patch improves the performance of the SSE4.2 CRC32C algorithm. The current SSE4.2 implementation of CRC32C relies on the native crc32 instruction and processes 8 bytes at a time in a loop. The new technique uses the pclmulqdq instruction and processes 64 bytes at a time. The algorithm is based on the SSE4.2 version of the crc32 computation from Chromium’s copy of zlib, with modified constants for crc32c computation. See:
https://chromium.googlesource.com/chromium/src/+/refs/heads/main/third_party/zlib/crc32_simd.c
Thanks for the patch!
Microbenchmarks (generated with google benchmark using a standalone version of the same algorithms):
|----------------------------------+---------------------+---------------+---------------|
| Benchmark                        | Buffer size (bytes) | Time Old (ns) | Time New (ns) |
|----------------------------------+---------------------+---------------+---------------|
| [scalar_crc32c vs. sse42_crc32c] |                  64 |            33 |             6 |
| [scalar_crc32c vs. sse42_crc32c] |                 128 |            88 |             9 |
| [scalar_crc32c vs. sse42_crc32c] |                 256 |           211 |            17 |
| [scalar_crc32c vs. sse42_crc32c] |                 512 |           486 |            30 |
| [scalar_crc32c vs. sse42_crc32c] |                1024 |          1037 |            57 |
| [scalar_crc32c vs. sse42_crc32c] |                2048 |          2140 |           116 |
|----------------------------------+---------------------+---------------+---------------|
I'm highly suspicious of these numbers because they show this version
is about 20x faster than "scalar", so relatively speaking 3x faster
than the AVX-512 proposal? I ran my own benchmarks with the attached
script using your test function from the other thread, and found this
patch is actually slower than master on anything smaller than 256
bytes. Luckily, that's easily fixable: It turns out the implementation
in the paper (and chromium) has a very inefficient finalization step,
using more carryless multiplications and plenty of other operations.
After the main loop, and at the end, it's much more efficient to
convert the 128-bit intermediate result directly into a CRC in the
usual way. See here for details:
https://www.corsix.org/content/alternative-exposition-crc32_4k_pclmulqdq
The author of this article also has an MIT-licensed program to
generate CRC implementations with specified requirements (including on
ARM):
https://github.com/corsix/fast-crc32
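Moving between a running CRC and plain data, as v1 does when it XORs crc into the first 16-byte load, leans on CRC being linear: the register can be merged into the first four input bytes and the computation restarted from zero. The sketch below checks that one property with a portable bitwise update. It is an illustration only, not code from the patch or from fast-crc32, and `crc32c_raw` (no initial value or final inversion applied) and `crc32c_raw_merged` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Raw CRC-32C register update: caller supplies the state, nothing is inverted. */
uint32_t
crc32c_raw(uint32_t state, const unsigned char *p, size_t len)
{
	while (len-- > 0)
	{
		state ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			state = (state >> 1) ^ (0x82F63B78u & (0u - (state & 1u)));
	}
	return state;
}

/*
 * Same result as crc32c_raw(state, p, len), computed by XORing the state
 * into the first four bytes (least significant byte first) and starting
 * from a zero register. This sketch requires 4 <= len <= 64.
 */
uint32_t
crc32c_raw_merged(uint32_t state, const unsigned char *p, size_t len)
{
	unsigned char buf[64];

	assert(len >= 4 && len <= sizeof(buf));
	memcpy(buf, p, len);
	for (int i = 0; i < 4; i++)
		buf[i] ^= (unsigned char) (state >> (8 * i));
	return crc32c_raw(0, buf, len);
}
```

Both functions return the same register value for any state and any input of at least four bytes, which is why a state can be carried into or out of a block of data without special handling.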
I generated a similar function for v2-0004 and this benchmark shows
it's faster than master on 128 bytes and above, which is encouraging
(see attached graph). As I mentioned in the other thread, we need to
be mindful of the fact that the latency of carryless multiplication
varies wildly on different microarchitectures. I did the benchmarks on
my older machine, which I believe has a latency of 7 cycles for this
instruction. Looking around *briefly*, the most recent x86 chips with
worse latency seem to be from around 2012-13 (e.g. Intel Silvermont
and AMD Piledriver). Chips newer than mine have a latency of 4-7
cycles. It seems okay to assume those who care the most about
performance will be on hardware less than 12 years old, but I'm open
to arguments to be more conservative here.
Large inputs would make the graph hard to read, so I'll just put the
results for 8kB here:
master: 1504ms
v1: 543ms
v2: 533ms
Some thoughts on building (not a complete review):
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -557,14 +557,19 @@ AC_DEFUN([PGAC_SSE42_CRC32_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_sse42_crc32_intrinsics])])dnl
AC_CACHE_CHECK([for _mm_crc32_u8 and _mm_crc32_u32], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <nmmintrin.h>
+ #include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
- __attribute__((target("sse4.2")))
+ __attribute__((target("sse4.2,pclmul")))
It's probably okay to fold these together in the same compile-time
check, since both are fairly old by now, but for those following
along, pclmul is not in SSE 4.2 and is a bit newer. So this would
cause machines building on Nehalem (2008) to fail the check and go
back to slicing-by-8 with it written this way. That's a long time ago,
so I'm not sure if anyone would notice, and I think we could fix it
for people using a packaged binary by having a fallback wrapper
function that just calls the SSE 4.2 "tail", as 0002 calls it.
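The fallback wrapper idea could look roughly like the following. This is a sketch under assumptions, not PostgreSQL's dispatch code: the have_sse42 and have_pclmul flags stand in for whatever CPUID check the real code performs, every function name is hypothetical, and all three stand-ins share one portable bitwise loop so that the example runs anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*crc32c_fn) (uint32_t crc, const void *data, size_t len);

/* Portable loop used by every stand-in below; real paths would differ. */
static uint32_t
crc32c_update(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len-- > 0)
	{
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Stand-in for slicing-by-8, used when SSE 4.2 is unavailable. */
uint32_t
crc32c_slicing_standin(uint32_t crc, const void *data, size_t len)
{
	return crc32c_update(crc, data, len);
}

/* Fallback wrapper: needs only SSE 4.2, so it just calls the "tail" loop. */
uint32_t
crc32c_sse42_only(uint32_t crc, const void *data, size_t len)
{
	return crc32c_update(crc, data, len);
}

/* Stand-in for the pclmul path: fold 64-byte blocks, then call the tail. */
uint32_t
crc32c_pclmul_standin(uint32_t crc, const void *data, size_t len)
{
	return crc32c_update(crc, data, len);
}

/* Pick an implementation from (hypothetical) runtime CPU feature flags. */
crc32c_fn
choose_crc32c(int have_sse42, int have_pclmul)
{
	if (have_sse42 && have_pclmul)
		return crc32c_pclmul_standin;
	if (have_sse42)
		return crc32c_sse42_only;
	return crc32c_slicing_standin;
}

uint32_t
crc32c_dispatch(int have_sse42, int have_pclmul,
				uint32_t crc, const void *data, size_t len)
{
	crc32c_fn	fn = choose_crc32c(have_sse42, have_pclmul);

	assert(fn != NULL);
	return fn(crc, data, len);
}
```

The point of the middle branch is that a binary built with both checks passing can still use the crc32 instruction on a Nehalem-class machine instead of dropping all the way back to slicing-by-8.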
--
John Naylor
Amazon Web Services
Attachments:
crc_results_20250209.png (image/png)
���i�f�������O;������[~~����&��3-�����l��������/��F���#����K/������h���,�OT��}����.����F������o���f��m����A[�d�=���T�%XH�M7M�=;44dYYY��{���Z����9ns���g_��8���_Z,�p8lW_}��8q"�}9Y���-_�����mhh�z{{-���[����jH��X{���������VQQa�X��c��2�����v�W����~�z+..�_~��b�����[������~����k���l����[ff�9�co����MU��S�����-Z0���W[45;�b/++���+,,����)����VZZj3g��`0�of�����O>iYYY���kO<��?~����F�������i�����5��g+W��_|�|>��������_���o�>������{������4��g^���M�f>��JJJ����$Yzz���������g�}f���-����oG�9�����*t���p8lyyy���m�/�����Q[�f�egg��3�����;v^5�����;��ojj������?�p��{�������233����������!Q_��m�������L�7o^��$�/��{���UUU���^j999���/��MV_&��l��u�����8�3{;��������|�����
co����Fm��U���k~�������/�����DJ��cc�5 `*��� ��. � \@� p� ��. � \@� p� ��. � \@� p� ��. � Lj�����������v�:�� ���� i��� ] 0 B���m����"��5K7n��U��r�����������������F�������n�v�m�={�n��F���O��������P�`P������S6? p�@B]]]�����={�v�Z�r�-��}����$i�������
�������*I��>���}��wz��7�l�29rD�����������Z���3����G���c�=&�����B�p�
�F����R[[�$���Y������������G}T�TVV�Y�f���������A����~�a=��S�� �F��PNNN�k�����-_�\[�n��'�{�n�s�=�������������8:t�����������v�B!�q����uyV �>o� ��m``@��H$���l�~���;w����N-X�@~��qW^y��O��C����655iddD/����}�Y���L�\ ���@B[�l�$�����������
�TTT�^xa�����4E"IR^^����%I���Z�b�v�����*
+--MEEEM�� �=�. ���brG%%%���[�a����J�V�X���>-Y�$>��{�Uuu����%I���z���
�T^^���
UVV�q]s�5
��y�^y���� ��13Ku .>���z��������R �bP��. �lppP���W8Nu) p���"�s��(h���Z�`A����� �^�t ��� �B �] . t ��� �B �] . t ��� �B �] . t ��� �B �] . t ��� ����T �?n���L�Z�� IEND�B`�v2-0002-Add-a-Postgres-SQL-function-for-crc32c-benchmarki.patchtext/x-patch; charset=US-ASCII; name=v2-0002-Add-a-Postgres-SQL-function-for-crc32c-benchmarki.patchDownload
From 5b329ccf89986ab5e6dd170f8fa317ed206e2137 Mon Sep 17 00:00:00 2001
From: Paul Amonson <paul.d.amonson@intel.com>
Date: Mon, 6 May 2024 08:34:17 -0700
Subject: [PATCH v2 2/4] Add a Postgres SQL function for crc32c benchmarking
Add a drive_crc32c() function to use for benchmarking crc32c
computation. The function takes 2 arguments:
(1) count: number of times CRC32C is computed in a loop.
(2) num: number of bytes in the buffer to calculate the CRC over.
XXX not for commit
Extracted from a patch by Raghuveer Devulapalli
---
contrib/meson.build | 1 +
contrib/test_crc32c/Makefile | 20 +++++++
contrib/test_crc32c/expected/test_crc32c.out | 57 ++++++++++++++++++++
contrib/test_crc32c/meson.build | 34 ++++++++++++
contrib/test_crc32c/sql/test_crc32c.sql | 3 ++
contrib/test_crc32c/test_crc32c--1.0.sql | 1 +
contrib/test_crc32c/test_crc32c.c | 47 ++++++++++++++++
contrib/test_crc32c/test_crc32c.control | 4 ++
8 files changed, 167 insertions(+)
create mode 100644 contrib/test_crc32c/Makefile
create mode 100644 contrib/test_crc32c/expected/test_crc32c.out
create mode 100644 contrib/test_crc32c/meson.build
create mode 100644 contrib/test_crc32c/sql/test_crc32c.sql
create mode 100644 contrib/test_crc32c/test_crc32c--1.0.sql
create mode 100644 contrib/test_crc32c/test_crc32c.c
create mode 100644 contrib/test_crc32c/test_crc32c.control
diff --git a/contrib/meson.build b/contrib/meson.build
index 1ba73ebd67..06673db062 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('test_crc32c')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/test_crc32c/Makefile b/contrib/test_crc32c/Makefile
new file mode 100644
index 0000000000..5b747c6184
--- /dev/null
+++ b/contrib/test_crc32c/Makefile
@@ -0,0 +1,20 @@
+MODULE_big = test_crc32c
+OBJS = test_crc32c.o
+PGFILEDESC = "test"
+EXTENSION = test_crc32c
+DATA = test_crc32c--1.0.sql
+
+first: all
+
+# test_crc32c.o: CFLAGS+=-g
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/test_crc32c
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/test_crc32c/expected/test_crc32c.out b/contrib/test_crc32c/expected/test_crc32c.out
new file mode 100644
index 0000000000..dff6bb3133
--- /dev/null
+++ b/contrib/test_crc32c/expected/test_crc32c.out
@@ -0,0 +1,57 @@
+CREATE EXTENSION test_crc32c;
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
+ drive_crc32c
+--------------
+ 532139994
+ 2103623867
+ 785984197
+ 2686825890
+ 3213049059
+ 3819630168
+ 1389234603
+ 534072900
+ 2930108140
+ 2496889855
+ 1475239611
+ 136366931
+ 3067402116
+ 2012717871
+ 3682416023
+ 2054270645
+ 1817339875
+ 4100939569
+ 1192727539
+ 3636976218
+ 369764421
+ 3161609879
+ 1067984880
+ 1235066769
+ 3138425899
+ 648132037
+ 4203750233
+ 1330187888
+ 2683521348
+ 1951644495
+ 2574090107
+ 3904902018
+ 3772697795
+ 1644686344
+ 2868962106
+ 3369218491
+ 3902689890
+ 3456411865
+ 141004025
+ 1504497996
+ 3782655204
+ 3544797610
+ 3429174879
+ 2524728016
+ 3935861181
+ 25498897
+ 692684159
+ 345705535
+ 2761600287
+ 2654632420
+ 3945991399
+(51 rows)
+
diff --git a/contrib/test_crc32c/meson.build b/contrib/test_crc32c/meson.build
new file mode 100644
index 0000000000..d7bec4ba1c
--- /dev/null
+++ b/contrib/test_crc32c/meson.build
@@ -0,0 +1,34 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+test_crc32c_sources = files(
+ 'test_crc32c.c',
+)
+
+if host_system == 'windows'
+ test_crc32c_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_crc32c',
+ '--FILEDESC', 'test_crc32c - test code for crc32c library',])
+endif
+
+test_crc32c = shared_module('test_crc32c',
+ test_crc32c_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += test_crc32c
+
+install_data(
+ 'test_crc32c.control',
+ 'test_crc32c--1.0.sql',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'test_crc32c',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'test_crc32c',
+ ],
+ },
+}
diff --git a/contrib/test_crc32c/sql/test_crc32c.sql b/contrib/test_crc32c/sql/test_crc32c.sql
new file mode 100644
index 0000000000..95c6dfe448
--- /dev/null
+++ b/contrib/test_crc32c/sql/test_crc32c.sql
@@ -0,0 +1,3 @@
+CREATE EXTENSION test_crc32c;
+
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
diff --git a/contrib/test_crc32c/test_crc32c--1.0.sql b/contrib/test_crc32c/test_crc32c--1.0.sql
new file mode 100644
index 0000000000..52b9772f90
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c--1.0.sql
@@ -0,0 +1 @@
+CREATE FUNCTION drive_crc32c (count int, num int) RETURNS bigint AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
new file mode 100644
index 0000000000..b350caf5ce
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -0,0 +1,47 @@
+/* select drive_crc32c(1000000, 1024); */
+
+#include "postgres.h"
+#include "fmgr.h"
+#include "port/pg_crc32c.h"
+#include "common/pg_prng.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * drive_crc32c(count: int, num: int) returns bigint
+ *
+ * count is the number of loops to perform
+ *
+ * num is the number of bytes in the buffer to calculate
+ * crc32c over.
+ */
+PG_FUNCTION_INFO_V1(drive_crc32c);
+Datum
+drive_crc32c(PG_FUNCTION_ARGS)
+{
+ int64 count = PG_GETARG_INT32(0);	/* SQL signature declares int, not bigint */
+ int64 num = PG_GETARG_INT32(1);
+ char* data = malloc((size_t)num);
+ pg_crc32c crc;
+ pg_prng_state state;
+ uint64 seed = 42;
+ pg_prng_seed(&state, seed);
+ /* set random data */
+ for (uint64 i = 0; i < num; i++)
+ {
+ data[i] = pg_prng_uint32(&state) % 255;
+ }
+
+ INIT_CRC32C(crc);
+
+ while(count--)
+ {
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, data, num);
+ FIN_CRC32C(crc);
+ }
+
+ free((void *)data);
+
+ PG_RETURN_INT64((int64_t)crc);
+}
diff --git a/contrib/test_crc32c/test_crc32c.control b/contrib/test_crc32c/test_crc32c.control
new file mode 100644
index 0000000000..878a077ee1
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.control
@@ -0,0 +1,4 @@
+comment = 'test'
+default_version = '1.0'
+module_pathname = '$libdir/test_crc32c'
+relocatable = true
--
2.48.1
v2-0003-Improve-CRC32C-performance-on-SSE4.2.patchtext/x-patch; charset=US-ASCII; name=v2-0003-Improve-CRC32C-performance-on-SSE4.2.patchDownload
From 0691791201bae023c2e4da73a6b370f9929f7cea Mon Sep 17 00:00:00 2001
From: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Date: Tue, 4 Feb 2025 15:20:13 -0800
Subject: [PATCH v2 3/4] Improve CRC32C performance on SSE4.2
The current SSE4.2 implementation of CRC32C relies on the native crc32
instruction and processes 8 bytes at a time in a loop. The technique in
the paper cited below uses the pclmulqdq instruction and processes 64
bytes at a time.
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction"
V. Gopal, E. Ozturk, et al., 2009
The algorithm is based on crc32_sse42_simd from Chromium's copy of zlib. See:
https://chromium.googlesource.com/chromium/src/+/refs/heads/main/third_party/zlib/crc32_simd.c
Microbenchmarks: (generated with google benchmark using a standalone
version of the same algorithms).
Comparing scalar_crc32c (current version) to sse42_crc32c (proposed new
version):
|----------------------------------+---------------------+---------------+---------------|
| Benchmark | Buffer size (bytes) | Time Old (ns) | Time New (ns) |
|----------------------------------+---------------------+---------------+---------------|
| [scalar_crc32c vs. sse42_crc32c] | 64 | 33 | 6 |
| [scalar_crc32c vs. sse42_crc32c] | 128 | 88 | 9 |
| [scalar_crc32c vs. sse42_crc32c] | 256 | 211 | 17 |
| [scalar_crc32c vs. sse42_crc32c] | 512 | 486 | 30 |
| [scalar_crc32c vs. sse42_crc32c] | 1024 | 1037 | 57 |
| [scalar_crc32c vs. sse42_crc32c] | 2048 | 2140 | 116 |
|----------------------------------+---------------------+---------------+---------------|
---
config/c-compiler.m4 | 7 +-
configure | 7 +-
meson.build | 7 +-
src/port/pg_crc32c_sse42.c | 127 ++++++++++++++++++++++++++++++++++++-
4 files changed, 141 insertions(+), 7 deletions(-)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 8534cc54c1..8b255b5cc8 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -557,14 +557,19 @@ AC_DEFUN([PGAC_SSE42_CRC32_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_sse42_crc32_intrinsics])])dnl
AC_CACHE_CHECK([for _mm_crc32_u8 and _mm_crc32_u32], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <nmmintrin.h>
+ #include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
- __attribute__((target("sse4.2")))
+ __attribute__((target("sse4.2,pclmul")))
#endif
static int crc32_sse42_test(void)
+
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00); // pclmul
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}],
diff --git a/configure b/configure
index ceeef9b091..f457e3d3bc 100755
--- a/configure
+++ b/configure
@@ -17178,14 +17178,19 @@ else
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h. */
#include <nmmintrin.h>
+ #include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
- __attribute__((target("sse4.2")))
+ __attribute__((target("sse4.2,pclmul")))
#endif
static int crc32_sse42_test(void)
+
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00);
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}
diff --git a/meson.build b/meson.build
index 1ceadb9a83..456c3fafc3 100644
--- a/meson.build
+++ b/meson.build
@@ -2227,15 +2227,18 @@ if host_cpu == 'x86' or host_cpu == 'x86_64'
prog = '''
#include <nmmintrin.h>
-
+#include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
-__attribute__((target("sse4.2")))
+__attribute__((target("sse4.2,pclmul")))
#endif
int main(void)
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00); // pclmul
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df3..69f8917c7d 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,13 +15,13 @@
#include "c.h"
#include <nmmintrin.h>
-
+#include <wmmintrin.h>
#include "port/pg_crc32c.h"
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
-pg_crc32c
-pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
+static pg_crc32c
+pg_comp_crc32c_sse42_tail(pg_crc32c crc, const void *data, size_t len)
{
const unsigned char *p = data;
const unsigned char *pend = p + len;
@@ -68,3 +68,124 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
return crc;
}
+
+/*
+ * Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
+ * Instruction" V. Gopal, E. Ozturk, et al., 2009
+ *
+ * The algorithm is based on crc32_sse42_simd from Chromium's copy of zlib.
+ * See:
+ * https://chromium.googlesource.com/chromium/src/+/refs/heads/main/third_party/zlib/crc32_simd.c
+ */
+
+pg_attribute_no_sanitize_alignment()
+pg_attribute_target("sse4.2,pclmul")
+pg_crc32c
+pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t length)
+{
+ ssize_t len = (ssize_t) length;
+ const unsigned char *buf = data;
+ /*
+ * Definitions of the bit-reflected domain constants k1,k2,k3, etc and
+ * the CRC32+Barrett polynomials given at the end of the paper.
+ */
+ static const uint64_t pg_attribute_aligned(16) k1k2[] = { 0x740eef02, 0x9e4addf8 };
+ static const uint64_t pg_attribute_aligned(16) k3k4[] = { 0xf20c0dfe, 0x14cd00bd6 };
+ static const uint64_t pg_attribute_aligned(16) k5k0[] = { 0xdd45aab8, 0x000000000 };
+ static const uint64_t pg_attribute_aligned(16) poly[] = { 0x105ec76f1, 0xdea713f1 };
+ if (len >= 64) {
+ __m128i x0, x1, x2, x3, x4, x5, x6, x7, x8, y5, y6, y7, y8;
+ /*
+ * There's at least one block of 64.
+ */
+ x1 = _mm_loadu_si128((__m128i *)(buf + 0x00));
+ x2 = _mm_loadu_si128((__m128i *)(buf + 0x10));
+ x3 = _mm_loadu_si128((__m128i *)(buf + 0x20));
+ x4 = _mm_loadu_si128((__m128i *)(buf + 0x30));
+ x1 = _mm_xor_si128(x1, _mm_cvtsi32_si128(crc));
+ x0 = _mm_load_si128((__m128i *)k1k2);
+ buf += 64;
+ len -= 64;
+ /*
+ * Parallel fold blocks of 64, if any.
+ */
+ while (len >= 64)
+ {
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x6 = _mm_clmulepi64_si128(x2, x0, 0x00);
+ x7 = _mm_clmulepi64_si128(x3, x0, 0x00);
+ x8 = _mm_clmulepi64_si128(x4, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x2 = _mm_clmulepi64_si128(x2, x0, 0x11);
+ x3 = _mm_clmulepi64_si128(x3, x0, 0x11);
+ x4 = _mm_clmulepi64_si128(x4, x0, 0x11);
+ y5 = _mm_loadu_si128((__m128i *)(buf + 0x00));
+ y6 = _mm_loadu_si128((__m128i *)(buf + 0x10));
+ y7 = _mm_loadu_si128((__m128i *)(buf + 0x20));
+ y8 = _mm_loadu_si128((__m128i *)(buf + 0x30));
+ x1 = _mm_xor_si128(x1, x5);
+ x2 = _mm_xor_si128(x2, x6);
+ x3 = _mm_xor_si128(x3, x7);
+ x4 = _mm_xor_si128(x4, x8);
+ x1 = _mm_xor_si128(x1, y5);
+ x2 = _mm_xor_si128(x2, y6);
+ x3 = _mm_xor_si128(x3, y7);
+ x4 = _mm_xor_si128(x4, y8);
+ buf += 64;
+ len -= 64;
+ }
+ /*
+ * Fold into 128-bits.
+ */
+ x0 = _mm_load_si128((__m128i *)k3k4);
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x1 = _mm_xor_si128(x1, x2);
+ x1 = _mm_xor_si128(x1, x5);
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x1 = _mm_xor_si128(x1, x3);
+ x1 = _mm_xor_si128(x1, x5);
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x1 = _mm_xor_si128(x1, x4);
+ x1 = _mm_xor_si128(x1, x5);
+ /*
+ * Single fold blocks of 16, if any.
+ */
+ while (len >= 16)
+ {
+ x2 = _mm_loadu_si128((__m128i *)buf);
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x1 = _mm_xor_si128(x1, x2);
+ x1 = _mm_xor_si128(x1, x5);
+ buf += 16;
+ len -= 16;
+ }
+ /*
+ * Fold 128-bits to 64-bits.
+ */
+ x2 = _mm_clmulepi64_si128(x1, x0, 0x10);
+ x3 = _mm_setr_epi32(~0, 0, ~0, 0);
+ x1 = _mm_srli_si128(x1, 8);
+ x1 = _mm_xor_si128(x1, x2);
+ x0 = _mm_loadl_epi64((__m128i*)k5k0);
+ x2 = _mm_srli_si128(x1, 4);
+ x1 = _mm_and_si128(x1, x3);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_xor_si128(x1, x2);
+ /*
+ * Barrett reduce to 32-bits.
+ */
+ x0 = _mm_load_si128((__m128i*)poly);
+ x2 = _mm_and_si128(x1, x3);
+ x2 = _mm_clmulepi64_si128(x2, x0, 0x10);
+ x2 = _mm_and_si128(x2, x3);
+ x2 = _mm_clmulepi64_si128(x2, x0, 0x00);
+ x1 = _mm_xor_si128(x1, x2);
+ crc = _mm_extract_epi32(x1, 1);
+ }
+
+ return pg_comp_crc32c_sse42_tail(crc, buf, len);
+}
--
2.48.1
v2-0001-Add-more-test-coverage-for-crc32c.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Add-more-test-coverage-for-crc32c.patchDownload
From d9c69f435cfecf46f2397091e36010af8c965f79 Mon Sep 17 00:00:00 2001
From: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Date: Tue, 4 Feb 2025 12:56:00 -0800
Subject: [PATCH v2 1/4] Add more test coverage for crc32c
---
src/test/regress/expected/crc32c.out | 42 ++++++++++++++++++++++++++++
src/test/regress/parallel_schedule | 2 ++
src/test/regress/sql/crc32c.sql | 12 ++++++++
3 files changed, 56 insertions(+)
create mode 100644 src/test/regress/expected/crc32c.out
create mode 100644 src/test/regress/sql/crc32c.sql
diff --git a/src/test/regress/expected/crc32c.out b/src/test/regress/expected/crc32c.out
new file mode 100644
index 0000000000..f25965df4a
--- /dev/null
+++ b/src/test/regress/expected/crc32c.out
@@ -0,0 +1,42 @@
+--
+-- CRC32C
+-- Testing CRC32C SSE4.2 algorithm.
+-- The new algorithm has various code paths that needs test coverage.
+-- We achieve that by computing CRC32C of text of various sizes: 15, 64, 128, 144, 159 and 256 bytes.
+--
+SELECT crc32c('');
+ crc32c
+--------
+ 0
+(1 row)
+
+SELECT crc32c('Hello 15 bytes!');
+ crc32c
+------------
+ 3405757121
+(1 row)
+
+SELECT crc32c('This is a 64 byte piece of text to run through the main loop ...');
+ crc32c
+-----------
+ 721494841
+(1 row)
+
+SELECT crc32c('This is a carefully constructed text that needs to be exactly 128 bytes long for testing purposes. Let me add more words to ....');
+ crc32c
+------------
+ 1602016964
+(1 row)
+
+SELECT crc32c('This is a text that needs to be exactly 144 bytes long for testing purposes. I will add more words to reach that specific length. Now we are ...');
+ crc32c
+------------
+ 1912862944
+(1 row)
+
+SELECT crc32c('This is a precisely crafted message that needs to be exactly 159 bytes in length for testing purposes. I will continue adding more text until we reach that ...');
+ crc32c
+------------
+ 1245879782
+(1 row)
+
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 1edd9e45eb..7c9dbf65db 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -56,6 +56,8 @@ test: create_aggregate create_function_sql create_cast constraints triggers sele
# ----------
test: sanity_check
+test: crc32c
+
# ----------
# Another group of parallel tests
# aggregates depends on create_aggregate
diff --git a/src/test/regress/sql/crc32c.sql b/src/test/regress/sql/crc32c.sql
new file mode 100644
index 0000000000..5e481eab6f
--- /dev/null
+++ b/src/test/regress/sql/crc32c.sql
@@ -0,0 +1,12 @@
+--
+-- CRC32C
+-- Testing CRC32C SSE4.2 algorithm.
+-- The new algorithm has various code paths that needs test coverage.
+-- We achieve that by computing CRC32C of text of various sizes: 15, 64, 128, 144, 159 and 256 bytes.
+--
+SELECT crc32c('');
+SELECT crc32c('Hello 15 bytes!');
+SELECT crc32c('This is a 64 byte piece of text to run through the main loop ...');
+SELECT crc32c('This is a carefully constructed text that needs to be exactly 128 bytes long for testing purposes. Let me add more words to ....');
+SELECT crc32c('This is a text that needs to be exactly 144 bytes long for testing purposes. I will add more words to reach that specific length. Now we are ...');
+SELECT crc32c('This is a precisely crafted message that needs to be exactly 159 bytes in length for testing purposes. I will continue adding more text until we reach that ...');
--
2.48.1
v2-0004-Shorter-version-from-corsix.patchtext/x-patch; charset=US-ASCII; name=v2-0004-Shorter-version-from-corsix.patchDownload
From 61295213e3133fd319e0b2bd18e1a9c16a4af140 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Sun, 9 Feb 2025 12:25:56 +0700
Subject: [PATCH v2 4/4] Shorter version from corsix
---
src/port/pg_crc32c_sse42.c | 165 ++++++++++++++-----------------------
1 file changed, 62 insertions(+), 103 deletions(-)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 69f8917c7d..dec685d139 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -78,114 +78,73 @@ pg_comp_crc32c_sse42_tail(pg_crc32c crc, const void *data, size_t len)
* https://chromium.googlesource.com/chromium/src/+/refs/heads/main/third_party/zlib/crc32_simd.c
*/
+#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
+#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
+
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2,pclmul")
pg_crc32c
-pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t length)
+pg_comp_crc32c_sse42(pg_crc32c crc0, const void *data, size_t length)
{
- ssize_t len = (ssize_t) length;
+ size_t len = length;
const unsigned char *buf = data;
- /*
- * Definitions of the bit-reflected domain constants k1,k2,k3, etc and
- * the CRC32+Barrett polynomials given at the end of the paper.
- */
- static const uint64_t pg_attribute_aligned(16) k1k2[] = { 0x740eef02, 0x9e4addf8 };
- static const uint64_t pg_attribute_aligned(16) k3k4[] = { 0xf20c0dfe, 0x14cd00bd6 };
- static const uint64_t pg_attribute_aligned(16) k5k0[] = { 0xdd45aab8, 0x000000000 };
- static const uint64_t pg_attribute_aligned(16) poly[] = { 0x105ec76f1, 0xdea713f1 };
- if (len >= 64) {
- __m128i x0, x1, x2, x3, x4, x5, x6, x7, x8, y5, y6, y7, y8;
- /*
- * There's at least one block of 64.
- */
- x1 = _mm_loadu_si128((__m128i *)(buf + 0x00));
- x2 = _mm_loadu_si128((__m128i *)(buf + 0x10));
- x3 = _mm_loadu_si128((__m128i *)(buf + 0x20));
- x4 = _mm_loadu_si128((__m128i *)(buf + 0x30));
- x1 = _mm_xor_si128(x1, _mm_cvtsi32_si128(crc));
- x0 = _mm_load_si128((__m128i *)k1k2);
- buf += 64;
- len -= 64;
- /*
- * Parallel fold blocks of 64, if any.
- */
- while (len >= 64)
- {
- x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
- x6 = _mm_clmulepi64_si128(x2, x0, 0x00);
- x7 = _mm_clmulepi64_si128(x3, x0, 0x00);
- x8 = _mm_clmulepi64_si128(x4, x0, 0x00);
- x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
- x2 = _mm_clmulepi64_si128(x2, x0, 0x11);
- x3 = _mm_clmulepi64_si128(x3, x0, 0x11);
- x4 = _mm_clmulepi64_si128(x4, x0, 0x11);
- y5 = _mm_loadu_si128((__m128i *)(buf + 0x00));
- y6 = _mm_loadu_si128((__m128i *)(buf + 0x10));
- y7 = _mm_loadu_si128((__m128i *)(buf + 0x20));
- y8 = _mm_loadu_si128((__m128i *)(buf + 0x30));
- x1 = _mm_xor_si128(x1, x5);
- x2 = _mm_xor_si128(x2, x6);
- x3 = _mm_xor_si128(x3, x7);
- x4 = _mm_xor_si128(x4, x8);
- x1 = _mm_xor_si128(x1, y5);
- x2 = _mm_xor_si128(x2, y6);
- x3 = _mm_xor_si128(x3, y7);
- x4 = _mm_xor_si128(x4, y8);
- buf += 64;
- len -= 64;
- }
- /*
- * Fold into 128-bits.
- */
- x0 = _mm_load_si128((__m128i *)k3k4);
- x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
- x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
- x1 = _mm_xor_si128(x1, x2);
- x1 = _mm_xor_si128(x1, x5);
- x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
- x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
- x1 = _mm_xor_si128(x1, x3);
- x1 = _mm_xor_si128(x1, x5);
- x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
- x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
- x1 = _mm_xor_si128(x1, x4);
- x1 = _mm_xor_si128(x1, x5);
- /*
- * Single fold blocks of 16, if any.
- */
- while (len >= 16)
- {
- x2 = _mm_loadu_si128((__m128i *)buf);
- x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
- x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
- x1 = _mm_xor_si128(x1, x2);
- x1 = _mm_xor_si128(x1, x5);
- buf += 16;
- len -= 16;
- }
- /*
- * Fold 128-bits to 64-bits.
- */
- x2 = _mm_clmulepi64_si128(x1, x0, 0x10);
- x3 = _mm_setr_epi32(~0, 0, ~0, 0);
- x1 = _mm_srli_si128(x1, 8);
- x1 = _mm_xor_si128(x1, x2);
- x0 = _mm_loadl_epi64((__m128i*)k5k0);
- x2 = _mm_srli_si128(x1, 4);
- x1 = _mm_and_si128(x1, x3);
- x1 = _mm_clmulepi64_si128(x1, x0, 0x00);
- x1 = _mm_xor_si128(x1, x2);
- /*
- * Barrett reduce to 32-bits.
- */
- x0 = _mm_load_si128((__m128i*)poly);
- x2 = _mm_and_si128(x1, x3);
- x2 = _mm_clmulepi64_si128(x2, x0, 0x10);
- x2 = _mm_and_si128(x2, x3);
- x2 = _mm_clmulepi64_si128(x2, x0, 0x00);
- x1 = _mm_xor_si128(x1, x2);
- crc = _mm_extract_epi32(x1, 1);
+
+ if (len >= 64) {
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i*)buf), y0;
+ __m128i x1 = _mm_loadu_si128((const __m128i*)(buf + 16)), y1;
+ __m128i x2 = _mm_loadu_si128((const __m128i*)(buf + 32)), y2;
+ __m128i x3 = _mm_loadu_si128((const __m128i*)(buf + 48)), y3;
+ __m128i k;
+ k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 64;
+ len -= 64;
+ /* Main loop. */
+ while (len >= 64) {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i*)buf)), x0 = _mm_xor_si128(x0, y0);
+ y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i*)(buf + 16))), x1 = _mm_xor_si128(x1, y1);
+ y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i*)(buf + 32))), x2 = _mm_xor_si128(x2, y2);
+ y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i*)(buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ buf += 64;
+ len -= 64;
+ }
+ /* Reduce x0 ... x3 to just x0. */
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
+ y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
+ k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ }
+ if (len >= 16) {
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i*)buf), y0;
+ __m128i k;
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 16;
+ len -= 16;
+ /* Main loop. */
+ while (len >= 16) {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i*)buf)), x0 = _mm_xor_si128(x0, y0);
+ buf += 16;
+ len -= 16;
}
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ }
- return pg_comp_crc32c_sse42_tail(crc, buf, len);
+ return pg_comp_crc32c_sse42_tail(crc0, buf, len);
}
--
2.48.1
Hi John,
> I'm highly suspicious of these numbers because they show this version
> is about 20x faster than "scalar", so relatively speaking 3x faster
> than the AVX-512 proposal?
Apologies for this. I was suspicious of this too, and it looks like I had unintentionally used the scalar version I wrote for testing CRC32C correctness (which computes the CRC one byte at a time) as the baseline. The code for microbenchmarking is all here: https://github.com/r-devulap/crc32c and this commit https://github.com/r-devulap/crc32c/commit/ca4d1b24fd8af87aab544fb1634523b6657325a0 fixes that. Rerunning the benchmarks gives me more sensible numbers:
Comparing scalar_crc32c to sse42_crc32c (from ./bench)
Benchmark                                  Time      CPU  Time Old  Time New  CPU Old  CPU New
----------------------------------------------------------------------------------------------
[scalar_crc32c vs. sse42_crc32c]/64     -0.0972  -0.0971         5         4        5        4
[scalar_crc32c vs. sse42_crc32c]/128    -0.3048  -0.3048         8         6        8        6
[scalar_crc32c vs. sse42_crc32c]/256    -0.4610  -0.4610        19        10       19       10
[scalar_crc32c vs. sse42_crc32c]/512    -0.6432  -0.6432        50        18       50       18
[scalar_crc32c vs. sse42_crc32c]/1024   -0.7192  -0.7192       121        34      121       34
[scalar_crc32c vs. sse42_crc32c]/2048   -0.7275  -0.7276       259        70      259        70
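For readers following along, a one-byte-at-a-time reference of the kind mentioned above might look like the sketch below. This is only an illustration, not the code from the benchmark repository: the name crc32c_bytewise is made up here, and it uses the standard reflected CRC32C polynomial with the usual all-ones initial value and final inversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reflected form of the CRC32C (Castagnoli) polynomial 0x1EDC6F41. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/*
 * Hypothetical reference routine (name is an assumption): handles one
 * byte per outer iteration and one bit per inner iteration. Slow, but
 * simple enough to serve as a correctness oracle.
 */
static uint32_t
crc32c_bytewise(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t	crc = 0xFFFFFFFFu;	/* standard initial value */

	assert(p != NULL || len == 0);

	while (len--)
	{
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & (0u - (crc & 1u)));
	}
	return crc ^ 0xFFFFFFFFu;	/* standard final inversion */
}
```

A routine like this is useful for checking a faster implementation, but benchmarking against it overstates the speedup, which is what happened with the first set of numbers.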
> Luckily, that's easily fixable: It turns out the implementation
> in the paper (and chromium) has a very inefficient finalization step,
> using more carryless multiplications and plenty of other operations.
> After the main loop, and at the end, it's much more efficient to
> convert the 128-bit intermediate result directly into a CRC in the
> usual way.
Thank you for pointing this out and also fixing it! This improves over the Chromium version by 10-25%, especially for smaller buffer sizes of 64 - 512 bytes:
Comparing sse42_crc32c to corsix_crc32c (from ./bench)
Benchmark                                  Time      CPU  Time Old  Time New  CPU Old  CPU New
----------------------------------------------------------------------------------------------
[sse42_crc32c vs. corsix_crc32c]/64     -0.2696  -0.2696         4         3        4        3
[sse42_crc32c vs. corsix_crc32c]/128    -0.1551  -0.1552         6         5        6        5
[sse42_crc32c vs. corsix_crc32c]/256    -0.1787  -0.1787        10         8       10        8
[sse42_crc32c vs. corsix_crc32c]/512    -0.1351  -0.1351        18        15       18       15
[sse42_crc32c vs. corsix_crc32c]/1024   -0.0972  -0.0972        34        31       34       31
[sse42_crc32c vs. corsix_crc32c]/2048   -0.0763  -0.0763        69        64       69       64
OVERALL_GEOMEAN                         -0.1544  -0.1544         0         0        0        0
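To make the cheaper finalization concrete: v2-0004 reduces the folded 128-bit value by feeding its two 64-bit halves through _mm_crc32_u64, starting from zero. The sketch below emulates that step in portable C so it can be tried without SSE4.2 hardware. The helper names (crc32c_raw_update, soft_crc32_u64, reduce_128_to_32) are made up for illustration; the real code uses the intrinsic directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Raw CRC32C update with no initial or final inversion, which is what
 * the x86 crc32 instruction computes for each input byte.
 */
static uint32_t
crc32c_raw_update(uint32_t crc, const unsigned char *p, size_t len)
{
	assert(p != NULL || len == 0);

	while (len--)
	{
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Software stand-in for _mm_crc32_u64: consumes v in little-endian order. */
static uint32_t
soft_crc32_u64(uint32_t crc, uint64_t v)
{
	unsigned char b[8];

	for (int i = 0; i < 8; i++)
		b[i] = (unsigned char) (v >> (8 * i));
	return crc32c_raw_update(crc, b, 8);
}

/* Reduce a folded 128-bit value, given as lo/hi halves, to 32 bits. */
static uint32_t
reduce_128_to_32(uint64_t lo, uint64_t hi)
{
	uint32_t	crc = soft_crc32_u64(0, lo);

	return soft_crc32_u64(crc, hi);
}
```

For a single 16-byte block, XORing the running CRC into the low 32 bits and then reducing gives the same result as the ordinary update over those 16 bytes, which is the property the shorter version relies on.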
> I generated a similar function for v2-0004 and this benchmark shows it's faster than master on 128 bytes and above.
I ran the same drive_crc32c benchmark with the Postgres infrastructure and found that your v2 SSE4.2 version from corsix is slower than pg_comp_crc32c_sse42 in the master branch when the buffer is smaller than 128 bytes. I think the reason is that Postgres does not use the -O3 flag to build the crc32c source files, so the compiler generates less than optimal code. Adding that flag fixes the regression for buffers of 64 - 128 bytes. Could you confirm that behavior on your end too?
---
src/port/pg_crc32c_sse42.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index a8c1e5609b..a350b1b93a 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -81,6 +81,7 @@ pg_comp_crc32c_sse42_tail(pg_crc32c crc, const void *data, size_t len)
#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
pg_attribute_no_sanitize_alignment()
+__attribute__((optimize("-O3")))
pg_attribute_target("sse4.2,pclmul")
pg_crc32c
pg_comp_crc32c_sse42(pg_crc32c crc0, const void *data, size_t len)
--
You could also just build with export CFLAGS="-O3" instead of adding the function attribute.
I did the benchmarks on my older machine, which I believe has a latency of 7 cycles for this instruction.
May I ask which processor your older machine has? I am benchmarking on a Tiger Lake processor.
It's probably okay to fold these together in the same compile-time
check, since both are fairly old by now, but for those following
along, pclmul is not in SSE 4.2 and is a bit newer. So this would
cause machines building on Nehalem (2008) to fail the check and go
back to slicing-by-8 with it written this way.
Technically, the current version of the patch does not have a runtime cpuid check for pclmul and so would crash with SIGILL on Nehalem (currently we only check for sse4.2). This needs to be fixed by adding an additional cpuid check for pclmul, but then it would fall back to slicing-by-8 on Nehalem and use the new version on Westmere and above. If you care about keeping the performance on Nehalem, then I am happy to update the choose function to pick the right pointer accordingly. Let me know which one you would prefer.
Raghuveer
On Tue, Feb 11, 2025 at 7:25 AM Devulapalli, Raghuveer
<raghuveer.devulapalli@intel.com> wrote:
I ran the same benchmark drive_crc32c with the postgres infrastructure and found that your v2 sse42 version from corsix is slower than pg_comp_crc32c_sse42 in master branch when buffer is < 128 bytes.
That matches my findings as well.
I think the reason is that postgres is not using the -O3 flag to build the crc32c source files, so the compiler generates less than optimal code. Adding that flag fixes the regression for buffers of 64 - 128 bytes. Could you confirm that behavior on your end too?
On my machine that still regresses compared to master in that range
(although by not as much) so I still think 128 bytes is the right
threshold.
The effect of -O3 with gcc14.2 is that the single-block loop (after
the 4-block loop) is unrolled. Unrolling adds branches and binary
space, so it'd be nice to avoid that even for systems that build with
-O3. I tried leaving out the single block loop, so that the tail is
called for the remaining <64 bytes, and it's actually better:
v2:
128
latency average = 10.256 ms
144
latency average = 11.393 ms
160
latency average = 12.977 ms
176
latency average = 14.364 ms
192
latency average = 12.627 ms
remove single-block loop:
128
latency average = 10.211 ms
144
latency average = 10.987 ms
160
latency average = 12.094 ms
176
latency average = 12.927 ms
192
latency average = 12.466 ms
Keeping the extra loop is probably better at this benchmark on newer
machines, but I don't think it's worth it. Microbenchmarks with fixed
sized input don't take branch mispredicts into account.
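To make the effect of dropping the single-block loop concrete, here is a minimal sketch of how a buffer would be split between the 64-byte main loop and the scalar tail. It assumes the vector path is entered only when len >= 64, as in the posted patch. vector_bytes and tail_bytes are hypothetical helper names, not functions from the patch.

```c
#include <stddef.h>

/* Bytes consumed by the 64-byte main loop (assumption: entered only when len >= 64). */
static size_t
vector_bytes(size_t len)
{
    return len >= 64 ? (len & ~(size_t) 63) : 0;
}

/* Bytes left for the scalar tail once the 16-byte single-block loop is removed. */
static size_t
tail_bytes(size_t len)
{
    return len - vector_bytes(len);
}
```

Under these assumptions, a 176-byte input leaves 48 bytes for the tail while a 192-byte input leaves none, which would be consistent with 192 being faster than 176 in the numbers above.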
I did the benchmarks on my older machine, which I believe has a latency of 7 cycles for this instruction.
May I ask which processor your older machine has? I am benchmarking on a Tiger Lake processor.
It's an i7-7500U / Kaby Lake.
It's probably okay to fold these together in the same compile-time
check, since both are fairly old by now, but for those following
along, pclmul is not in SSE 4.2 and is a bit newer. So this would
cause machines building on Nehalem (2008) to fail the check and go
back to slicing-by-8 with it written this way.
Technically, the current version of the patch does not have a runtime cpuid check for pclmul and so would crash with SIGILL on Nehalem (currently we only check for sse4.2). This needs to be fixed by adding an additional cpuid check for pclmul, but then it would fall back to slicing-by-8 on Nehalem and use the new version on Westmere and above. If you care about keeping the performance on Nehalem, then I am happy to update the choose function to pick the right pointer accordingly. Let me know which one you would prefer.
Okay, Nehalem is 17 years old, and the additional cpuid check would
still work on hardware 14-15 years old, so I think it's fine to bump
the requirement for runtime hardware support.
--
John Naylor
Amazon Web Services
Hello,
Attached is v3, which is the same as v2 with the added PCLMULQDQ runtime CPUID check.
I ran the same benchmark drive_crc32c with the postgres infrastructure and
found that your v2 sse42 version from corsix is slower than
pg_comp_crc32c_sse42 in master branch when buffer is < 128 bytes.
That matches my findings as well.
Never mind, I was building using the Makefile which doesn’t seem to add any optimization flag by default. I switched to using meson which uses -O2 and benchmarked using pgbench (using your script) and this behavior goes away on my TGL. Here is what I measure with your v2 (and v3):
| bytes | master (ms) | sse4.2-v2 (ms) | ratio |
| 64 | 9.627 | 6.306 | 1.52 |
| 80 | 10.976 | 6.662 | 1.64 |
| 96 | 12.411 | 8.212 | 1.51 |
| 112 | 13.871 | 9.403 | 1.47 |
| 128 | 15.283 | 7.724 | 1.97 |
| 144 | 16.715 | 9.173 | 1.82 |
| 160 | 18.18 | 11.292 | 1.60 |
| 176 | 19.847 | 12.606 | 1.57 |
| 192 | 22.043 | 10.16 | 2.16 |
| 208 | 24.261 | 11.699 | 2.07 |
| 224 | 26.63 | 13.607 | 1.95 |
| 240 | 28.994 | 14.721 | 1.96 |
| 256 | 31.418 | 13.132 | 2.39 |
On my machine that still regresses compared to master in that range (although by
not as much) so I still think 128 bytes is the right threshold.
On my TGL, buffer sizes as small as 64 bytes see performance benefits.
The effect of -O3 with gcc14.2 is that the single-block loop (after the 4-block loop)
is unrolled. Unrolling adds branches and binary space, so it'd be nice to avoid that
even for systems that build with -O3.
Agreed. My perf data shows -O2 is just as good.
Okay, Nehalem is 17 years old, and the additional cpuid check would still work on
hardware 14-15 years old, so I think it's fine to bump the requirement for runtime
hardware support.
Sounds good. I updated the runtime check to include PCLMULQDQ. The new algorithm will run only on Westmere and newer CPUs.
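A minimal sketch of the combined feature test, for those following along. It assumes the caller has already obtained the ECX value from CPUID leaf 1 (for example via the compiler's cpuid helper). The function and macro names are hypothetical. The bit positions are the documented ones: bit 20 for SSE 4.2 and bit 1 for PCLMULQDQ.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_ECX_PCLMULQDQ (UINT32_C(1) << 1)   /* CPUID.01H:ECX bit 1 */
#define CPUID_ECX_SSE42     (UINT32_C(1) << 20)  /* CPUID.01H:ECX bit 20 */

/* True only if both features are reported; otherwise fall back to slicing-by-8. */
static bool
ecx_has_sse42_and_pclmul(uint32_t ecx)
{
    return (ecx & CPUID_ECX_SSE42) != 0 && (ecx & CPUID_ECX_PCLMULQDQ) != 0;
}
```

A Nehalem-class CPU would report SSE 4.2 without PCLMULQDQ, so this test returns false there and the slicing-by-8 implementation would be chosen.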
Raghuveer
Attachments:
v3-0001-Add-more-test-coverage-for-crc32c.patch
From 834169842ca0dbadcfcffbf0cce9ab71b487005e Mon Sep 17 00:00:00 2001
From: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Date: Tue, 4 Feb 2025 12:56:00 -0800
Subject: [PATCH v3 1/4] Add more test coverage for crc32c
---
src/test/regress/expected/crc32c.out | 42 ++++++++++++++++++++++++++++
src/test/regress/parallel_schedule | 2 ++
src/test/regress/sql/crc32c.sql | 12 ++++++++
3 files changed, 56 insertions(+)
create mode 100644 src/test/regress/expected/crc32c.out
create mode 100644 src/test/regress/sql/crc32c.sql
diff --git a/src/test/regress/expected/crc32c.out b/src/test/regress/expected/crc32c.out
new file mode 100644
index 0000000000..f25965df4a
--- /dev/null
+++ b/src/test/regress/expected/crc32c.out
@@ -0,0 +1,42 @@
+--
+-- CRC32C
+-- Testing CRC32C SSE4.2 algorithm.
+-- The new algorithm has various code paths that needs test coverage.
+-- We achieve that by computing CRC32C of text of various sizes: 15, 64, 128, 144, 159 and 256 bytes.
+--
+SELECT crc32c('');
+ crc32c
+--------
+ 0
+(1 row)
+
+SELECT crc32c('Hello 15 bytes!');
+ crc32c
+------------
+ 3405757121
+(1 row)
+
+SELECT crc32c('This is a 64 byte piece of text to run through the main loop ...');
+ crc32c
+-----------
+ 721494841
+(1 row)
+
+SELECT crc32c('This is a carefully constructed text that needs to be exactly 128 bytes long for testing purposes. Let me add more words to ....');
+ crc32c
+------------
+ 1602016964
+(1 row)
+
+SELECT crc32c('This is a text that needs to be exactly 144 bytes long for testing purposes. I will add more words to reach that specific length. Now we are ...');
+ crc32c
+------------
+ 1912862944
+(1 row)
+
+SELECT crc32c('This is a precisely crafted message that needs to be exactly 159 bytes in length for testing purposes. I will continue adding more text until we reach that ...');
+ crc32c
+------------
+ 1245879782
+(1 row)
+
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index e63ee2cf2b..73a84e6def 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -56,6 +56,8 @@ test: create_aggregate create_function_sql create_cast constraints triggers sele
# ----------
test: sanity_check
+test: crc32c
+
# ----------
# Another group of parallel tests
# aggregates depends on create_aggregate
diff --git a/src/test/regress/sql/crc32c.sql b/src/test/regress/sql/crc32c.sql
new file mode 100644
index 0000000000..5e481eab6f
--- /dev/null
+++ b/src/test/regress/sql/crc32c.sql
@@ -0,0 +1,12 @@
+--
+-- CRC32C
+-- Testing CRC32C SSE4.2 algorithm.
+-- The new algorithm has various code paths that needs test coverage.
+-- We achieve that by computing CRC32C of text of various sizes: 15, 64, 128, 144, 159 and 256 bytes.
+--
+SELECT crc32c('');
+SELECT crc32c('Hello 15 bytes!');
+SELECT crc32c('This is a 64 byte piece of text to run through the main loop ...');
+SELECT crc32c('This is a carefully constructed text that needs to be exactly 128 bytes long for testing purposes. Let me add more words to ....');
+SELECT crc32c('This is a text that needs to be exactly 144 bytes long for testing purposes. I will add more words to reach that specific length. Now we are ...');
+SELECT crc32c('This is a precisely crafted message that needs to be exactly 159 bytes in length for testing purposes. I will continue adding more text until we reach that ...');
--
2.43.0
v3-0002-Add-a-Postgres-SQL-function-for-crc32c-benchmarki.patch
From 8afcb629c47ff97bd38e48d2a75cac36aaa3e603 Mon Sep 17 00:00:00 2001
From: Paul Amonson <paul.d.amonson@intel.com>
Date: Mon, 6 May 2024 08:34:17 -0700
Subject: [PATCH v3 2/4] Add a Postgres SQL function for crc32c benchmarking
Add a drive_crc32c() function to use for benchmarking crc32c
computation. The function takes 2 arguments:
(1) count: num of times CRC32C is computed in a loop.
(2) num: #bytes in the buffer to calculate crc over.
XXX not for commit
Extracted from a patch by Raghuveer Devulapalli
---
contrib/meson.build | 1 +
contrib/test_crc32c/Makefile | 20 +++++++
contrib/test_crc32c/expected/test_crc32c.out | 57 ++++++++++++++++++++
contrib/test_crc32c/meson.build | 34 ++++++++++++
contrib/test_crc32c/sql/test_crc32c.sql | 3 ++
contrib/test_crc32c/test_crc32c--1.0.sql | 1 +
contrib/test_crc32c/test_crc32c.c | 47 ++++++++++++++++
contrib/test_crc32c/test_crc32c.control | 4 ++
8 files changed, 167 insertions(+)
create mode 100644 contrib/test_crc32c/Makefile
create mode 100644 contrib/test_crc32c/expected/test_crc32c.out
create mode 100644 contrib/test_crc32c/meson.build
create mode 100644 contrib/test_crc32c/sql/test_crc32c.sql
create mode 100644 contrib/test_crc32c/test_crc32c--1.0.sql
create mode 100644 contrib/test_crc32c/test_crc32c.c
create mode 100644 contrib/test_crc32c/test_crc32c.control
diff --git a/contrib/meson.build b/contrib/meson.build
index 1ba73ebd67..06673db062 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('test_crc32c')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/test_crc32c/Makefile b/contrib/test_crc32c/Makefile
new file mode 100644
index 0000000000..5b747c6184
--- /dev/null
+++ b/contrib/test_crc32c/Makefile
@@ -0,0 +1,20 @@
+MODULE_big = test_crc32c
+OBJS = test_crc32c.o
+PGFILEDESC = "test"
+EXTENSION = test_crc32c
+DATA = test_crc32c--1.0.sql
+
+first: all
+
+# test_crc32c.o: CFLAGS+=-g
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/test_crc32c
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/test_crc32c/expected/test_crc32c.out b/contrib/test_crc32c/expected/test_crc32c.out
new file mode 100644
index 0000000000..dff6bb3133
--- /dev/null
+++ b/contrib/test_crc32c/expected/test_crc32c.out
@@ -0,0 +1,57 @@
+CREATE EXTENSION test_crc32c;
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
+ drive_crc32c
+--------------
+ 532139994
+ 2103623867
+ 785984197
+ 2686825890
+ 3213049059
+ 3819630168
+ 1389234603
+ 534072900
+ 2930108140
+ 2496889855
+ 1475239611
+ 136366931
+ 3067402116
+ 2012717871
+ 3682416023
+ 2054270645
+ 1817339875
+ 4100939569
+ 1192727539
+ 3636976218
+ 369764421
+ 3161609879
+ 1067984880
+ 1235066769
+ 3138425899
+ 648132037
+ 4203750233
+ 1330187888
+ 2683521348
+ 1951644495
+ 2574090107
+ 3904902018
+ 3772697795
+ 1644686344
+ 2868962106
+ 3369218491
+ 3902689890
+ 3456411865
+ 141004025
+ 1504497996
+ 3782655204
+ 3544797610
+ 3429174879
+ 2524728016
+ 3935861181
+ 25498897
+ 692684159
+ 345705535
+ 2761600287
+ 2654632420
+ 3945991399
+(51 rows)
+
diff --git a/contrib/test_crc32c/meson.build b/contrib/test_crc32c/meson.build
new file mode 100644
index 0000000000..d7bec4ba1c
--- /dev/null
+++ b/contrib/test_crc32c/meson.build
@@ -0,0 +1,34 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+test_crc32c_sources = files(
+ 'test_crc32c.c',
+)
+
+if host_system == 'windows'
+ test_crc32c_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_crc32c',
+ '--FILEDESC', 'test_crc32c - test code for crc32c library',])
+endif
+
+test_crc32c = shared_module('test_crc32c',
+ test_crc32c_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += test_crc32c
+
+install_data(
+ 'test_crc32c.control',
+ 'test_crc32c--1.0.sql',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'test_crc32c',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'test_crc32c',
+ ],
+ },
+}
diff --git a/contrib/test_crc32c/sql/test_crc32c.sql b/contrib/test_crc32c/sql/test_crc32c.sql
new file mode 100644
index 0000000000..95c6dfe448
--- /dev/null
+++ b/contrib/test_crc32c/sql/test_crc32c.sql
@@ -0,0 +1,3 @@
+CREATE EXTENSION test_crc32c;
+
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
diff --git a/contrib/test_crc32c/test_crc32c--1.0.sql b/contrib/test_crc32c/test_crc32c--1.0.sql
new file mode 100644
index 0000000000..52b9772f90
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c--1.0.sql
@@ -0,0 +1 @@
+CREATE FUNCTION drive_crc32c (count int, num int) RETURNS bigint AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
new file mode 100644
index 0000000000..b350caf5ce
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -0,0 +1,47 @@
+/* select drive_crc32c(1000000, 1024); */
+
+#include "postgres.h"
+#include "fmgr.h"
+#include "port/pg_crc32c.h"
+#include "common/pg_prng.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * drive_crc32c(count: int, num: int) returns bigint
+ *
+ * count is the number of loops to perform
+ *
+ * num is the number of bytes in the buffer to calculate
+ * crc32c over.
+ */
+PG_FUNCTION_INFO_V1(drive_crc32c);
+Datum
+drive_crc32c(PG_FUNCTION_ARGS)
+{
+ int64 count = PG_GETARG_INT32(0);	/* SQL signature declares int (int4) */
+ int64 num = PG_GETARG_INT32(1);	/* SQL signature declares int (int4) */
+ char* data = malloc((size_t)num);
+ pg_crc32c crc;
+ pg_prng_state state;
+ uint64 seed = 42;
+ pg_prng_seed(&state, seed);
+ /* set random data */
+ for (uint64 i = 0; i < num; i++)
+ {
+ data[i] = pg_prng_uint32(&state) % 255;
+ }
+
+ INIT_CRC32C(crc);
+
+ while(count--)
+ {
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, data, num);
+ FIN_CRC32C(crc);
+ }
+
+ free((void *)data);
+
+ PG_RETURN_INT64((int64_t)crc);
+}
diff --git a/contrib/test_crc32c/test_crc32c.control b/contrib/test_crc32c/test_crc32c.control
new file mode 100644
index 0000000000..878a077ee1
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.control
@@ -0,0 +1,4 @@
+comment = 'test'
+default_version = '1.0'
+module_pathname = '$libdir/test_crc32c'
+relocatable = true
--
2.43.0
v3-0003-Improve-CRC32C-performance-on-SSE4.2.patch
From 8bf013c5d1bf52eab2c877a443324d3e46d18052 Mon Sep 17 00:00:00 2001
From: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Date: Tue, 4 Feb 2025 15:20:13 -0800
Subject: [PATCH v3 3/4] Improve CRC32C performance on SSE4.2
The current SSE4.2 implementation of CRC32C relies on the native crc32
instruction and processes 8 bytes at a time in a loop. The technique in
the paper below uses the pclmulqdq instruction and processes 64 bytes at
a time.
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction"
V. Gopal, E. Ozturk, et al., 2009
The algorithm is based on crc32_sse42_simd from Chromium's copy of zlib. See:
https://chromium.googlesource.com/chromium/src/+/refs/heads/main/third_party/zlib/crc32_simd.c
Microbenchmarks generated from standalone benchmarks:
see https://github.com/r-devulap/crc32c
Comparing scalar_crc32c (master) to sse42_crc32c (this branch):
|----------------------------------+------------------+---------------+---------------|
| Benchmark | buf size (bytes) | Time Old (ns) | Time New (ns) |
|----------------------------------+------------------+---------------+---------------|
| [scalar_crc32c vs. sse42_crc32c] | 64 | 5 | 4 |
| [scalar_crc32c vs. sse42_crc32c] | 128 | 8 | 6 |
| [scalar_crc32c vs. sse42_crc32c] | 256 | 19 | 10 |
| [scalar_crc32c vs. sse42_crc32c] | 512 | 50 | 18 |
| [scalar_crc32c vs. sse42_crc32c] | 1024 | 121 | 34 |
| [scalar_crc32c vs. sse42_crc32c] | 2048 | 259 | 70 |
|----------------------------------+------------------+---------------+---------------|
---
config/c-compiler.m4 | 7 +-
configure | 7 +-
meson.build | 7 +-
src/port/pg_crc32c_sse42.c | 127 +++++++++++++++++++++++++++++-
src/port/pg_crc32c_sse42_choose.c | 8 +-
5 files changed, 146 insertions(+), 10 deletions(-)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 8534cc54c1..8b255b5cc8 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -557,14 +557,19 @@ AC_DEFUN([PGAC_SSE42_CRC32_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_sse42_crc32_intrinsics])])dnl
AC_CACHE_CHECK([for _mm_crc32_u8 and _mm_crc32_u32], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <nmmintrin.h>
+ #include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
- __attribute__((target("sse4.2")))
+ __attribute__((target("sse4.2,pclmul")))
#endif
static int crc32_sse42_test(void)
+
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00); // pclmul
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}],
diff --git a/configure b/configure
index 0ffcaeb436..3f2a2a515e 100755
--- a/configure
+++ b/configure
@@ -17059,14 +17059,19 @@ else
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h. */
#include <nmmintrin.h>
+ #include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
- __attribute__((target("sse4.2")))
+ __attribute__((target("sse4.2,pclmul")))
#endif
static int crc32_sse42_test(void)
+
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00);
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}
diff --git a/meson.build b/meson.build
index 1ceadb9a83..456c3fafc3 100644
--- a/meson.build
+++ b/meson.build
@@ -2227,15 +2227,18 @@ if host_cpu == 'x86' or host_cpu == 'x86_64'
prog = '''
#include <nmmintrin.h>
-
+#include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
-__attribute__((target("sse4.2")))
+__attribute__((target("sse4.2,pclmul")))
#endif
int main(void)
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00); // pclmul
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df3..69f8917c7d 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,13 +15,13 @@
#include "c.h"
#include <nmmintrin.h>
-
+#include <wmmintrin.h>
#include "port/pg_crc32c.h"
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
-pg_crc32c
-pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
+static pg_crc32c
+pg_comp_crc32c_sse42_tail(pg_crc32c crc, const void *data, size_t len)
{
const unsigned char *p = data;
const unsigned char *pend = p + len;
@@ -68,3 +68,124 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
return crc;
}
+
+/*
+ * Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
+ * Instruction" V. Gopal, E. Ozturk, et al., 2009
+ *
+ * The algorithm is based on crc32_sse42_simd from Chromium's copy of zlib.
+ * See:
+ * https://chromium.googlesource.com/chromium/src/+/refs/heads/main/third_party/zlib/crc32_simd.c
+ */
+
+pg_attribute_no_sanitize_alignment()
+pg_attribute_target("sse4.2,pclmul")
+pg_crc32c
+pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t length)
+{
+ ssize_t len = (ssize_t) length;
+ const unsigned char *buf = data;
+ /*
+ * Definitions of the bit-reflected domain constants k1,k2,k3, etc and
+ * the CRC32+Barrett polynomials given at the end of the paper.
+ */
+ static const uint64_t pg_attribute_aligned(16) k1k2[] = { 0x740eef02, 0x9e4addf8 };
+ static const uint64_t pg_attribute_aligned(16) k3k4[] = { 0xf20c0dfe, 0x14cd00bd6 };
+ static const uint64_t pg_attribute_aligned(16) k5k0[] = { 0xdd45aab8, 0x000000000 };
+ static const uint64_t pg_attribute_aligned(16) poly[] = { 0x105ec76f1, 0xdea713f1 };
+ if (len >= 64) {
+ __m128i x0, x1, x2, x3, x4, x5, x6, x7, x8, y5, y6, y7, y8;
+ /*
+ * There's at least one block of 64.
+ */
+ x1 = _mm_loadu_si128((__m128i *)(buf + 0x00));
+ x2 = _mm_loadu_si128((__m128i *)(buf + 0x10));
+ x3 = _mm_loadu_si128((__m128i *)(buf + 0x20));
+ x4 = _mm_loadu_si128((__m128i *)(buf + 0x30));
+ x1 = _mm_xor_si128(x1, _mm_cvtsi32_si128(crc));
+ x0 = _mm_load_si128((__m128i *)k1k2);
+ buf += 64;
+ len -= 64;
+ /*
+ * Parallel fold blocks of 64, if any.
+ */
+ while (len >= 64)
+ {
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x6 = _mm_clmulepi64_si128(x2, x0, 0x00);
+ x7 = _mm_clmulepi64_si128(x3, x0, 0x00);
+ x8 = _mm_clmulepi64_si128(x4, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x2 = _mm_clmulepi64_si128(x2, x0, 0x11);
+ x3 = _mm_clmulepi64_si128(x3, x0, 0x11);
+ x4 = _mm_clmulepi64_si128(x4, x0, 0x11);
+ y5 = _mm_loadu_si128((__m128i *)(buf + 0x00));
+ y6 = _mm_loadu_si128((__m128i *)(buf + 0x10));
+ y7 = _mm_loadu_si128((__m128i *)(buf + 0x20));
+ y8 = _mm_loadu_si128((__m128i *)(buf + 0x30));
+ x1 = _mm_xor_si128(x1, x5);
+ x2 = _mm_xor_si128(x2, x6);
+ x3 = _mm_xor_si128(x3, x7);
+ x4 = _mm_xor_si128(x4, x8);
+ x1 = _mm_xor_si128(x1, y5);
+ x2 = _mm_xor_si128(x2, y6);
+ x3 = _mm_xor_si128(x3, y7);
+ x4 = _mm_xor_si128(x4, y8);
+ buf += 64;
+ len -= 64;
+ }
+ /*
+ * Fold into 128-bits.
+ */
+ x0 = _mm_load_si128((__m128i *)k3k4);
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x1 = _mm_xor_si128(x1, x2);
+ x1 = _mm_xor_si128(x1, x5);
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x1 = _mm_xor_si128(x1, x3);
+ x1 = _mm_xor_si128(x1, x5);
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x1 = _mm_xor_si128(x1, x4);
+ x1 = _mm_xor_si128(x1, x5);
+ /*
+ * Single fold blocks of 16, if any.
+ */
+ while (len >= 16)
+ {
+ x2 = _mm_loadu_si128((__m128i *)buf);
+ x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
+ x1 = _mm_xor_si128(x1, x2);
+ x1 = _mm_xor_si128(x1, x5);
+ buf += 16;
+ len -= 16;
+ }
+ /*
+ * Fold 128-bits to 64-bits.
+ */
+ x2 = _mm_clmulepi64_si128(x1, x0, 0x10);
+ x3 = _mm_setr_epi32(~0, 0, ~0, 0);
+ x1 = _mm_srli_si128(x1, 8);
+ x1 = _mm_xor_si128(x1, x2);
+ x0 = _mm_loadl_epi64((__m128i*)k5k0);
+ x2 = _mm_srli_si128(x1, 4);
+ x1 = _mm_and_si128(x1, x3);
+ x1 = _mm_clmulepi64_si128(x1, x0, 0x00);
+ x1 = _mm_xor_si128(x1, x2);
+ /*
+ * Barret reduce to 32-bits.
+ */
+ x0 = _mm_load_si128((__m128i*)poly);
+ x2 = _mm_and_si128(x1, x3);
+ x2 = _mm_clmulepi64_si128(x2, x0, 0x10);
+ x2 = _mm_and_si128(x2, x3);
+ x2 = _mm_clmulepi64_si128(x2, x0, 0x00);
+ x1 = _mm_xor_si128(x1, x2);
+ crc = _mm_extract_epi32(x1, 1);
+ }
+
+ return pg_comp_crc32c_sse42_tail(crc, buf, len);
+}
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 65dbc4d424..57e4ad5dfd 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -31,7 +31,7 @@
#include "port/pg_crc32c.h"
static bool
-pg_crc32c_sse42_available(void)
+pg_crc32c_sse42_pclmul_available(void)
{
unsigned int exx[4] = {0, 0, 0, 0};
@@ -43,7 +43,9 @@ pg_crc32c_sse42_available(void)
#error cpuid instruction not available
#endif
- return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
+ bool sse42 = (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
+ bool pclmul = (exx[2] & (1 << 1)) != 0; /* PCLMULQDQ */
+ return sse42 && pclmul;
}
/*
@@ -53,7 +55,7 @@ pg_crc32c_sse42_available(void)
static pg_crc32c
pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
{
- if (pg_crc32c_sse42_available())
+ if (pg_crc32c_sse42_pclmul_available())
pg_comp_crc32c = pg_comp_crc32c_sse42;
else
pg_comp_crc32c = pg_comp_crc32c_sb8;
--
2.43.0
v3-0004-Shorter-version-from-corsix.patch
From f8d60abb5851ea713561da1ccababc8f94206ee3 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Sun, 9 Feb 2025 12:25:56 +0700
Subject: [PATCH v3 4/4] Shorter version from corsix
---
src/port/pg_crc32c_sse42.c | 165 ++++++++++++++-----------------------
1 file changed, 62 insertions(+), 103 deletions(-)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 69f8917c7d..dec685d139 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -78,114 +78,73 @@ pg_comp_crc32c_sse42_tail(pg_crc32c crc, const void *data, size_t len)
* https://chromium.googlesource.com/chromium/src/+/refs/heads/main/third_party/zlib/crc32_simd.c
*/
+#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
+#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
+
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2,pclmul")
pg_crc32c
-pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t length)
+pg_comp_crc32c_sse42(pg_crc32c crc0, const void *data, size_t length)
{
- ssize_t len = (ssize_t) length;
+ size_t len = length;
const unsigned char *buf = data;
- /*
- * Definitions of the bit-reflected domain constants k1,k2,k3, etc and
- * the CRC32+Barrett polynomials given at the end of the paper.
- */
- static const uint64_t pg_attribute_aligned(16) k1k2[] = { 0x740eef02, 0x9e4addf8 };
- static const uint64_t pg_attribute_aligned(16) k3k4[] = { 0xf20c0dfe, 0x14cd00bd6 };
- static const uint64_t pg_attribute_aligned(16) k5k0[] = { 0xdd45aab8, 0x000000000 };
- static const uint64_t pg_attribute_aligned(16) poly[] = { 0x105ec76f1, 0xdea713f1 };
- if (len >= 64) {
- __m128i x0, x1, x2, x3, x4, x5, x6, x7, x8, y5, y6, y7, y8;
- /*
- * There's at least one block of 64.
- */
- x1 = _mm_loadu_si128((__m128i *)(buf + 0x00));
- x2 = _mm_loadu_si128((__m128i *)(buf + 0x10));
- x3 = _mm_loadu_si128((__m128i *)(buf + 0x20));
- x4 = _mm_loadu_si128((__m128i *)(buf + 0x30));
- x1 = _mm_xor_si128(x1, _mm_cvtsi32_si128(crc));
- x0 = _mm_load_si128((__m128i *)k1k2);
- buf += 64;
- len -= 64;
- /*
- * Parallel fold blocks of 64, if any.
- */
- while (len >= 64)
- {
- x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
- x6 = _mm_clmulepi64_si128(x2, x0, 0x00);
- x7 = _mm_clmulepi64_si128(x3, x0, 0x00);
- x8 = _mm_clmulepi64_si128(x4, x0, 0x00);
- x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
- x2 = _mm_clmulepi64_si128(x2, x0, 0x11);
- x3 = _mm_clmulepi64_si128(x3, x0, 0x11);
- x4 = _mm_clmulepi64_si128(x4, x0, 0x11);
- y5 = _mm_loadu_si128((__m128i *)(buf + 0x00));
- y6 = _mm_loadu_si128((__m128i *)(buf + 0x10));
- y7 = _mm_loadu_si128((__m128i *)(buf + 0x20));
- y8 = _mm_loadu_si128((__m128i *)(buf + 0x30));
- x1 = _mm_xor_si128(x1, x5);
- x2 = _mm_xor_si128(x2, x6);
- x3 = _mm_xor_si128(x3, x7);
- x4 = _mm_xor_si128(x4, x8);
- x1 = _mm_xor_si128(x1, y5);
- x2 = _mm_xor_si128(x2, y6);
- x3 = _mm_xor_si128(x3, y7);
- x4 = _mm_xor_si128(x4, y8);
- buf += 64;
- len -= 64;
- }
- /*
- * Fold into 128-bits.
- */
- x0 = _mm_load_si128((__m128i *)k3k4);
- x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
- x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
- x1 = _mm_xor_si128(x1, x2);
- x1 = _mm_xor_si128(x1, x5);
- x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
- x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
- x1 = _mm_xor_si128(x1, x3);
- x1 = _mm_xor_si128(x1, x5);
- x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
- x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
- x1 = _mm_xor_si128(x1, x4);
- x1 = _mm_xor_si128(x1, x5);
- /*
- * Single fold blocks of 16, if any.
- */
- while (len >= 16)
- {
- x2 = _mm_loadu_si128((__m128i *)buf);
- x5 = _mm_clmulepi64_si128(x1, x0, 0x00);
- x1 = _mm_clmulepi64_si128(x1, x0, 0x11);
- x1 = _mm_xor_si128(x1, x2);
- x1 = _mm_xor_si128(x1, x5);
- buf += 16;
- len -= 16;
- }
- /*
- * Fold 128-bits to 64-bits.
- */
- x2 = _mm_clmulepi64_si128(x1, x0, 0x10);
- x3 = _mm_setr_epi32(~0, 0, ~0, 0);
- x1 = _mm_srli_si128(x1, 8);
- x1 = _mm_xor_si128(x1, x2);
- x0 = _mm_loadl_epi64((__m128i*)k5k0);
- x2 = _mm_srli_si128(x1, 4);
- x1 = _mm_and_si128(x1, x3);
- x1 = _mm_clmulepi64_si128(x1, x0, 0x00);
- x1 = _mm_xor_si128(x1, x2);
- /*
- * Barret reduce to 32-bits.
- */
- x0 = _mm_load_si128((__m128i*)poly);
- x2 = _mm_and_si128(x1, x3);
- x2 = _mm_clmulepi64_si128(x2, x0, 0x10);
- x2 = _mm_and_si128(x2, x3);
- x2 = _mm_clmulepi64_si128(x2, x0, 0x00);
- x1 = _mm_xor_si128(x1, x2);
- crc = _mm_extract_epi32(x1, 1);
+
+ if (len >= 64) {
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i*)buf), y0;
+ __m128i x1 = _mm_loadu_si128((const __m128i*)(buf + 16)), y1;
+ __m128i x2 = _mm_loadu_si128((const __m128i*)(buf + 32)), y2;
+ __m128i x3 = _mm_loadu_si128((const __m128i*)(buf + 48)), y3;
+ __m128i k;
+ k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 64;
+ len -= 64;
+ /* Main loop. */
+ while (len >= 64) {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i*)buf)), x0 = _mm_xor_si128(x0, y0);
+ y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i*)(buf + 16))), x1 = _mm_xor_si128(x1, y1);
+ y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i*)(buf + 32))), x2 = _mm_xor_si128(x2, y2);
+ y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i*)(buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ buf += 64;
+ len -= 64;
+ }
+ /* Reduce x0 ... x3 to just x0. */
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
+ y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
+ k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ }
+ if (len >= 16) {
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i*)buf), y0;
+ __m128i k;
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 16;
+ len -= 16;
+ /* Main loop. */
+ while (len >= 16) {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i*)buf)), x0 = _mm_xor_si128(x0, y0);
+ buf += 16;
+ len -= 16;
}
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ }
- return pg_comp_crc32c_sse42_tail(crc, buf, len);
+ return pg_comp_crc32c_sse42_tail(crc0, buf, len);
}
--
2.43.0
On Wed, Feb 12, 2025 at 4:34 AM Devulapalli, Raghuveer
<raghuveer.devulapalli@intel.com> wrote:
On my machine that still regresses compared to master in that range (although
not by as much) so I still think 128 bytes is the right threshold.
On my TGL, buffer sizes as small as 64 bytes see performance benefits.
Yeah, I'm guessing it's because newer chips will have better IPC. We
need to take care not to regress for hardware that's only 5-10 years
old.
Attached is v4:
0001: Same perf test, just in case
0002-4: I redid adding the implementation (without the single-block
loop and with 128-byte threshold), but split out the steps for
reference, as a model for possible ARM support in the future. These
will all be squashed on commit. The upstream code has very long lines,
even after running pgindent, so some may find that objectionable. We
could easily turn some commas into semicolons, and then it'll wrap
more nicely. I just wanted to change as little as possible for now. (I
also need to check if I need to put more license text here..)
0005: This has a fleshed-out draft commit message, but otherwise is
just the same configure/choose support as v3.
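To make the effect of the 128-byte threshold concrete, here is a minimal sketch of the length bookkeeping only (not the CRC arithmetic). The names crc_split and crc_split_length are hypothetical and do not appear in the patch; the control flow is assumed to mirror the "if (len >= 128)" block followed by the 64-byte main loop in v4.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch: models how many bytes the
 * vector path would consume and how many are left for the tail function.
 */
typedef struct
{
	size_t		vector_bytes;	/* consumed by the 64-byte folding path */
	size_t		tail_bytes;		/* left for the 8-byte/1-byte tail loop */
} crc_split;

static crc_split
crc_split_length(size_t len)
{
	crc_split	s = {0, len};

	if (len >= 128)
	{
		/* first 64-byte chunk is loaded before the main loop */
		s.vector_bytes = 64;
		len -= 64;

		/* each main-loop iteration folds in another 64 bytes */
		while (len >= 64)
		{
			s.vector_bytes += 64;
			len -= 64;
		}
		s.tail_bytes = len;
	}
	return s;
}
```

Under these assumptions, inputs shorter than 128 bytes never touch the vector path, and longer inputs leave at most 63 bytes for the tail, which is why lengths just around the threshold are the interesting ones to test.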
Some review comments:
1. Some of the comments that only mention SSE 4.2 in the compile- and
run-time checks need to be updated.
Okay, Nehalem is 17 years old, and the additional cpuid check would still work on
hardware 14-15 years old, so I think it's fine to bump the requirement for runtime
hardware support.
Sounds good. I updated the runtime check to include PCLMULQDQ. The new algorithm will run only on Westmere and newer CPUs.
2. Unfortunately, there is another wrinkle that I failed to consider: If you
search the web for "VirtualBox pclmulqdq" you can see a few reports from not
very long ago that some hypervisors don't enable the CPUID for pclmul. I
don't know how big a problem that is in practice today, but it seems we
should actually have separate checks, with fallback. Sorry I didn't
think of this earlier.
3. Note: I left out the new test file from v3-0001. We should have
tests, but note we already have some CRC tests in
src/test/regress/sql/strings.sql -- let's put new ones there. Also,
for the longer strings we want to test, it's easier to read/verify to
use something like
SELECT crc32c(repeat('A', 128)::bytea);
Maybe it's sufficient to have 127, 128, 129 for lengths, and maybe a
couple more.
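When adding test cases around the threshold, it can help to cross-check expected values against an implementation that is slow but easy to audit. The following is a minimal sketch, assuming the standard reflected Castagnoli polynomial 0x82F63B78 with initial value and final XOR of 0xFFFFFFFF; crc32c_reference is a hypothetical name and is not part of the patch or of PostgreSQL.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected form of the CRC-32C (Castagnoli) polynomial. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/*
 * Hypothetical reference implementation: processes one bit at a time, so
 * it is far too slow for production use but simple enough to verify by eye.
 */
static uint32_t
crc32c_reference(const unsigned char *buf, size_t len)
{
	uint32_t	crc = 0xFFFFFFFFu;	/* initial value */

	for (size_t i = 0; i < len; i++)
	{
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
		{
			/* mask is all ones when the low bit is set, else zero */
			uint32_t	mask = 0u - (crc & 1u);

			crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & mask);
		}
	}
	return crc ^ 0xFFFFFFFFu;		/* final XOR */
}
```

The well-known check value for the nine ASCII bytes "123456789" is 0xE3069283, which gives a quick sanity test before generating expected outputs for buffers of 127, 128 and 129 bytes.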
--
John Naylor
Amazon Web Services
Attachments:
v4-0002-Vendor-SSE-implementation-from-https-github.com-c.patch (text/x-patch; charset=US-ASCII)
From 57952d1f89f0c3a4a2d28399344e9335f8bee72b Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 15:27:16 +0700
Subject: [PATCH v4 2/5] Vendor SSE implementation from
https://github.com/corsix/fast-crc32/
---
src/port/pg_crc32c_sse42.c | 77 ++++++++++++++++++++++++++++++++++++++
1 file changed, 77 insertions(+)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df3..6cc39de175 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -68,3 +68,80 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
return crc;
}
+
+/* Generated by https://github.com/corsix/fast-crc32/ using: */
+/* ./generate -i sse -p crc32c -a v4 */
+/* MIT licensed */
+
+#include <stddef.h>
+#include <stdint.h>
+#include <nmmintrin.h>
+#include <wmmintrin.h>
+
+#if defined(_MSC_VER)
+#define CRC_AINLINE static __forceinline
+#define CRC_ALIGN(n) __declspec(align(n))
+#else
+#define CRC_AINLINE static __inline __attribute__((always_inline))
+#define CRC_ALIGN(n) __attribute__((aligned(n)))
+#endif
+#define CRC_EXPORT extern
+
+#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
+#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
+
+CRC_EXPORT uint32_t crc32_impl(uint32_t crc0, const char* buf, size_t len) {
+ crc0 = ~crc0;
+ for (; len && ((uintptr_t)buf & 7); --len) {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+ if (((uintptr_t)buf & 8) && len >= 8) {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t*)buf);
+ buf += 8;
+ len -= 8;
+ }
+ if (len >= 64) {
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i*)buf), y0;
+ __m128i x1 = _mm_loadu_si128((const __m128i*)(buf + 16)), y1;
+ __m128i x2 = _mm_loadu_si128((const __m128i*)(buf + 32)), y2;
+ __m128i x3 = _mm_loadu_si128((const __m128i*)(buf + 48)), y3;
+ __m128i k;
+ k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 64;
+ len -= 64;
+ /* Main loop. */
+ while (len >= 64) {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i*)buf)), x0 = _mm_xor_si128(x0, y0);
+ y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i*)(buf + 16))), x1 = _mm_xor_si128(x1, y1);
+ y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i*)(buf + 32))), x2 = _mm_xor_si128(x2, y2);
+ y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i*)(buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ buf += 64;
+ len -= 64;
+ }
+ /* Reduce x0 ... x3 to just x0. */
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
+ y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
+ k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ }
+ for (; len >= 8; buf += 8, len -= 8) {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t*)buf);
+ }
+ for (; len; --len) {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+ return ~crc0;
+}
--
2.48.1
v4-0005-Improve-CRC32C-performance-on-x86_64.patch (text/x-patch; charset=US-ASCII)
From acb63cddd8c8220db97ae0b012bf4f2fb5174e8a Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 17:07:49 +0700
Subject: [PATCH v4 5/5] Improve CRC32C performance on x86_64
The current SSE4.2 implementation of CRC32C relies on the native
CRC32 instruction, which operates on 8 bytes at a time. We can get a
substantial speedup on longer inputs by using carryless multiplication
on SIMD registers, processing 64 bytes per loop iteration.
The PCLMULQDQ instruction has been widely available since 2011 (almost
as old as SSE 4.2), so this commit now requires that, as well as SSE
4.2, to build pg_crc32c_sse42.c.
The MIT-licensed implementation was generated with the "generate"
program from
https://github.com/corsix/fast-crc32/
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009
Author: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Author: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
---
config/c-compiler.m4 | 7 ++++++-
configure | 7 ++++++-
meson.build | 7 +++++--
src/port/pg_crc32c_sse42.c | 4 ++++
src/port/pg_crc32c_sse42_choose.c | 9 ++++++---
5 files changed, 27 insertions(+), 7 deletions(-)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 8534cc54c1..8b255b5cc8 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -557,14 +557,19 @@ AC_DEFUN([PGAC_SSE42_CRC32_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_sse42_crc32_intrinsics])])dnl
AC_CACHE_CHECK([for _mm_crc32_u8 and _mm_crc32_u32], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <nmmintrin.h>
+ #include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
- __attribute__((target("sse4.2")))
+ __attribute__((target("sse4.2,pclmul")))
#endif
static int crc32_sse42_test(void)
+
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00); // pclmul
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}],
diff --git a/configure b/configure
index 0ffcaeb436..3f2a2a515e 100755
--- a/configure
+++ b/configure
@@ -17059,14 +17059,19 @@ else
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h. */
#include <nmmintrin.h>
+ #include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
- __attribute__((target("sse4.2")))
+ __attribute__((target("sse4.2,pclmul")))
#endif
static int crc32_sse42_test(void)
+
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00);
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}
diff --git a/meson.build b/meson.build
index 1ceadb9a83..456c3fafc3 100644
--- a/meson.build
+++ b/meson.build
@@ -2227,15 +2227,18 @@ if host_cpu == 'x86' or host_cpu == 'x86_64'
prog = '''
#include <nmmintrin.h>
-
+#include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
-__attribute__((target("sse4.2")))
+__attribute__((target("sse4.2,pclmul")))
#endif
int main(void)
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00); // pclmul
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 7250eccf6b..05b11b47cb 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -3,6 +3,10 @@
* pg_crc32c_sse42.c
* Compute CRC-32C checksum using Intel SSE 4.2 instructions.
*
+ * For longer inputs, we use carryless multiplication on SIMD registers,
+ * based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
+ * Instruction" V. Gopal, E. Ozturk, et al., 2009
+ *
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 65dbc4d424..95cfe63493 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -31,7 +31,7 @@
#include "port/pg_crc32c.h"
static bool
-pg_crc32c_sse42_available(void)
+pg_crc32c_sse42_pclmul_available(void)
{
unsigned int exx[4] = {0, 0, 0, 0};
@@ -43,7 +43,10 @@ pg_crc32c_sse42_available(void)
#error cpuid instruction not available
#endif
- return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
+ bool sse42 = (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
+ bool pclmul = (exx[2] & (1 << 1)) != 0; /* PCLMULQDQ */
+
+ return sse42 && pclmul;
}
/*
@@ -53,7 +56,7 @@ pg_crc32c_sse42_available(void)
static pg_crc32c
pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
{
- if (pg_crc32c_sse42_available())
+ if (pg_crc32c_sse42_pclmul_available())
pg_comp_crc32c = pg_comp_crc32c_sse42;
else
pg_comp_crc32c = pg_comp_crc32c_sb8;
--
2.48.1
v4-0004-Run-pgindent-XXX-Some-lines-are-still-really-long.patch (text/x-patch; charset=US-ASCII)
From a09e918bab5b6aac134c28bebd4b6f60ed05bfc9 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 16:03:52 +0700
Subject: [PATCH v4 4/5] Run pgindent XXX Some lines are still really long
---
src/port/pg_crc32c_sse42.c | 95 +++++++++++++++++++++-----------------
1 file changed, 53 insertions(+), 42 deletions(-)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 3395617301..7250eccf6b 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -79,49 +79,60 @@ pg_comp_crc32c_sse42_tail(pg_crc32c crc, const void *data, size_t len)
pg_attribute_target("sse4.2,pclmul")
pg_crc32c
-pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t length) {
+pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t length)
+{
/* adjust names to match generated code */
- pg_crc32c crc0 = crc;
- size_t len = length;
+ pg_crc32c crc0 = crc;
+ size_t len = length;
const unsigned char *buf = data;
- if (len >= 128) {
- /* First vector chunk. */
- __m128i x0 = _mm_loadu_si128((const __m128i*)buf), y0;
- __m128i x1 = _mm_loadu_si128((const __m128i*)(buf + 16)), y1;
- __m128i x2 = _mm_loadu_si128((const __m128i*)(buf + 32)), y2;
- __m128i x3 = _mm_loadu_si128((const __m128i*)(buf + 48)), y3;
- __m128i k;
- k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
- x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
- buf += 64;
- len -= 64;
- /* Main loop. */
- while (len >= 64) {
- y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
- y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
- y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
- y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
- y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i*)buf)), x0 = _mm_xor_si128(x0, y0);
- y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i*)(buf + 16))), x1 = _mm_xor_si128(x1, y1);
- y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i*)(buf + 32))), x2 = _mm_xor_si128(x2, y2);
- y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i*)(buf + 48))), x3 = _mm_xor_si128(x3, y3);
- buf += 64;
- len -= 64;
- }
- /* Reduce x0 ... x3 to just x0. */
- k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
- y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
- y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
- y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
- y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
- k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
- y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
- y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
- /* Reduce 128 bits to 32 bits, and multiply by x^32. */
- crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
- crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
- }
-
- return pg_comp_crc32c_sse42_tail(crc0, buf, len);
+ if (len >= 128)
+ {
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i *) buf),
+ y0;
+ __m128i x1 = _mm_loadu_si128((const __m128i *) (buf + 16)),
+ y1;
+ __m128i x2 = _mm_loadu_si128((const __m128i *) (buf + 32)),
+ y2;
+ __m128i x3 = _mm_loadu_si128((const __m128i *) (buf + 48)),
+ y3;
+ __m128i k;
+
+ k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 64;
+ len -= 64;
+
+ /* Main loop. */
+ while (len >= 64)
+ {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i *) buf)), x0 = _mm_xor_si128(x0, y0);
+ y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i *) (buf + 16))), x1 = _mm_xor_si128(x1, y1);
+ y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i *) (buf + 32))), x2 = _mm_xor_si128(x2, y2);
+ y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i *) (buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ buf += 64;
+ len -= 64;
+ }
+
+ /* Reduce x0 ... x3 to just x0. */
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
+ y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
+ k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ }
+
+ return pg_comp_crc32c_sse42_tail(crc0, buf, len);
}
--
2.48.1
v4-0003-Adjust-previous-commit-to-match-our-style-add-128.patch (text/x-patch; charset=US-ASCII)
From 543752f816e3f9f0e312dac2be14fabb7c56101e Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 15:27:27 +0700
Subject: [PATCH v4 3/5] Adjust previous commit to match our style, add
128-byte threshold
---
src/port/pg_crc32c_sse42.c | 48 +++++++++++---------------------------
1 file changed, 14 insertions(+), 34 deletions(-)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 6cc39de175..3395617301 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,13 +15,14 @@
#include "c.h"
#include <nmmintrin.h>
+#include <wmmintrin.h>
#include "port/pg_crc32c.h"
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
-pg_crc32c
-pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
+static pg_crc32c
+pg_comp_crc32c_sse42_tail(pg_crc32c crc, const void *data, size_t len)
{
const unsigned char *p = data;
const unsigned char *pend = p + len;
@@ -73,34 +74,18 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
/* ./generate -i sse -p crc32c -a v4 */
/* MIT licensed */
-#include <stddef.h>
-#include <stdint.h>
-#include <nmmintrin.h>
-#include <wmmintrin.h>
-
-#if defined(_MSC_VER)
-#define CRC_AINLINE static __forceinline
-#define CRC_ALIGN(n) __declspec(align(n))
-#else
-#define CRC_AINLINE static __inline __attribute__((always_inline))
-#define CRC_ALIGN(n) __attribute__((aligned(n)))
-#endif
-#define CRC_EXPORT extern
-
#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
-CRC_EXPORT uint32_t crc32_impl(uint32_t crc0, const char* buf, size_t len) {
- crc0 = ~crc0;
- for (; len && ((uintptr_t)buf & 7); --len) {
- crc0 = _mm_crc32_u8(crc0, *buf++);
- }
- if (((uintptr_t)buf & 8) && len >= 8) {
- crc0 = _mm_crc32_u64(crc0, *(const uint64_t*)buf);
- buf += 8;
- len -= 8;
- }
- if (len >= 64) {
+pg_attribute_target("sse4.2,pclmul")
+pg_crc32c
+pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t length) {
+ /* adjust names to match generated code */
+ pg_crc32c crc0 = crc;
+ size_t len = length;
+ const unsigned char *buf = data;
+
+ if (len >= 128) {
/* First vector chunk. */
__m128i x0 = _mm_loadu_si128((const __m128i*)buf), y0;
__m128i x1 = _mm_loadu_si128((const __m128i*)(buf + 16)), y1;
@@ -137,11 +122,6 @@ CRC_EXPORT uint32_t crc32_impl(uint32_t crc0, const char* buf, size_t len) {
crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
}
- for (; len >= 8; buf += 8, len -= 8) {
- crc0 = _mm_crc32_u64(crc0, *(const uint64_t*)buf);
- }
- for (; len; --len) {
- crc0 = _mm_crc32_u8(crc0, *buf++);
- }
- return ~crc0;
+
+ return pg_comp_crc32c_sse42_tail(crc0, buf, len);
}
--
2.48.1
v4-0001-Add-a-Postgres-SQL-function-for-crc32c-benchmarki.patch (text/x-patch; charset=US-ASCII)
From 3a27b748ec17feff4547d7ab2689d80ba6d55665 Mon Sep 17 00:00:00 2001
From: Paul Amonson <paul.d.amonson@intel.com>
Date: Mon, 6 May 2024 08:34:17 -0700
Subject: [PATCH v4 1/5] Add a Postgres SQL function for crc32c benchmarking
Add a drive_crc32c() function to use for benchmarking crc32c
computation. The function takes 2 arguments:
(1) count: num of times CRC32C is computed in a loop.
(2) num: #bytes in the buffer to calculate crc over.
XXX not for commit
Extracted from a patch by Raghuveer Devulapalli
---
contrib/meson.build | 1 +
contrib/test_crc32c/Makefile | 20 +++++++
contrib/test_crc32c/expected/test_crc32c.out | 57 ++++++++++++++++++++
contrib/test_crc32c/meson.build | 34 ++++++++++++
contrib/test_crc32c/sql/test_crc32c.sql | 3 ++
contrib/test_crc32c/test_crc32c--1.0.sql | 1 +
contrib/test_crc32c/test_crc32c.c | 47 ++++++++++++++++
contrib/test_crc32c/test_crc32c.control | 4 ++
8 files changed, 167 insertions(+)
create mode 100644 contrib/test_crc32c/Makefile
create mode 100644 contrib/test_crc32c/expected/test_crc32c.out
create mode 100644 contrib/test_crc32c/meson.build
create mode 100644 contrib/test_crc32c/sql/test_crc32c.sql
create mode 100644 contrib/test_crc32c/test_crc32c--1.0.sql
create mode 100644 contrib/test_crc32c/test_crc32c.c
create mode 100644 contrib/test_crc32c/test_crc32c.control
diff --git a/contrib/meson.build b/contrib/meson.build
index 1ba73ebd67..06673db062 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('test_crc32c')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/test_crc32c/Makefile b/contrib/test_crc32c/Makefile
new file mode 100644
index 0000000000..5b747c6184
--- /dev/null
+++ b/contrib/test_crc32c/Makefile
@@ -0,0 +1,20 @@
+MODULE_big = test_crc32c
+OBJS = test_crc32c.o
+PGFILEDESC = "test"
+EXTENSION = test_crc32c
+DATA = test_crc32c--1.0.sql
+
+first: all
+
+# test_crc32c.o: CFLAGS+=-g
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/test_crc32c
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/test_crc32c/expected/test_crc32c.out b/contrib/test_crc32c/expected/test_crc32c.out
new file mode 100644
index 0000000000..dff6bb3133
--- /dev/null
+++ b/contrib/test_crc32c/expected/test_crc32c.out
@@ -0,0 +1,57 @@
+CREATE EXTENSION test_crc32c;
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
+ drive_crc32c
+--------------
+ 532139994
+ 2103623867
+ 785984197
+ 2686825890
+ 3213049059
+ 3819630168
+ 1389234603
+ 534072900
+ 2930108140
+ 2496889855
+ 1475239611
+ 136366931
+ 3067402116
+ 2012717871
+ 3682416023
+ 2054270645
+ 1817339875
+ 4100939569
+ 1192727539
+ 3636976218
+ 369764421
+ 3161609879
+ 1067984880
+ 1235066769
+ 3138425899
+ 648132037
+ 4203750233
+ 1330187888
+ 2683521348
+ 1951644495
+ 2574090107
+ 3904902018
+ 3772697795
+ 1644686344
+ 2868962106
+ 3369218491
+ 3902689890
+ 3456411865
+ 141004025
+ 1504497996
+ 3782655204
+ 3544797610
+ 3429174879
+ 2524728016
+ 3935861181
+ 25498897
+ 692684159
+ 345705535
+ 2761600287
+ 2654632420
+ 3945991399
+(51 rows)
+
diff --git a/contrib/test_crc32c/meson.build b/contrib/test_crc32c/meson.build
new file mode 100644
index 0000000000..d7bec4ba1c
--- /dev/null
+++ b/contrib/test_crc32c/meson.build
@@ -0,0 +1,34 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+test_crc32c_sources = files(
+ 'test_crc32c.c',
+)
+
+if host_system == 'windows'
+ test_crc32c_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_crc32c',
+ '--FILEDESC', 'test_crc32c - test code for crc32c library',])
+endif
+
+test_crc32c = shared_module('test_crc32c',
+ test_crc32c_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += test_crc32c
+
+install_data(
+ 'test_crc32c.control',
+ 'test_crc32c--1.0.sql',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'test_crc32c',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'test_crc32c',
+ ],
+ },
+}
diff --git a/contrib/test_crc32c/sql/test_crc32c.sql b/contrib/test_crc32c/sql/test_crc32c.sql
new file mode 100644
index 0000000000..95c6dfe448
--- /dev/null
+++ b/contrib/test_crc32c/sql/test_crc32c.sql
@@ -0,0 +1,3 @@
+CREATE EXTENSION test_crc32c;
+
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
diff --git a/contrib/test_crc32c/test_crc32c--1.0.sql b/contrib/test_crc32c/test_crc32c--1.0.sql
new file mode 100644
index 0000000000..52b9772f90
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c--1.0.sql
@@ -0,0 +1 @@
+CREATE FUNCTION drive_crc32c (count int, num int) RETURNS bigint AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
new file mode 100644
index 0000000000..b350caf5ce
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -0,0 +1,47 @@
+/* select drive_crc32c(1000000, 1024); */
+
+#include "postgres.h"
+#include "fmgr.h"
+#include "port/pg_crc32c.h"
+#include "common/pg_prng.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * drive_crc32c(count: int, num: int) returns bigint
+ *
+ * count is the number of loops to perform
+ *
+ * num is the number of bytes in the buffer to calculate
+ * crc32c over.
+ */
+PG_FUNCTION_INFO_V1(drive_crc32c);
+Datum
+drive_crc32c(PG_FUNCTION_ARGS)
+{
+ int64 count = PG_GETARG_INT32(0);	/* SQL signature declares int */
+ int64 num = PG_GETARG_INT32(1);	/* SQL signature declares int */
+ char* data = malloc((size_t)num);
+ pg_crc32c crc;
+ pg_prng_state state;
+ uint64 seed = 42;
+ pg_prng_seed(&state, seed);
+ /* set random data */
+ for (uint64 i = 0; i < num; i++)
+ {
+ data[i] = pg_prng_uint32(&state) % 255;
+ }
+
+ INIT_CRC32C(crc);
+
+ while(count--)
+ {
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, data, num);
+ FIN_CRC32C(crc);
+ }
+
+ free((void *)data);
+
+ PG_RETURN_INT64((int64_t)crc);
+}
diff --git a/contrib/test_crc32c/test_crc32c.control b/contrib/test_crc32c/test_crc32c.control
new file mode 100644
index 0000000000..878a077ee1
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.control
@@ -0,0 +1,4 @@
+comment = 'test'
+default_version = '1.0'
+module_pathname = '$libdir/test_crc32c'
+relocatable = true
--
2.48.1
Hi,
2. Unfortunately, there is another wrinkle that I failed to consider: If you search
the web for "VirtualBox pclmulqdq" you can see a few reports from not very long
ago that some hypervisors don't enable the CPUID for pclmul. I don't know how
big a problem that is in practice today, but it seems we should actually have
separate checks, with fallback. Sorry I didn't think of this earlier.
If someone is using a VM that doesn't support a 15-year-old feature, then I would argue performance is not
the top priority for them. But that's up to you. I will work on adding it unless you change your mind.
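For reference, a minimal sketch of what "separate checks, with fallback" could look like. It assumes GCC or Clang on x86 (for <cpuid.h> and __get_cpuid); the enum, the macro names and both function names are hypothetical and not taken from the patch. The bit positions are the ones used in the v4 choose code: CPUID leaf 1, ECX bit 20 for SSE 4.2 and ECX bit 1 for PCLMULQDQ.

```c
#include <stdint.h>

/* Hypothetical names for the three candidate implementations. */
typedef enum
{
	CRC_IMPL_SB8,				/* portable slicing-by-8 fallback */
	CRC_IMPL_SSE42,				/* crc32 instruction only */
	CRC_IMPL_SSE42_PCLMUL		/* crc32 plus carryless multiply */
} crc_impl;

#define CPUID_ECX_PCLMUL	(1u << 1)
#define CPUID_ECX_SSE42		(1u << 20)

/* Decide from the leaf-1 ECX feature bits; each feature is tested on its own. */
static crc_impl
choose_crc_impl(uint32_t ecx)
{
	if ((ecx & CPUID_ECX_SSE42) == 0)
		return CRC_IMPL_SB8;
	if ((ecx & CPUID_ECX_PCLMUL) == 0)
		return CRC_IMPL_SSE42;	/* e.g. a hypervisor hiding pclmul */
	return CRC_IMPL_SSE42_PCLMUL;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
	(defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Read ECX of CPUID leaf 1; returns 0 if the leaf is unavailable. */
static uint32_t
read_cpuid_leaf1_ecx(void)
{
	unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return 0;
	return ecx;
}
#else
/* Assumption for this sketch: on other platforms report no features. */
static uint32_t
read_cpuid_leaf1_ecx(void)
{
	return 0;
}
#endif
```

With this shape, a guest that advertises SSE 4.2 but not PCLMULQDQ would still get the existing crc32-instruction path rather than dropping all the way to the portable code.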
Also, do we really need to support both USE_SSE42_CRC32C and USE_SSE42_CRC32C_WITH_RUNTIME_CHECK?
The former macro is used to enable running the SSE42 version without a runtime check
when someone builds with -msse4.2. The code looks fine now, but the runtime dispatch rules will get complicated
as we add the PCLMUL and AVX512 dispatch in the future. IMO, this additional complexity is not worth it.
The cpuid runtime dispatch runs just once, when the postgres server is first set up, and would hardly affect performance.
Let me know what you think.
Raghuveer
On Wed, Feb 12, 2025 at 09:02:27PM +0000, Devulapalli, Raghuveer wrote:
Also, do we really need to have both USE_SSE42_CRC32C and USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
features support? The former macro is used to enable running the SSE42 version without a runtime check
when someone builds with -msse4.2. The code looks fine now, but the runtime dispatch rules get complicated
as we add the PCLMUL and AVX512 dispatch in the future. IMO, this additional complexity is not worth it.
The cpuid runtime dispatch runs just once when postgres server is first setup and would hardly affect performance.
Let me know what you think.
I think the idea behind USE_SSE42_CRC32C is to avoid the function pointer
overhead if possible. I looked at switching to always using runtime checks
for this stuff, and we concluded that we'd better not [0].
[0]: /messages/by-id/flat/20231030161706.GA3011@nathanxps13
--
nathan
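As background for the function pointer overhead mentioned above, here is a minimal sketch of the self-replacing dispatch pattern, under the assumption that a "choose" function is installed as the initial target and swaps itself out on the first call. All names (crc_fn, crc_impl_ptr, crc_choose, crc_portable, COMP_CRC) are hypothetical, and the portable routine is a simple bitwise CRC-32C rather than PostgreSQL's slicing-by-8 code.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*crc_fn) (uint32_t crc, const void *data, size_t len);

static uint32_t crc_choose(uint32_t crc, const void *data, size_t len);

/* Every call goes through this pointer; it starts out at the chooser. */
static crc_fn crc_impl_ptr = crc_choose;
static int	crc_choose_calls = 0;	/* instrumentation for the sketch only */

/* Bitwise CRC-32C update over a running (already inverted) state. */
static uint32_t
crc_portable(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
	{
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* First call lands here, picks an implementation, then forwards. */
static uint32_t
crc_choose(uint32_t crc, const void *data, size_t len)
{
	crc_choose_calls++;
	/* a real build would consult CPUID here; the sketch always falls back */
	crc_impl_ptr = crc_portable;
	return crc_impl_ptr(crc, data, len);
}

/* Indirect call: this is the overhead a compile-time choice avoids. */
#define COMP_CRC(crc, data, len) \
	((crc) = crc_impl_ptr((crc), (data), (len)))
```

The CPU check itself runs only once, but every later CRC computation still pays for an indirect call, which is the cost a build that fixes the implementation at compile time can skip.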
I think the idea behind USE_SSE42_CRC32C is to avoid the function pointer
overhead if possible. I looked at switching to always using runtime checks for this
stuff, and we concluded that we'd better not [0].
Does that mean we want this feature for the new PCLMUL (and AVX-512) crc32c implementations too? The code for that will look a little ugly; I might need to think about a cleaner way to do this.
Raghuveer
On Wed, Feb 12, 2025 at 09:48:57PM +0000, Devulapalli, Raghuveer wrote:
I think the idea behind USE_SSE42_CRC32C is to avoid the function pointer
overhead if possible. I looked at switching to always using runtime checks for this
stuff, and we concluded that we'd better not [0].
Does that mean we want this feature for the new PCLMUL (and AVX-512) crc32c implementations too? The code for that will look a little ugly; I might need to think about a cleaner way to do this.
Well, I suspect the AVX-512 version will pretty much always need the
runtime check given that it's not available on a lot of newer hardware and
requires a bunch of extra runtime checks (see pg_popcount_avx512.c). But
it might be worth doing for PCLMUL. Otherwise, I think we'd have to leave
out the PCLMUL optimizations if built with -msse4.2 -mpclmul because we
don't want to regress existing -msse4.2 users with a runtime check.
--
nathan
Well, I suspect the AVX-512 version will pretty much always need the runtime
check given that it's not available on a lot of newer hardware and requires a
bunch of extra runtime checks (see pg_popcount_avx512.c). But it might be
worth doing for PCLMUL. Otherwise, I think we'd have to leave out the PCLMUL
optimizations if built with -msse4.2 -mpclmul because we don't want to regress
existing -msse4.2 users with a runtime check.
Sounds good to me. Although, users building with just -msse4.2 will now encounter
an additional pclmul runtime check. That would be a regression unless they update to
building with both -msse4.2 and -mpclmul.
Raghuveer
On Wed, Feb 12, 2025 at 10:12:20PM +0000, Devulapalli, Raghuveer wrote:
Well, I suspect the AVX-512 version will pretty much always need the runtime
check given that it's not available on a lot of newer hardware and requires a
bunch of extra runtime checks (see pg_popcount_avx512.c). But it might be
worth doing for PCLMUL. Otherwise, I think we'd have to leave out the PCLMUL
optimizations if built with -msse4.2 -mpclmul because we don't want to regress
existing -msse4.2 users with a runtime check.
Sounds good to me. Although, users building with just -msse4.2 will now encounter
an additional pclmul runtime check. That would be a regression unless they update to
building with both -msse4.2 and -mpclmul.
My thinking was that building with just -msse4.2 would cause the existing
SSE 4.2 implementation to be used (without the function pointer). That's
admittedly a bit goofy because they'd miss out on the PCLMUL optimization,
but things at least don't get any worse for them.
--
nathan
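The build-time selection discussed here can be sketched with the macros that GCC and Clang predefine when -msse4.2 and -mpclmul are in effect (__SSE4_2__ and __PCLMUL__). The enum and function names below are hypothetical and are not the symbols PostgreSQL's configure scripts define. The sketch only shows which of the three cases a given set of compiler flags would land in.

```c
/* Which dispatch strategy a build with the current compiler flags would get. */
enum crc_build_mode
{
	CRC_DIRECT_PCLMUL,			/* -msse4.2 -mpclmul: call PCLMUL version directly */
	CRC_DIRECT_SSE42,			/* -msse4.2 only: existing SSE 4.2 version, direct call */
	CRC_RUNTIME_CHECK			/* neither guaranteed: choose at run time */
};

static enum crc_build_mode
crc_build_mode(void)
{
#if defined(__SSE4_2__) && defined(__PCLMUL__)
	return CRC_DIRECT_PCLMUL;
#elif defined(__SSE4_2__)
	return CRC_DIRECT_SSE42;
#else
	return CRC_RUNTIME_CHECK;
#endif
}

static const char *
crc_build_mode_name(enum crc_build_mode mode)
{
	switch (mode)
	{
		case CRC_DIRECT_PCLMUL:
			return "direct call, SSE 4.2 + PCLMUL";
		case CRC_DIRECT_SSE42:
			return "direct call, SSE 4.2 only";
		case CRC_RUNTIME_CHECK:
			return "runtime check";
	}
	return "unknown";
}
```

Under this scheme a build with only -msse4.2 keeps its direct call and is no worse off than today. It would only gain the PCLMUL path by adding -mpclmul.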
Sounds good to me. Although, users building with just -msse4.2 will
now encounter an additional pclmul runtime check. That would be a
regression unless they update to building with both -msse4.2 and -mpclmul.
My thinking was that building with just -msse4.2 would cause the existing SSE 4.2
implementation to be used (without the function pointer). That's admittedly a bit
goofy because they'd miss out on the PCLMUL optimization, but things at least
don't get any worse for them.
Right. We are only talking about a regression for a small portion of people who build
with -msse4.2 and run on Nehalem/VM with pclmul disabled where we will run
the cpuid check for pclmul and still pick the sse4.2 version.
Raghuveer
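For reference, the cpuid check mentioned above could look roughly like the following. It is a sketch for GCC/Clang using __get_cpuid from <cpuid.h>; the struct and function names are hypothetical. The bit positions (ECX bit 20 for SSE 4.2, ECX bit 1 for PCLMULQDQ in leaf 1) are the same ones tested in the v5 patch. On non-x86 targets the probe simply reports both features as absent.

```c
#include <stdbool.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

typedef struct
{
	bool		sse42;
	bool		pclmul;
} x86_crc_features;

/* Query CPUID leaf 1 and extract the two feature bits of interest. */
static x86_crc_features
detect_x86_crc_features(void)
{
	x86_crc_features f = {false, false};

#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
	{
		f.sse42 = (ecx & (1u << 20)) != 0;	/* SSE 4.2 */
		f.pclmul = (ecx & (1u << 1)) != 0;	/* PCLMULQDQ */
	}
#endif
	return f;
}

/* Selection as described in the message above: no PCLMUL means plain SSE 4.2. */
static const char *
pick_crc_impl(x86_crc_features f)
{
	if (f.sse42 && f.pclmul)
		return "sse42+pclmul";
	if (f.sse42)
		return "sse42";
	return "sb8";				/* slicing-by-8 software fallback */
}
```

A Nehalem-class CPU, or a VM with pclmul masked, reports sse42 without pclmul and so ends up on the "sse42" branch after paying for the probe once.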
On Thu, Feb 13, 2025 at 4:18 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
I think the idea behind USE_SSE42_CRC32C is to avoid the function pointer
overhead if possible. I looked at switching to always using runtime checks
for this stuff, and we concluded that we'd better not [0].
For short lengths, I tried unrolling the loop into a switch statement,
as in the attached v5-0006 (the other new patches are fixes for CI).
That usually looks faster for me, but not on the length used under the
WAL insert lock. Usual caveat: Using small fixed-sized lengths in
benchmarks can be misleading, because branches are more easily
predicted.
It seems like for always using runtime checks we'd need to use
branching, rather than function pointers, as has been proposed
elsewhere.
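As a rough, portable illustration of the switch-based unrolling (the real change is in the attached v5-0006), the sketch below replaces the 8-bytes-at-a-time loop with a fallthrough switch on the number of full words. Assumptions: crc32c_word is a software stand-in for _mm_crc32_u64, TAIL_LIMIT is an arbitrary small bound chosen for the example, and all function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One byte of bitwise CRC-32C (reflected polynomial 0x82F63B78). */
static uint32_t
crc32c_byte(uint32_t crc, unsigned char b)
{
	crc ^= b;
	for (int i = 0; i < 8; i++)
		crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	return crc;
}

/* Software stand-in for _mm_crc32_u64 on the 8 bytes starting at p. */
static uint32_t
crc32c_word(uint32_t crc, const unsigned char *p)
{
	for (int i = 0; i < 8; i++)
		crc = crc32c_byte(crc, p[i]);
	return crc;
}

/* Reference: the plain loop. */
static uint32_t
crc32c_tail_loop(uint32_t crc, const unsigned char *p, size_t len)
{
	const unsigned char *pend = p + len;

	while (pend - p >= 8)
	{
		crc = crc32c_word(crc, p);
		p += 8;
	}
	while (p < pend)
		crc = crc32c_byte(crc, *p++);
	return crc;
}

#define TAIL_LIMIT 32			/* hypothetical bound: at most 3 full words */

/* Unrolled variant: jump into a fallthrough chain by word count. */
static uint32_t
crc32c_tail_switch(uint32_t crc, const unsigned char *data, size_t len)
{
	const unsigned char *pend = data + len;
	const unsigned char *p = pend - len % 8;	/* end of last full word */

	if (len >= TAIL_LIMIT)
		return crc32c_tail_loop(crc, data, len);

	switch (len / 8)
	{
		case 3: crc = crc32c_word(crc, p - 3 * 8);	/* FALLTHROUGH */
		case 2: crc = crc32c_word(crc, p - 2 * 8);	/* FALLTHROUGH */
		case 1: crc = crc32c_word(crc, p - 1 * 8);	/* FALLTHROUGH */
		case 0: break;
	}
	while (p < pend)
		crc = crc32c_byte(crc, *p++);
	return crc;
}
```

The switch trades a loop-carried branch for a single indirect jump. As noted above, fixed-length benchmarks can flatter either form because the branch predictor learns the pattern.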
master:
20
latency average = 3.622 ms
latency average = 3.573 ms
latency average = 3.599 ms
64
latency average = 7.791 ms
latency average = 7.920 ms
latency average = 7.888 ms
80
latency average = 8.076 ms
latency average = 8.140 ms
latency average = 8.150 ms
96
latency average = 8.853 ms
latency average = 8.897 ms
latency average = 8.914 ms
112
latency average = 9.867 ms
latency average = 9.825 ms
latency average = 9.869 ms
v5:
20
latency average = 4.550 ms
latency average = 4.327 ms
latency average = 4.320 ms
64
latency average = 5.064 ms
latency average = 4.934 ms
latency average = 5.020 ms
80
latency average = 4.904 ms
latency average = 4.786 ms
latency average = 4.942 ms
96
latency average = 5.392 ms
latency average = 5.376 ms
latency average = 5.367 ms
112
latency average = 5.730 ms
latency average = 5.859 ms
latency average = 5.734 ms
--
John Naylor
Amazon Web Services
Attachments:
v5-0005-Improve-CRC32C-performance-on-x86_64.patchapplication/x-patch; name=v5-0005-Improve-CRC32C-performance-on-x86_64.patchDownload
From acb63cddd8c8220db97ae0b012bf4f2fb5174e8a Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 17:07:49 +0700
Subject: [PATCH v5 5/8] Improve CRC32C performance on x86_64
The current SSE4.2 implementation of CRC32C relies on the native
CRC32 instruction, which operates on 8 bytes at a time. We can get a
substantial speedup on longer inputs by using carryless multiplication
on SIMD registers, processing 64 bytes per loop iteration.
The PCLMULQDQ instruction has been widely available since 2011 (almost
as old as SSE 4.2), so this commit now requires that, as well as SSE
4.2, to build pg_crc32c_sse42.c.
The MIT-licensed implementation was generated with the "generate"
program from
https://github.com/corsix/fast-crc32/
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009
Author: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Author: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
---
config/c-compiler.m4 | 7 ++++++-
configure | 7 ++++++-
meson.build | 7 +++++--
src/port/pg_crc32c_sse42.c | 4 ++++
src/port/pg_crc32c_sse42_choose.c | 9 ++++++---
5 files changed, 27 insertions(+), 7 deletions(-)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 8534cc54c1..8b255b5cc8 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -557,14 +557,19 @@ AC_DEFUN([PGAC_SSE42_CRC32_INTRINSICS],
[define([Ac_cachevar], [AS_TR_SH([pgac_cv_sse42_crc32_intrinsics])])dnl
AC_CACHE_CHECK([for _mm_crc32_u8 and _mm_crc32_u32], [Ac_cachevar],
[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <nmmintrin.h>
+ #include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
- __attribute__((target("sse4.2")))
+ __attribute__((target("sse4.2,pclmul")))
#endif
static int crc32_sse42_test(void)
+
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00); // pclmul
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}],
diff --git a/configure b/configure
index 0ffcaeb436..3f2a2a515e 100755
--- a/configure
+++ b/configure
@@ -17059,14 +17059,19 @@ else
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h. */
#include <nmmintrin.h>
+ #include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
- __attribute__((target("sse4.2")))
+ __attribute__((target("sse4.2,pclmul")))
#endif
static int crc32_sse42_test(void)
+
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00);
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}
diff --git a/meson.build b/meson.build
index 1ceadb9a83..456c3fafc3 100644
--- a/meson.build
+++ b/meson.build
@@ -2227,15 +2227,18 @@ if host_cpu == 'x86' or host_cpu == 'x86_64'
prog = '''
#include <nmmintrin.h>
-
+#include <wmmintrin.h>
#if defined(__has_attribute) && __has_attribute (target)
-__attribute__((target("sse4.2")))
+__attribute__((target("sse4.2,pclmul")))
#endif
int main(void)
{
+ __m128i x1 = _mm_set1_epi32(1);
unsigned int crc = 0;
crc = _mm_crc32_u8(crc, 0);
crc = _mm_crc32_u32(crc, 0);
+ x1 = _mm_clmulepi64_si128(x1, x1, 0x00); // pclmul
+ crc = crc + _mm_extract_epi32(x1, 1);
/* return computed value, to prevent the above being optimized away */
return crc == 0;
}
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 7250eccf6b..05b11b47cb 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -3,6 +3,10 @@
* pg_crc32c_sse42.c
* Compute CRC-32C checksum using Intel SSE 4.2 instructions.
*
+ * For longer inputs, we use carryless multiplication on SIMD registers,
+ * based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
+ * Instruction" V. Gopal, E. Ozturk, et al., 2009
+ *
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 65dbc4d424..95cfe63493 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -31,7 +31,7 @@
#include "port/pg_crc32c.h"
static bool
-pg_crc32c_sse42_available(void)
+pg_crc32c_sse42_pclmul_available(void)
{
unsigned int exx[4] = {0, 0, 0, 0};
@@ -43,7 +43,10 @@ pg_crc32c_sse42_available(void)
#error cpuid instruction not available
#endif
- return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
+ bool sse42 = (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
+ bool pclmul = (exx[2] & (1 << 1)) != 0; /* PCLMULQDQ */
+
+ return sse42 && pclmul;
}
/*
@@ -53,7 +56,7 @@ pg_crc32c_sse42_available(void)
static pg_crc32c
pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
{
- if (pg_crc32c_sse42_available())
+ if (pg_crc32c_sse42_pclmul_available())
pg_comp_crc32c = pg_comp_crc32c_sse42;
else
pg_comp_crc32c = pg_comp_crc32c_sb8;
--
2.48.1
v5-0008-Allow-dev-test-to-build-on-Windows-for-CI-XXX-not.patchapplication/x-patch; name=v5-0008-Allow-dev-test-to-build-on-Windows-for-CI-XXX-not.patchDownload
From 2c8289de7f612ac01e9bbe5cc86a39571d171925 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Thu, 13 Feb 2025 15:53:20 +0700
Subject: [PATCH v5 8/8] Allow dev test to build on Windows for CI XXX not for
commit
---
src/port/pg_crc32c_sse42_choose.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 95cfe63493..5833b92638 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -64,4 +64,4 @@ pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
return pg_comp_crc32c(crc, data, len);
}
-pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len) = pg_comp_crc32c_choose;
+PGDLLIMPORT pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len) = pg_comp_crc32c_choose;
--
2.48.1
v5-0006-Unroll-tail.patchapplication/x-patch; name=v5-0006-Unroll-tail.patchDownload
From aefccb195cfec8532d85695768fdd49faae17a46 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Thu, 13 Feb 2025 13:52:54 +0700
Subject: [PATCH v5 6/8] Unroll tail
---
src/port/pg_crc32c_sse42.c | 31 +++++++++++++++++++++++++++----
1 file changed, 27 insertions(+), 4 deletions(-)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 05b11b47cb..0fb9c16dfc 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -23,6 +23,9 @@
#include "port/pg_crc32c.h"
+#define PCLMUL_THRESHOLD 128
+#define CRC_CASE(n) do {crc = (uint32) _mm_crc32_u64(crc, *((const uint64 *) (p - (n)*sizeof(uint64_t))));} while(0)
+
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
static pg_crc32c
@@ -39,10 +42,30 @@ pg_comp_crc32c_sse42_tail(pg_crc32c crc, const void *data, size_t len)
* the begin address.
*/
#ifdef __x86_64__
- while (p + 8 <= pend)
+
+ /* set p to end of last word boundary */
+ p = pend - len % (sizeof(uint64_t));
+ Assert (len < PCLMUL_THRESHOLD);
+
+ switch (len / sizeof(uint64_t))
{
- crc = (uint32) _mm_crc32_u64(crc, *((const uint64 *) p));
- p += 8;
+ case 15: CRC_CASE(15); /* FALLTHROUGH */
+ case 14: CRC_CASE(14); /* FALLTHROUGH */
+ case 13: CRC_CASE(13); /* FALLTHROUGH */
+ case 12: CRC_CASE(12); /* FALLTHROUGH */
+ case 11: CRC_CASE(11); /* FALLTHROUGH */
+ case 10: CRC_CASE(10); /* FALLTHROUGH */
+ case 9: CRC_CASE(9); /* FALLTHROUGH */
+ case 8: CRC_CASE(8); /* FALLTHROUGH */
+ case 7: CRC_CASE(7); /* FALLTHROUGH */
+ case 6: CRC_CASE(6); /* FALLTHROUGH */
+ case 5: CRC_CASE(5); /* FALLTHROUGH */
+ case 4: CRC_CASE(4); /* FALLTHROUGH */
+ case 3: CRC_CASE(3); /* FALLTHROUGH */
+ case 2: CRC_CASE(2); /* FALLTHROUGH */
+ case 1: CRC_CASE(1); /* FALLTHROUGH */
+ case 0: break;
+ default: pg_unreachable();
}
/* Process remaining full four bytes if any */
@@ -90,7 +113,7 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t length)
size_t len = length;
const unsigned char *buf = data;
- if (len >= 128)
+ if (len >= PCLMUL_THRESHOLD)
{
/* First vector chunk. */
__m128i x0 = _mm_loadu_si128((const __m128i *) buf),
--
2.48.1
v5-0007-Fix-32-bit-build.patchapplication/x-patch; name=v5-0007-Fix-32-bit-build.patchDownload
From 2da25b18739c95384d236c783c188526e7f5f641 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Thu, 13 Feb 2025 15:40:02 +0700
Subject: [PATCH v5 7/8] Fix 32-bit build
---
src/port/pg_crc32c_sse42.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 0fb9c16dfc..fe7e8165ec 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -113,6 +113,7 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t length)
size_t len = length;
const unsigned char *buf = data;
+#if SIZEOF_VOID_P >= 8
if (len >= PCLMUL_THRESHOLD)
{
/* First vector chunk. */
@@ -160,6 +161,7 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t length)
crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
}
+#endif /* SIZEOF_VOID_P */
return pg_comp_crc32c_sse42_tail(crc0, buf, len);
}
--
2.48.1
v5-0004-Run-pgindent-XXX-Some-lines-are-still-really-long.patchapplication/x-patch; name=v5-0004-Run-pgindent-XXX-Some-lines-are-still-really-long.patchDownload
From a09e918bab5b6aac134c28bebd4b6f60ed05bfc9 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 16:03:52 +0700
Subject: [PATCH v5 4/8] Run pgindent XXX Some lines are still really long
---
src/port/pg_crc32c_sse42.c | 95 +++++++++++++++++++++-----------------
1 file changed, 53 insertions(+), 42 deletions(-)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 3395617301..7250eccf6b 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -79,49 +79,60 @@ pg_comp_crc32c_sse42_tail(pg_crc32c crc, const void *data, size_t len)
pg_attribute_target("sse4.2,pclmul")
pg_crc32c
-pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t length) {
+pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t length)
+{
/* adjust names to match generated code */
- pg_crc32c crc0 = crc;
- size_t len = length;
+ pg_crc32c crc0 = crc;
+ size_t len = length;
const unsigned char *buf = data;
- if (len >= 128) {
- /* First vector chunk. */
- __m128i x0 = _mm_loadu_si128((const __m128i*)buf), y0;
- __m128i x1 = _mm_loadu_si128((const __m128i*)(buf + 16)), y1;
- __m128i x2 = _mm_loadu_si128((const __m128i*)(buf + 32)), y2;
- __m128i x3 = _mm_loadu_si128((const __m128i*)(buf + 48)), y3;
- __m128i k;
- k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
- x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
- buf += 64;
- len -= 64;
- /* Main loop. */
- while (len >= 64) {
- y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
- y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
- y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
- y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
- y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i*)buf)), x0 = _mm_xor_si128(x0, y0);
- y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i*)(buf + 16))), x1 = _mm_xor_si128(x1, y1);
- y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i*)(buf + 32))), x2 = _mm_xor_si128(x2, y2);
- y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i*)(buf + 48))), x3 = _mm_xor_si128(x3, y3);
- buf += 64;
- len -= 64;
- }
- /* Reduce x0 ... x3 to just x0. */
- k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
- y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
- y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
- y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
- y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
- k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
- y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
- y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
- /* Reduce 128 bits to 32 bits, and multiply by x^32. */
- crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
- crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
- }
-
- return pg_comp_crc32c_sse42_tail(crc0, buf, len);
+ if (len >= 128)
+ {
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i *) buf),
+ y0;
+ __m128i x1 = _mm_loadu_si128((const __m128i *) (buf + 16)),
+ y1;
+ __m128i x2 = _mm_loadu_si128((const __m128i *) (buf + 32)),
+ y2;
+ __m128i x3 = _mm_loadu_si128((const __m128i *) (buf + 48)),
+ y3;
+ __m128i k;
+
+ k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 64;
+ len -= 64;
+
+ /* Main loop. */
+ while (len >= 64)
+ {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i *) buf)), x0 = _mm_xor_si128(x0, y0);
+ y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i *) (buf + 16))), x1 = _mm_xor_si128(x1, y1);
+ y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i *) (buf + 32))), x2 = _mm_xor_si128(x2, y2);
+ y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i *) (buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ buf += 64;
+ len -= 64;
+ }
+
+ /* Reduce x0 ... x3 to just x0. */
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
+ y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
+ k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ }
+
+ return pg_comp_crc32c_sse42_tail(crc0, buf, len);
}
--
2.48.1
v5-0002-Vendor-SSE-implementation-from-https-github.com-c.patchapplication/x-patch; name=v5-0002-Vendor-SSE-implementation-from-https-github.com-c.patchDownload
From 57952d1f89f0c3a4a2d28399344e9335f8bee72b Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 15:27:16 +0700
Subject: [PATCH v5 2/8] Vendor SSE implementation from
https://github.com/corsix/fast-crc32/
---
src/port/pg_crc32c_sse42.c | 77 ++++++++++++++++++++++++++++++++++++++
1 file changed, 77 insertions(+)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df3..6cc39de175 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -68,3 +68,80 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
return crc;
}
+
+/* Generated by https://github.com/corsix/fast-crc32/ using: */
+/* ./generate -i sse -p crc32c -a v4 */
+/* MIT licensed */
+
+#include <stddef.h>
+#include <stdint.h>
+#include <nmmintrin.h>
+#include <wmmintrin.h>
+
+#if defined(_MSC_VER)
+#define CRC_AINLINE static __forceinline
+#define CRC_ALIGN(n) __declspec(align(n))
+#else
+#define CRC_AINLINE static __inline __attribute__((always_inline))
+#define CRC_ALIGN(n) __attribute__((aligned(n)))
+#endif
+#define CRC_EXPORT extern
+
+#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
+#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
+
+CRC_EXPORT uint32_t crc32_impl(uint32_t crc0, const char* buf, size_t len) {
+ crc0 = ~crc0;
+ for (; len && ((uintptr_t)buf & 7); --len) {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+ if (((uintptr_t)buf & 8) && len >= 8) {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t*)buf);
+ buf += 8;
+ len -= 8;
+ }
+ if (len >= 64) {
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i*)buf), y0;
+ __m128i x1 = _mm_loadu_si128((const __m128i*)(buf + 16)), y1;
+ __m128i x2 = _mm_loadu_si128((const __m128i*)(buf + 32)), y2;
+ __m128i x3 = _mm_loadu_si128((const __m128i*)(buf + 48)), y3;
+ __m128i k;
+ k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 64;
+ len -= 64;
+ /* Main loop. */
+ while (len >= 64) {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i*)buf)), x0 = _mm_xor_si128(x0, y0);
+ y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i*)(buf + 16))), x1 = _mm_xor_si128(x1, y1);
+ y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i*)(buf + 32))), x2 = _mm_xor_si128(x2, y2);
+ y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i*)(buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ buf += 64;
+ len -= 64;
+ }
+ /* Reduce x0 ... x3 to just x0. */
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
+ y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
+ k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ }
+ for (; len >= 8; buf += 8, len -= 8) {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t*)buf);
+ }
+ for (; len; --len) {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+ return ~crc0;
+}
--
2.48.1
v5-0003-Adjust-previous-commit-to-match-our-style-add-128.patchapplication/x-patch; name=v5-0003-Adjust-previous-commit-to-match-our-style-add-128.patchDownload
From 543752f816e3f9f0e312dac2be14fabb7c56101e Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 15:27:27 +0700
Subject: [PATCH v5 3/8] Adjust previous commit to match our style, add
128-byte threshold
---
src/port/pg_crc32c_sse42.c | 48 +++++++++++---------------------------
1 file changed, 14 insertions(+), 34 deletions(-)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 6cc39de175..3395617301 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,13 +15,14 @@
#include "c.h"
#include <nmmintrin.h>
+#include <wmmintrin.h>
#include "port/pg_crc32c.h"
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
-pg_crc32c
-pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
+static pg_crc32c
+pg_comp_crc32c_sse42_tail(pg_crc32c crc, const void *data, size_t len)
{
const unsigned char *p = data;
const unsigned char *pend = p + len;
@@ -73,34 +74,18 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
/* ./generate -i sse -p crc32c -a v4 */
/* MIT licensed */
-#include <stddef.h>
-#include <stdint.h>
-#include <nmmintrin.h>
-#include <wmmintrin.h>
-
-#if defined(_MSC_VER)
-#define CRC_AINLINE static __forceinline
-#define CRC_ALIGN(n) __declspec(align(n))
-#else
-#define CRC_AINLINE static __inline __attribute__((always_inline))
-#define CRC_ALIGN(n) __attribute__((aligned(n)))
-#endif
-#define CRC_EXPORT extern
-
#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
-CRC_EXPORT uint32_t crc32_impl(uint32_t crc0, const char* buf, size_t len) {
- crc0 = ~crc0;
- for (; len && ((uintptr_t)buf & 7); --len) {
- crc0 = _mm_crc32_u8(crc0, *buf++);
- }
- if (((uintptr_t)buf & 8) && len >= 8) {
- crc0 = _mm_crc32_u64(crc0, *(const uint64_t*)buf);
- buf += 8;
- len -= 8;
- }
- if (len >= 64) {
+pg_attribute_target("sse4.2,pclmul")
+pg_crc32c
+pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t length) {
+ /* adjust names to match generated code */
+ pg_crc32c crc0 = crc;
+ size_t len = length;
+ const unsigned char *buf = data;
+
+ if (len >= 128) {
/* First vector chunk. */
__m128i x0 = _mm_loadu_si128((const __m128i*)buf), y0;
__m128i x1 = _mm_loadu_si128((const __m128i*)(buf + 16)), y1;
@@ -137,11 +122,6 @@ CRC_EXPORT uint32_t crc32_impl(uint32_t crc0, const char* buf, size_t len) {
crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
}
- for (; len >= 8; buf += 8, len -= 8) {
- crc0 = _mm_crc32_u64(crc0, *(const uint64_t*)buf);
- }
- for (; len; --len) {
- crc0 = _mm_crc32_u8(crc0, *buf++);
- }
- return ~crc0;
+
+ return pg_comp_crc32c_sse42_tail(crc0, buf, len);
}
--
2.48.1
v5-0001-Add-a-Postgres-SQL-function-for-crc32c-benchmarki.patchapplication/x-patch; name=v5-0001-Add-a-Postgres-SQL-function-for-crc32c-benchmarki.patchDownload
From 3a27b748ec17feff4547d7ab2689d80ba6d55665 Mon Sep 17 00:00:00 2001
From: Paul Amonson <paul.d.amonson@intel.com>
Date: Mon, 6 May 2024 08:34:17 -0700
Subject: [PATCH v5 1/8] Add a Postgres SQL function for crc32c benchmarking
Add a drive_crc32c() function to use for benchmarking crc32c
computation. The function takes 2 arguments:
(1) count: num of times CRC32C is computed in a loop.
(2) num: #bytes in the buffer to calculate crc over.
XXX not for commit
Extracted from a patch by Raghuveer Devulapalli
---
contrib/meson.build | 1 +
contrib/test_crc32c/Makefile | 20 +++++++
contrib/test_crc32c/expected/test_crc32c.out | 57 ++++++++++++++++++++
contrib/test_crc32c/meson.build | 34 ++++++++++++
contrib/test_crc32c/sql/test_crc32c.sql | 3 ++
contrib/test_crc32c/test_crc32c--1.0.sql | 1 +
contrib/test_crc32c/test_crc32c.c | 47 ++++++++++++++++
contrib/test_crc32c/test_crc32c.control | 4 ++
8 files changed, 167 insertions(+)
create mode 100644 contrib/test_crc32c/Makefile
create mode 100644 contrib/test_crc32c/expected/test_crc32c.out
create mode 100644 contrib/test_crc32c/meson.build
create mode 100644 contrib/test_crc32c/sql/test_crc32c.sql
create mode 100644 contrib/test_crc32c/test_crc32c--1.0.sql
create mode 100644 contrib/test_crc32c/test_crc32c.c
create mode 100644 contrib/test_crc32c/test_crc32c.control
diff --git a/contrib/meson.build b/contrib/meson.build
index 1ba73ebd67..06673db062 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('test_crc32c')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/test_crc32c/Makefile b/contrib/test_crc32c/Makefile
new file mode 100644
index 0000000000..5b747c6184
--- /dev/null
+++ b/contrib/test_crc32c/Makefile
@@ -0,0 +1,20 @@
+MODULE_big = test_crc32c
+OBJS = test_crc32c.o
+PGFILEDESC = "test"
+EXTENSION = test_crc32c
+DATA = test_crc32c--1.0.sql
+
+first: all
+
+# test_crc32c.o: CFLAGS+=-g
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_crc32c
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/test_crc32c/expected/test_crc32c.out b/contrib/test_crc32c/expected/test_crc32c.out
new file mode 100644
index 0000000000..dff6bb3133
--- /dev/null
+++ b/contrib/test_crc32c/expected/test_crc32c.out
@@ -0,0 +1,57 @@
+CREATE EXTENSION test_crc32c;
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
+ drive_crc32c
+--------------
+ 532139994
+ 2103623867
+ 785984197
+ 2686825890
+ 3213049059
+ 3819630168
+ 1389234603
+ 534072900
+ 2930108140
+ 2496889855
+ 1475239611
+ 136366931
+ 3067402116
+ 2012717871
+ 3682416023
+ 2054270645
+ 1817339875
+ 4100939569
+ 1192727539
+ 3636976218
+ 369764421
+ 3161609879
+ 1067984880
+ 1235066769
+ 3138425899
+ 648132037
+ 4203750233
+ 1330187888
+ 2683521348
+ 1951644495
+ 2574090107
+ 3904902018
+ 3772697795
+ 1644686344
+ 2868962106
+ 3369218491
+ 3902689890
+ 3456411865
+ 141004025
+ 1504497996
+ 3782655204
+ 3544797610
+ 3429174879
+ 2524728016
+ 3935861181
+ 25498897
+ 692684159
+ 345705535
+ 2761600287
+ 2654632420
+ 3945991399
+(51 rows)
+
diff --git a/contrib/test_crc32c/meson.build b/contrib/test_crc32c/meson.build
new file mode 100644
index 0000000000..d7bec4ba1c
--- /dev/null
+++ b/contrib/test_crc32c/meson.build
@@ -0,0 +1,34 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+test_crc32c_sources = files(
+ 'test_crc32c.c',
+)
+
+if host_system == 'windows'
+ test_crc32c_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_crc32c',
+ '--FILEDESC', 'test_crc32c - test code for crc32c library',])
+endif
+
+test_crc32c = shared_module('test_crc32c',
+ test_crc32c_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += test_crc32c
+
+install_data(
+ 'test_crc32c.control',
+ 'test_crc32c--1.0.sql',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'test_crc32c',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'test_crc32c',
+ ],
+ },
+}
diff --git a/contrib/test_crc32c/sql/test_crc32c.sql b/contrib/test_crc32c/sql/test_crc32c.sql
new file mode 100644
index 0000000000..95c6dfe448
--- /dev/null
+++ b/contrib/test_crc32c/sql/test_crc32c.sql
@@ -0,0 +1,3 @@
+CREATE EXTENSION test_crc32c;
+
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
diff --git a/contrib/test_crc32c/test_crc32c--1.0.sql b/contrib/test_crc32c/test_crc32c--1.0.sql
new file mode 100644
index 0000000000..52b9772f90
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c--1.0.sql
@@ -0,0 +1 @@
+CREATE FUNCTION drive_crc32c (count int, num int) RETURNS bigint AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
new file mode 100644
index 0000000000..b350caf5ce
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -0,0 +1,47 @@
+/* select drive_crc32c(1000000, 1024); */
+
+#include "postgres.h"
+#include "fmgr.h"
+#include "port/pg_crc32c.h"
+#include "common/pg_prng.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * drive_crc32c(count: int, num: int) returns bigint
+ *
+ * count is the number of loops to perform
+ *
+ * num is the number of bytes in the buffer to calculate
+ * crc32c over.
+ */
+PG_FUNCTION_INFO_V1(drive_crc32c);
+Datum
+drive_crc32c(PG_FUNCTION_ARGS)
+{
+ int64 count = PG_GETARG_INT32(0);
+ int64 num = PG_GETARG_INT32(1);
+ char* data = malloc((size_t)num);
+ pg_crc32c crc;
+ pg_prng_state state;
+ uint64 seed = 42;
+ pg_prng_seed(&state, seed);
+ /* set random data */
+ for (uint64 i = 0; i < num; i++)
+ {
+ data[i] = pg_prng_uint32(&state) % 255;
+ }
+
+ INIT_CRC32C(crc);
+
+ while(count--)
+ {
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, data, num);
+ FIN_CRC32C(crc);
+ }
+
+ free((void *)data);
+
+ PG_RETURN_INT64((int64_t)crc);
+}
diff --git a/contrib/test_crc32c/test_crc32c.control b/contrib/test_crc32c/test_crc32c.control
new file mode 100644
index 0000000000..878a077ee1
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.control
@@ -0,0 +1,4 @@
+comment = 'test'
+default_version = '1.0'
+module_pathname = '$libdir/test_crc32c'
+relocatable = true
--
2.48.1
On Thu, Feb 13, 2025 at 5:19 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Wed, Feb 12, 2025 at 10:12:20PM +0000, Devulapalli, Raghuveer wrote:
Well, I suspect the AVX-512 version will pretty much always need the runtime
check given that it's not available on a lot of newer hardware and requires a
bunch of extra runtime checks (see pg_popcount_avx512.c). But it might be
worth doing for PCLMUL. Otherwise, I think we'd have to leave out the PCLMUL
optimizations if built with -msse4.2 -mpclmul because we don't want to regress
existing -msse4.2 users with a runtime check.
Sounds good to me. Although, users building with just -msse4.2 will now encounter
an additional pclmul runtime check. That would be a regression unless they update to
building with both -msse4.2 and -mpclmul.
My thinking was that building with just -msse4.2 would cause the existing
SSE 4.2 implementation to be used (without the function pointer). That's
admittedly a bit goofy because they'd miss out on the PCLMUL optimization,
but things at least don't get any worse for them.
I tried using branching for the runtime check, and this looks like the
way to go:
- Existing -msse4.2 builders will still call directly, but inside the
function there is a length check and only for long input will it do a
runtime check for pclmul.
- This smooths the way for -msse4.2 (and the equivalent on Arm) to
inline calls with short constant input (e.g. WAL insert lock),
although I've not done that here.
- This can be a simple starting point for consolidating runtime
checks, as was proposed for popcount in the AVX-512 CRC thread, but
with branching. My model was Andres' sketch here:
/messages/by-id/20240731023918.ixsfbeuub6e76one@awork3.anarazel.de
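The branching approach in the list above can be sketched as follows. This is a minimal standalone illustration, not the patch itself: the names `cpucap`, `CAP_INIT`, `CAP_CLMUL`, `cpucap_initialize`, `crc32c_long_path` and `LONG_INPUT_THRESHOLD` are hypothetical stand-ins modeled on `pg_cpucap`, `PGCPUCAP_*` and `PCLMUL_THRESHOLD`, and a portable bitwise CRC-32C stands in for both the hardware `crc32` loop and the vectorized routine so that the control flow can be exercised on any machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits, modeled on PGCPUCAP_* in the patch. */
#define CAP_INIT  (1u << 0)
#define CAP_CLMUL (1u << 2)

/* Hypothetical cutoff, modeled on PCLMUL_THRESHOLD. */
#define LONG_INPUT_THRESHOLD 128

static uint32_t cpucap = 0;		/* zero means "not initialized yet" */
static int	long_path_calls = 0;	/* instrumentation for this example only */

/* Bitwise CRC-32C (reflected polynomial 0x82F63B78), no pre/post inversion. */
static uint32_t
crc32c_bytes(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--)
	{
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Stand-in for the vectorized routine: consumes whole 64-byte blocks only. */
static uint32_t
crc32c_long_path(uint32_t crc, const unsigned char *p, size_t len)
{
	long_path_calls++;
	return crc32c_bytes(crc, p, len - len % 64);
}

/* Would run once at startup; has_clmul stands in for a CPUID probe. */
static void
cpucap_initialize(int has_clmul)
{
	cpucap = CAP_INIT | (has_clmul ? CAP_CLMUL : 0);
}

/* Direct call; the capability is only consulted for long inputs. */
static uint32_t
crc32c_dispatch(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	assert(cpucap & CAP_INIT);	/* catches a forgotten initialization */

	if (len >= LONG_INPUT_THRESHOLD && (cpucap & CAP_CLMUL))
	{
		crc = crc32c_long_path(crc, p, len);
		p += len - len % 64;	/* skip the blocks already consumed */
		len %= 64;		/* only the tail remains */
	}
	return crc32c_bytes(crc, p, len);
}
```

Short inputs pay for one length comparison and never touch the capability word, which is the property the first bullet is after.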
--
John Naylor
Amazon Web Services
Attachments:
v6-0002-Add-a-Postgres-SQL-function-for-crc32c-benchmarki.patch (text/x-patch; charset=US-ASCII)
From f327b7fcb588100d2dc7483369cfd36380210715 Mon Sep 17 00:00:00 2001
From: Paul Amonson <paul.d.amonson@intel.com>
Date: Mon, 6 May 2024 08:34:17 -0700
Subject: [PATCH v6 2/3] Add a Postgres SQL function for crc32c benchmarking
Add a drive_crc32c() function to use for benchmarking crc32c
computation. The function takes 2 arguments:
(1) count: number of times CRC32C is computed in a loop.
(2) num: number of bytes in the buffer to calculate the CRC over.
XXX not for commit
Extracted from a patch by Raghuveer Devulapalli
---
contrib/meson.build | 1 +
contrib/test_crc32c/Makefile | 20 +++++++
contrib/test_crc32c/expected/test_crc32c.out | 57 ++++++++++++++++++++
contrib/test_crc32c/meson.build | 34 ++++++++++++
contrib/test_crc32c/sql/test_crc32c.sql | 3 ++
contrib/test_crc32c/test_crc32c--1.0.sql | 1 +
contrib/test_crc32c/test_crc32c.c | 47 ++++++++++++++++
contrib/test_crc32c/test_crc32c.control | 4 ++
8 files changed, 167 insertions(+)
create mode 100644 contrib/test_crc32c/Makefile
create mode 100644 contrib/test_crc32c/expected/test_crc32c.out
create mode 100644 contrib/test_crc32c/meson.build
create mode 100644 contrib/test_crc32c/sql/test_crc32c.sql
create mode 100644 contrib/test_crc32c/test_crc32c--1.0.sql
create mode 100644 contrib/test_crc32c/test_crc32c.c
create mode 100644 contrib/test_crc32c/test_crc32c.control
diff --git a/contrib/meson.build b/contrib/meson.build
index 1ba73ebd67..06673db062 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('test_crc32c')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/test_crc32c/Makefile b/contrib/test_crc32c/Makefile
new file mode 100644
index 0000000000..5b747c6184
--- /dev/null
+++ b/contrib/test_crc32c/Makefile
@@ -0,0 +1,20 @@
+MODULE_big = test_crc32c
+OBJS = test_crc32c.o
+PGFILEDESC = "test"
+EXTENSION = test_crc32c
+DATA = test_crc32c--1.0.sql
+
+first: all
+
+# test_crc32c.o: CFLAGS+=-g
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/test_crc32c
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/test_crc32c/expected/test_crc32c.out b/contrib/test_crc32c/expected/test_crc32c.out
new file mode 100644
index 0000000000..dff6bb3133
--- /dev/null
+++ b/contrib/test_crc32c/expected/test_crc32c.out
@@ -0,0 +1,57 @@
+CREATE EXTENSION test_crc32c;
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
+ drive_crc32c
+--------------
+ 532139994
+ 2103623867
+ 785984197
+ 2686825890
+ 3213049059
+ 3819630168
+ 1389234603
+ 534072900
+ 2930108140
+ 2496889855
+ 1475239611
+ 136366931
+ 3067402116
+ 2012717871
+ 3682416023
+ 2054270645
+ 1817339875
+ 4100939569
+ 1192727539
+ 3636976218
+ 369764421
+ 3161609879
+ 1067984880
+ 1235066769
+ 3138425899
+ 648132037
+ 4203750233
+ 1330187888
+ 2683521348
+ 1951644495
+ 2574090107
+ 3904902018
+ 3772697795
+ 1644686344
+ 2868962106
+ 3369218491
+ 3902689890
+ 3456411865
+ 141004025
+ 1504497996
+ 3782655204
+ 3544797610
+ 3429174879
+ 2524728016
+ 3935861181
+ 25498897
+ 692684159
+ 345705535
+ 2761600287
+ 2654632420
+ 3945991399
+(51 rows)
+
diff --git a/contrib/test_crc32c/meson.build b/contrib/test_crc32c/meson.build
new file mode 100644
index 0000000000..d7bec4ba1c
--- /dev/null
+++ b/contrib/test_crc32c/meson.build
@@ -0,0 +1,34 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+test_crc32c_sources = files(
+ 'test_crc32c.c',
+)
+
+if host_system == 'windows'
+ test_crc32c_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_crc32c',
+ '--FILEDESC', 'test_crc32c - test code for crc32c library',])
+endif
+
+test_crc32c = shared_module('test_crc32c',
+ test_crc32c_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += test_crc32c
+
+install_data(
+ 'test_crc32c.control',
+ 'test_crc32c--1.0.sql',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'test_crc32c',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'test_crc32c',
+ ],
+ },
+}
diff --git a/contrib/test_crc32c/sql/test_crc32c.sql b/contrib/test_crc32c/sql/test_crc32c.sql
new file mode 100644
index 0000000000..95c6dfe448
--- /dev/null
+++ b/contrib/test_crc32c/sql/test_crc32c.sql
@@ -0,0 +1,3 @@
+CREATE EXTENSION test_crc32c;
+
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
diff --git a/contrib/test_crc32c/test_crc32c--1.0.sql b/contrib/test_crc32c/test_crc32c--1.0.sql
new file mode 100644
index 0000000000..52b9772f90
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c--1.0.sql
@@ -0,0 +1 @@
+CREATE FUNCTION drive_crc32c (count int, num int) RETURNS bigint AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
new file mode 100644
index 0000000000..b350caf5ce
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -0,0 +1,47 @@
+/* select drive_crc32c(1000000, 1024); */
+
+#include "postgres.h"
+#include "fmgr.h"
+#include "port/pg_crc32c.h"
+#include "common/pg_prng.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * drive_crc32c(count: int, num: int) returns bigint
+ *
+ * count is the number of loops to perform
+ *
+ * num is the number of bytes in the buffer to calculate
+ * crc32c over.
+ */
+PG_FUNCTION_INFO_V1(drive_crc32c);
+Datum
+drive_crc32c(PG_FUNCTION_ARGS)
+{
+ int64 count = PG_GETARG_INT32(0);
+ int64 num = PG_GETARG_INT32(1);
+ char* data = malloc((size_t)num);
+ pg_crc32c crc;
+ pg_prng_state state;
+ uint64 seed = 42;
+ pg_prng_seed(&state, seed);
+ /* set random data */
+ for (uint64 i = 0; i < num; i++)
+ {
+ data[i] = pg_prng_uint32(&state) % 255;
+ }
+
+ INIT_CRC32C(crc);
+
+ while(count--)
+ {
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, data, num);
+ FIN_CRC32C(crc);
+ }
+
+ free((void *)data);
+
+ PG_RETURN_INT64((int64_t)crc);
+}
diff --git a/contrib/test_crc32c/test_crc32c.control b/contrib/test_crc32c/test_crc32c.control
new file mode 100644
index 0000000000..878a077ee1
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.control
@@ -0,0 +1,4 @@
+comment = 'test'
+default_version = '1.0'
+module_pathname = '$libdir/test_crc32c'
+relocatable = true
--
2.48.1
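For readers who want to try the benchmark loop outside PostgreSQL, the driver in the patch above can be approximated by a standalone function. This is only a sketch under stated assumptions: `drive_crc32c_sketch`, `crc32c_update` and `xorshift64` are hypothetical names, the xorshift generator replaces `pg_prng`, and a bitwise CRC-32C replaces `COMP_CRC32C`, so the values it returns will not match the regression output in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Bitwise CRC-32C (reflected polynomial 0x82F63B78), no pre/post inversion. */
static uint32_t
crc32c_update(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--)
	{
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Small deterministic generator; a stand-in for pg_prng, not the same stream. */
static uint64_t
xorshift64(uint64_t *s)
{
	uint64_t	x = *s;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	return *s = x;
}

/*
 * Compute the CRC of a num-byte pseudo-random buffer count times and return
 * the last result. Returns -1 if the buffer cannot be allocated.
 */
static int64_t
drive_crc32c_sketch(int64_t count, int64_t num)
{
	unsigned char *data;
	uint64_t	state = 42;	/* fixed seed, so runs are repeatable */
	uint32_t	crc = 0xFFFFFFFFu;

	assert(count >= 0 && num >= 0);

	data = malloc(num > 0 ? (size_t) num : 1);
	if (data == NULL)
		return -1;

	for (int64_t i = 0; i < num; i++)
		data[i] = (unsigned char) (xorshift64(&state) & 0xFF);

	while (count-- > 0)
	{
		crc = 0xFFFFFFFFu;				/* INIT */
		crc = crc32c_update(crc, data, (size_t) num);	/* COMP */
		crc ^= 0xFFFFFFFFu;				/* FIN */
	}

	free(data);
	return (int64_t) crc;
}
```

Because every iteration starts from the initial value again, the result depends only on `num` (for `count` of at least 1), which is what lets a regression test compare a single row per buffer size.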
v6-0003-Improve-CRC32C-performance-on-x86_64.patch (text/x-patch; charset=US-ASCII)
From d3ee691a067fb1f41d3e8e4377df1a67962cf5c7 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 15:27:16 +0700
Subject: [PATCH v6 3/3] Improve CRC32C performance on x86_64
The current SSE4.2 implementation of CRC32C relies on the native
CRC32 instruction, which operates on 8 bytes at a time. We can get a
substantial speedup on longer inputs by using carryless multiplication
on SIMD registers, processing 64 bytes per loop iteration.
The PCLMULQDQ instruction has been widely available since 2011 (almost
as old as SSE 4.2), so this commit now requires that, as well as SSE
4.2, to build pg_crc32c_sse42.c.
The MIT-licensed implementation was generated with the "generate"
program from
https://github.com/corsix/fast-crc32/
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009
Author: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Author: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
---
src/include/port/pg_cpu.h | 1 +
src/include/port/pg_crc32c.h | 1 +
src/port/pg_cpu.c | 3 +
src/port/pg_crc32c_sse42.c | 95 +++++++++++++++++++++++++++
src/port/pg_crc32c_sse42_choose.c | 16 +++++
src/test/regress/expected/strings.out | 24 +++++++
src/test/regress/sql/strings.sql | 4 ++
7 files changed, 144 insertions(+)
diff --git a/src/include/port/pg_cpu.h b/src/include/port/pg_cpu.h
index 45ce9d3c50..0d8137ebb3 100644
--- a/src/include/port/pg_cpu.h
+++ b/src/include/port/pg_cpu.h
@@ -16,6 +16,7 @@
#define PGCPUCAP_INIT (1 << 0)
#define PGCPUCAP_CRC32C (1 << 1)
+#define PGCPUCAP_CLMUL (1 << 2)
extern uint32 pg_cpucap;
extern void pg_cpucap_initialize(void);
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index db155d690e..068a653605 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -58,6 +58,7 @@ extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len)
#endif
extern bool pg_crc32c_sse42_available(void);
+extern bool pg_crc32c_pclmul_available(void);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
diff --git a/src/port/pg_cpu.c b/src/port/pg_cpu.c
index c948335743..52944b2d4e 100644
--- a/src/port/pg_cpu.c
+++ b/src/port/pg_cpu.c
@@ -31,6 +31,9 @@ pg_cpucap_crc32c(void)
if (pg_crc32c_sse42_available())
pg_cpucap |= PGCPUCAP_CRC32C;
+ if (pg_crc32c_pclmul_available())
+ pg_cpucap |= PGCPUCAP_CLMUL;
+
#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
if (pg_crc32c_armv8_available())
pg_cpucap |= PGCPUCAP_CRC32C;
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df3..66ddb7ec87 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,9 +15,19 @@
#include "c.h"
#include <nmmintrin.h>
+#include <wmmintrin.h>
#include "port/pg_crc32c.h"
+static pg_crc32c pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length);
+
+/* WIP: configure checks */
+#ifdef __x86_64__
+#define HAVE_PCLMUL_RUNTIME
+#endif
+
+#define PCLMUL_THRESHOLD 128
+
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
pg_crc32c
@@ -25,6 +35,17 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
{
const unsigned char *p = data;
const unsigned char *pend = p + len;
+ const pg_crc32c orig_crc = crc; /* XXX not for commit */
+ const size_t orig_len = len;
+
+#ifdef HAVE_PCLMUL_RUNTIME
+ if (len >= PCLMUL_THRESHOLD && (pg_cpucap & PGCPUCAP_CLMUL))
+ {
+ crc = pg_comp_crc32c_pclmul(crc, data, len);
+ len %= 64;
+ p = pend - len;
+ }
+#endif
/*
* Process eight bytes of data at a time.
@@ -66,5 +87,79 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
p++;
}
+ /* XXX not for commit */
+ Assert(crc == pg_comp_crc32c_sb8(orig_crc, data, orig_len));
+
return crc;
}
+
+#ifdef HAVE_PCLMUL_RUNTIME
+
+/* Generated by https://github.com/corsix/fast-crc32/ using: */
+/* ./generate -i sse -p crc32c -a v4 */
+/* MIT licensed */
+
+#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
+#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
+
+pg_attribute_target("sse4.2,pclmul")
+static pg_crc32c
+pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
+{
+ /* adjust names to match generated code */
+ pg_crc32c crc0 = crc;
+ size_t len = length;
+ const unsigned char *buf = data;
+
+ if (len >= 64)
+ {
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i *) buf),
+ y0;
+ __m128i x1 = _mm_loadu_si128((const __m128i *) (buf + 16)),
+ y1;
+ __m128i x2 = _mm_loadu_si128((const __m128i *) (buf + 32)),
+ y2;
+ __m128i x3 = _mm_loadu_si128((const __m128i *) (buf + 48)),
+ y3;
+ __m128i k;
+
+ k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 64;
+ len -= 64;
+
+ /* Main loop. */
+ while (len >= 64)
+ {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i *) buf)), x0 = _mm_xor_si128(x0, y0);
+ y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i *) (buf + 16))), x1 = _mm_xor_si128(x1, y1);
+ y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i *) (buf + 32))), x2 = _mm_xor_si128(x2, y2);
+ y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i *) (buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ buf += 64;
+ len -= 64;
+ }
+
+ /* Reduce x0 ... x3 to just x0. */
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
+ y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
+ k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ }
+
+ return crc0;
+}
+
+#endif
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index f4d3215bc5..59a5a31c00 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -40,3 +40,19 @@ pg_crc32c_sse42_available(void)
return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
+
+bool
+pg_crc32c_pclmul_available(void)
+{
+ unsigned int exx[4] = {0, 0, 0, 0};
+
+#if defined(HAVE__GET_CPUID)
+ __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+ __cpuid(exx, 1);
+#else
+#error cpuid instruction not available
+#endif
+
+ return (exx[2] & (1 << 1)) != 0; /* PCLMUL */
+}
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index b65bb2d536..662bd37ace 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -2282,6 +2282,30 @@ SELECT crc32c('The quick brown fox jumps over the lazy dog.');
419469235
(1 row)
+SELECT crc32c(repeat('A', 80)::bytea);
+ crc32c
+------------
+ 3799127650
+(1 row)
+
+SELECT crc32c(repeat('A', 127)::bytea);
+ crc32c
+-----------
+ 291820082
+(1 row)
+
+SELECT crc32c(repeat('A', 128)::bytea);
+ crc32c
+-----------
+ 816091258
+(1 row)
+
+SELECT crc32c(repeat('A', 129)::bytea);
+ crc32c
+------------
+ 4213642571
+(1 row)
+
--
-- encode/decode
--
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index 8e0f3a0e75..26f86dc92e 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -727,6 +727,10 @@ SELECT crc32('The quick brown fox jumps over the lazy dog.');
SELECT crc32c('');
SELECT crc32c('The quick brown fox jumps over the lazy dog.');
+SELECT crc32c(repeat('A', 80)::bytea);
+SELECT crc32c(repeat('A', 127)::bytea);
+SELECT crc32c(repeat('A', 128)::bytea);
+SELECT crc32c(repeat('A', 129)::bytea);
--
-- encode/decode
--
2.48.1
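The patch above leans on carryless multiplication, so a portable model of that operation may help when reading the `clmul_lo` and `clmul_hi` macros. The sketch below is an illustration only, not code from the patch: `u128`, `clmul64` and `clmul64_selfcheck` are hypothetical names. As far as I understand the intrinsic, `_mm_clmulepi64_si128` with immediate 0 multiplies the low 64-bit halves of its operands and with immediate 17 (0x11) the high halves, each producing the 128-bit product that `clmul64` computes bit by bit here.

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit result of a 64x64 carryless multiply. */
typedef struct
{
	uint64_t	lo;
	uint64_t	hi;
} u128;

/*
 * Multiply a and b as polynomials over GF(2): shift-and-XOR instead of
 * shift-and-add, so no carries propagate between bit positions.
 */
static u128
clmul64(uint64_t a, uint64_t b)
{
	u128		r = {0, 0};

	for (int i = 0; i < 64; i++)
	{
		if ((b >> i) & 1)
		{
			r.lo ^= a << i;
			if (i > 0)
				r.hi ^= a >> (64 - i);	/* bits shifted out of lo */
		}
	}
	return r;
}

/* A few identities that can be checked by hand. */
static void
clmul64_selfcheck(void)
{
	u128		r;

	r = clmul64(3, 3);		/* (x + 1)^2 = x^2 + 1 */
	assert(r.lo == 5 && r.hi == 0);

	r = clmul64(UINT64_C(1) << 63, 2);	/* x^63 * x = x^64 */
	assert(r.lo == 0 && r.hi == 1);
}
```

The operation is linear over XOR, that is, `clmul64(a ^ c, b)` equals `clmul64(a, b)` XORed with `clmul64(c, b)`. That linearity is what allows the main loop to fold four independent 16-byte accumulators forward by 64 bytes and combine them only at the end.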
v6-0001-Dispatch-CRC-computation-by-branching-rather-than.patch (text/x-patch; charset=US-ASCII)
From 527e157966e9ce0df8aae6aac8ed833af9cd53fb Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Sat, 15 Feb 2025 19:18:16 +0700
Subject: [PATCH v6 1/3] Dispatch CRC computation by branching rather than
indirect calls
---
configure | 2 +-
configure.ac | 2 +-
src/backend/postmaster/postmaster.c | 4 ++
src/bin/pg_basebackup/pg_basebackup.c | 3 +
src/bin/pg_basebackup/pg_createsubscriber.c | 3 +
src/bin/pg_checksums/pg_checksums.c | 3 +
src/bin/pg_combinebackup/pg_combinebackup.c | 3 +
src/bin/pg_controldata/pg_controldata.c | 3 +
src/bin/pg_ctl/pg_ctl.c | 3 +
src/bin/pg_resetwal/pg_resetwal.c | 3 +
src/bin/pg_rewind/pg_rewind.c | 3 +
src/bin/pg_verifybackup/pg_verifybackup.c | 3 +
src/bin/pg_waldump/pg_waldump.c | 3 +
src/bin/pg_walsummary/pg_walsummary.c | 4 ++
src/include/port/pg_cpu.h | 23 ++++++
src/include/port/pg_crc32c.h | 78 +++++++++++++++------
src/port/Makefile | 1 +
src/port/meson.build | 4 ++
src/port/pg_cpu.c | 54 ++++++++++++++
src/port/pg_crc32c_armv8_choose.c | 26 +------
src/port/pg_crc32c_sse42_choose.c | 26 +------
21 files changed, 182 insertions(+), 72 deletions(-)
create mode 100644 src/include/port/pg_cpu.h
create mode 100644 src/port/pg_cpu.c
diff --git a/configure b/configure
index 0ffcaeb436..41aad7b4d7 100755
--- a/configure
+++ b/configure
@@ -17352,7 +17352,7 @@ if test x"$USE_SSE42_CRC32C" = x"1"; then
$as_echo "#define USE_SSE42_CRC32C 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2" >&5
$as_echo "SSE 4.2" >&6; }
else
diff --git a/configure.ac b/configure.ac
index f56681e0d9..efa8249360 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2110,7 +2110,7 @@ fi
AC_MSG_CHECKING([which CRC-32C implementation to use])
if test x"$USE_SSE42_CRC32C" = x"1"; then
AC_DEFINE(USE_SSE42_CRC32C, 1, [Define to 1 use Intel SSE 4.2 CRC instructions.])
- PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
AC_MSG_RESULT(SSE 4.2)
else
if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index bb22b13ade..c218f15f97 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -99,6 +99,7 @@
#include "pg_getopt.h"
#include "pgstat.h"
#include "port/pg_bswap.h"
+#include "port/pg_cpu.h"
#include "postmaster/autovacuum.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/pgarch.h"
@@ -1951,6 +1952,9 @@ InitProcessGlobals(void)
#ifndef WIN32
srandom(pg_prng_uint32(&pg_global_prng_state));
#endif
+
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
}
/*
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index dc0c805137..8d4b3718b6 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -2405,6 +2405,9 @@ main(int argc, char **argv)
progname = get_progname(argv[0]);
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_basebackup"));
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
if (argc > 1)
{
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index 2d881d54f5..04e550ef75 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1906,6 +1906,9 @@ main(int argc, char **argv)
progname = get_progname(argv[0]);
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_basebackup"));
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
if (argc > 1)
{
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index e1acb6e933..eb88aeedb5 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -453,6 +453,9 @@ main(int argc, char *argv[])
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_checksums"));
progname = get_progname(argv[0]);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
if (argc > 1)
{
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
index 5864ec574f..ee24dba231 100644
--- a/src/bin/pg_combinebackup/pg_combinebackup.c
+++ b/src/bin/pg_combinebackup/pg_combinebackup.c
@@ -166,6 +166,9 @@ main(int argc, char *argv[])
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_combinebackup"));
handle_help_version_opts(argc, argv, progname, help);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
memset(&opt, 0, sizeof(opt));
opt.manifest_checksums = CHECKSUM_TYPE_CRC32C;
opt.sync_method = DATA_DIR_SYNC_METHOD_FSYNC;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index cf11ab3f2e..deb6b16ae6 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -112,6 +112,9 @@ main(int argc, char *argv[])
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_controldata"));
progname = get_progname(argv[0]);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
if (argc > 1)
{
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 8a405ff122..7dc4da932e 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -2226,6 +2226,9 @@ main(int argc, char **argv)
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_ctl"));
start_time = time(NULL);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
/*
* save argv[0] so do_start() can look for the postmaster if necessary. we
* don't look for postmaster here because in many cases we won't need it.
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index ed73607a46..52bcaadf69 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -123,6 +123,9 @@ main(int argc, char *argv[])
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_resetwal"));
progname = get_progname(argv[0]);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
if (argc > 1)
{
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index cae81cd6cb..f6c755883c 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -158,6 +158,9 @@ main(int argc, char **argv)
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_rewind"));
progname = get_progname(argv[0]);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
/* Process command-line arguments */
if (argc > 1)
{
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 7c720ab98b..d44a87e83a 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -144,6 +144,9 @@ main(int argc, char **argv)
memset(&context, 0, sizeof(context));
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
if (argc > 1)
{
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 51fb76efc4..10c529a5fa 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -835,6 +835,9 @@ main(int argc, char **argv)
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_waldump"));
progname = get_progname(argv[0]);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
if (argc > 1)
{
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
diff --git a/src/bin/pg_walsummary/pg_walsummary.c b/src/bin/pg_walsummary/pg_walsummary.c
index cd7a6b147b..a38565ea6d 100644
--- a/src/bin/pg_walsummary/pg_walsummary.c
+++ b/src/bin/pg_walsummary/pg_walsummary.c
@@ -20,6 +20,7 @@
#include "common/logging.h"
#include "fe_utils/option_utils.h"
#include "getopt_long.h"
+#include "port/pg_cpu.h"
typedef struct ws_options
{
@@ -69,6 +70,9 @@ main(int argc, char *argv[])
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_walsummary"));
handle_help_version_opts(argc, argv, progname, help);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
/* process command-line options */
while ((c = getopt_long(argc, argv, "iq",
long_options, &optindex)) != -1)
diff --git a/src/include/port/pg_cpu.h b/src/include/port/pg_cpu.h
new file mode 100644
index 0000000000..45ce9d3c50
--- /dev/null
+++ b/src/include/port/pg_cpu.h
@@ -0,0 +1,23 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_cpu.h
+ * Runtime detection of CPU capabilities.
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * src/include/port/pg_cpu.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_CPU_H
+#define PG_CPU_H
+
+#define PGCPUCAP_INIT (1 << 0)
+#define PGCPUCAP_CRC32C (1 << 1)
+
+extern uint32 pg_cpucap;
+extern void pg_cpucap_initialize(void);
+
+#endif /* PG_CPU_H */
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 65ebeacf4b..db155d690e 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -34,6 +34,7 @@
#define PG_CRC32C_H
#include "port/pg_bswap.h"
+#include "port/pg_cpu.h"
typedef uint32 pg_crc32c;
@@ -41,52 +42,55 @@ typedef uint32 pg_crc32c;
#define INIT_CRC32C(crc) ((crc) = 0xFFFFFFFF)
#define EQ_CRC32C(c1, c2) ((c1) == (c2))
-#if defined(USE_SSE42_CRC32C)
+#if defined(USE_SSE42_CRC32C) || defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
/* Use Intel SSE4.2 instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_sse42((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#if defined(USE_SSE42_CRC32C)
+#define HAVE_CRC_COMPTIME
+#else
+#define HAVE_CRC_RUNTIME
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif
+
+extern bool pg_crc32c_sse42_available(void);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_ARMV8_CRC32C)
+#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
/* Use ARMv8 CRC Extension instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_armv8((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#if defined(USE_ARMV8_CRC32C)
+#define HAVE_CRC_COMPTIME
+#else
+#define HAVE_CRC_RUNTIME
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif
+
+extern bool pg_crc32c_armv8_available(void);
extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
#elif defined(USE_LOONGARCH_CRC32C)
/* Use LoongArch CRCC instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_loongarch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#define HAVE_CRC_COMPTIME
extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
-
-/*
- * Use Intel SSE 4.2 or ARMv8 instructions, but perform a runtime check first
- * to check that they are available.
- */
-#define COMP_CRC32C(crc, data, len) \
- ((crc) = pg_comp_crc32c((crc), (data), (len)))
-#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
-
-extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
-extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
-
-#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#endif
-#ifdef USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
-#endif
-
#else
/*
* Use slicing-by-8 algorithm.
@@ -105,6 +109,36 @@ extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t le
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif /* end of CPU-specific symbols */
+
+#if defined(HAVE_CRC_COMPTIME) || defined(HAVE_CRC_RUNTIME)
+/*
+ * Check if the CPU we're running on supports special
+ * instructions for CRC-32C computation. Otherwise, fall
+ * back to the pure software implementation (slicing-by-8).
+ */
+static inline pg_crc32c
+pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
+{
+ /*
+ * If this is firing in a frontend program, first look if you forgot a
+ * call to pg_cpucap_initialize() in main(). See for example
+ * src/bin/pg_controldata/pg_controldata.c.
+ */
+ Assert(pg_cpucap & PGCPUCAP_INIT);
+
+ {
+#if defined(HAVE_CRC_COMPTIME)
+ Assert(pg_cpucap & PGCPUCAP_CRC32C);
+ return COMP_CRC32C_HW(crc, data, len);
+#else
+ if (pg_cpucap & PGCPUCAP_CRC32C)
+ return COMP_CRC32C_HW(crc, data, len);
+ else
+ return pg_comp_crc32c_sb8(crc, data, len);
#endif
+ }
+}
+#endif /* HAVE_CRC_COMPTIME || HAVE_CRC_RUNTIME */
#endif /* PG_CRC32C_H */
diff --git a/src/port/Makefile b/src/port/Makefile
index 4c22431951..2ac79ecb0f 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -44,6 +44,7 @@ OBJS = \
noblock.o \
path.o \
pg_bitutils.o \
+ pg_cpu.o \
pg_popcount_avx512.o \
pg_strong_random.o \
pgcheckdir.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index 7fcfa728d4..02ae206760 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -7,6 +7,7 @@ pgport_sources = [
'noblock.c',
'path.c',
'pg_bitutils.c',
+ 'pg_cpu.c',
'pg_popcount_avx512.c',
'pg_strong_random.c',
'pgcheckdir.c',
@@ -83,12 +84,15 @@ replace_funcs_pos = [
# x86/x64
['pg_crc32c_sse42', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
+ # WIP sometime we'll need to build these based on host_cpu
+ ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
# arm / aarch64
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C'],
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 'crc'],
+ ['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C'],
['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
diff --git a/src/port/pg_cpu.c b/src/port/pg_cpu.c
new file mode 100644
index 0000000000..c948335743
--- /dev/null
+++ b/src/port/pg_cpu.c
@@ -0,0 +1,54 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_cpu.c
+ * Runtime detection of CPU capabilities.
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * src/port/pg_cpu.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "c.h"
+
+#include "port/pg_cpu.h"
+#include "port/pg_crc32c.h"
+
+
+/* starts uninitialized so we can detect errors of omission */
+uint32 pg_cpucap = 0;
+
+/*
+ * Check if hardware instructions for CRC computation are available.
+ */
+static void
+pg_cpucap_crc32c(void)
+{
+ /* WIP: It seems like we should use CPU arch symbols instead */
+#if defined(USE_SSE42_CRC32C) || defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
+ if (pg_crc32c_sse42_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+
+#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
+ if (pg_crc32c_armv8_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+
+#elif defined(USE_LOONGARCH_CRC32C)
+ pg_cpucap |= PGCPUCAP_CRC32C;
+#endif
+}
+
+/*
+ * This needs to be called in main() for every
+ * program that calls a function that dispatches
+ * according to CPU features.
+ */
+void
+pg_cpucap_initialize(void)
+{
+ pg_cpucap = PGCPUCAP_INIT;
+
+ pg_cpucap_crc32c();
+}
diff --git a/src/port/pg_crc32c_armv8_choose.c b/src/port/pg_crc32c_armv8_choose.c
index ec12be1bbc..e3654427c3 100644
--- a/src/port/pg_crc32c_armv8_choose.c
+++ b/src/port/pg_crc32c_armv8_choose.c
@@ -1,12 +1,7 @@
/*-------------------------------------------------------------------------
*
* pg_crc32c_armv8_choose.c
- * Choose between ARMv8 and software CRC-32C implementation.
- *
- * On first call, checks if the CPU we're running on supports the ARMv8
- * CRC Extension. If it does, use the special instructions for CRC-32C
- * computation. Otherwise, fall back to the pure software implementation
- * (slicing-by-8).
+ * Check if the CPU we're running on supports the ARMv8 CRC Extension.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -40,7 +35,7 @@
#include "port/pg_crc32c.h"
-static bool
+bool
pg_crc32c_armv8_available(void)
{
#if defined(HAVE_ELF_AUX_INFO)
@@ -106,20 +101,3 @@ pg_crc32c_armv8_available(void)
return false;
#endif
}
-
-/*
- * This gets called on the first call. It replaces the function pointer
- * so that subsequent calls are routed directly to the chosen implementation.
- */
-static pg_crc32c
-pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
-{
- if (pg_crc32c_armv8_available())
- pg_comp_crc32c = pg_comp_crc32c_armv8;
- else
- pg_comp_crc32c = pg_comp_crc32c_sb8;
-
- return pg_comp_crc32c(crc, data, len);
-}
-
-pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len) = pg_comp_crc32c_choose;
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 65dbc4d424..f4d3215bc5 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -1,12 +1,7 @@
/*-------------------------------------------------------------------------
*
* pg_crc32c_sse42_choose.c
- * Choose between Intel SSE 4.2 and software CRC-32C implementation.
- *
- * On first call, checks if the CPU we're running on supports Intel SSE
- * 4.2. If it does, use the special SSE instructions for CRC-32C
- * computation. Otherwise, fall back to the pure software implementation
- * (slicing-by-8).
+ * Check if the CPU we're running on supports SSE4.2.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -30,7 +25,7 @@
#include "port/pg_crc32c.h"
-static bool
+bool
pg_crc32c_sse42_available(void)
{
unsigned int exx[4] = {0, 0, 0, 0};
@@ -45,20 +40,3 @@ pg_crc32c_sse42_available(void)
return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
-
-/*
- * This gets called on the first call. It replaces the function pointer
- * so that subsequent calls are routed directly to the chosen implementation.
- */
-static pg_crc32c
-pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
-{
- if (pg_crc32c_sse42_available())
- pg_comp_crc32c = pg_comp_crc32c_sse42;
- else
- pg_comp_crc32c = pg_comp_crc32c_sb8;
-
- return pg_comp_crc32c(crc, data, len);
-}
-
-pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len) = pg_comp_crc32c_choose;
--
2.48.1
On Mon, Feb 17, 2025 at 05:58:01PM +0700, John Naylor wrote:
> I tried using branching for the runtime check, and this looks like the
> way to go:
> - Existing -msse4.2 builders will still call directly, but inside the
> function there is a length check and only for long input will it do a
> runtime check for pclmul.
> - This smooths the way for -msse4.2 (and the equivalent on Arm) to
> inline calls with short constant input (e.g. WAL insert lock),
> although I've not done that here.
> - This can be a simple starting point for consolidating runtime
> checks, as was proposed for popcount in the AVX-512 CRC thread, but
> with branching my model was Andres' sketch here: /messages/by-id/20240731023918.ixsfbeuub6e76one@awork3.anarazel.de
Oh, nifty. IIUC this should help avoid quite a bit of overhead even before
adding the PCLMUL stuff since it removes the function pointers for the
runtime-check builds.
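The branching dispatch discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the names `CAP_*`, `cpucap`, `WIDE_THRESHOLD`, `crc32c_hw`, `crc32c_wide` and `crc32c_dispatch` are hypothetical, and the two "hardware" paths are software stand-ins so that the sketch builds on any platform. It also folds the length check into the dispatcher, whereas the patch keeps it inside the SSE 4.2 function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits, modeled on the PGCPUCAP_* flags in the patch. */
#define CAP_INIT   (1u << 0)
#define CAP_CRC32C (1u << 1)
#define CAP_CLMUL  (1u << 2)

/* Inputs shorter than this never take the wide path (assumed value). */
#define WIDE_THRESHOLD 128

static uint32_t cpucap = 0;		/* 0 means "not initialized yet" */

/* Portable bitwise CRC-32C (reflected polynomial 0x82F63B78). */
static uint32_t
crc32c_sw(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
	{
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Stand-in for the CRC-instruction path; a real build would use intrinsics. */
static uint32_t
crc32c_hw(uint32_t crc, const void *data, size_t len)
{
	return crc32c_sw(crc, data, len);
}

/* Stand-in for the wide path: consumes only whole 64-byte blocks. */
static uint32_t
crc32c_wide(uint32_t crc, const void *data, size_t len)
{
	return crc32c_sw(crc, data, len - len % 64);
}

/* Called once at program start; "detected" would come from CPUID or similar. */
static void
cpucap_initialize(uint32_t detected)
{
	cpucap = CAP_INIT | detected;
}

/* Dispatch with plain branches on a bitmask instead of a function pointer. */
static uint32_t
crc32c_dispatch(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	assert(cpucap & CAP_INIT);	/* catches a missing initialize call */

	if (!(cpucap & CAP_CRC32C))
		return crc32c_sw(crc, p, len);

	/* Only long inputs pay for the second capability check. */
	if (len >= WIDE_THRESHOLD && (cpucap & CAP_CLMUL))
	{
		size_t		tail = len % 64;

		crc = crc32c_wide(crc, p, len);
		p += len - tail;
		len = tail;
	}

	/* Short inputs, and the tail left by the wide path, end up here. */
	return crc32c_hw(crc, p, len);
}
```

A caller would run `cpucap_initialize()` once early in `main()` and then call `crc32c_dispatch(0xFFFFFFFF, buf, len)`, XORing the result with `0xFFFFFFFF` to finish the checksum. Because the branches test a value that never changes after startup, they should predict well, which is the motivation for preferring them over an indirect call.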
While this needn't block this patch set, I do find the dispatch code to be
pretty complicated. Maybe we can improve that in the future by using
macros to auto-generate much of it. My concern here is less about this
particular patch set and more about the long term maintainability as we add
more and more stuff like it, each with its own tangled web of build and
dispatch rules.
--
nathan
On Tue, Feb 18, 2025 at 12:41 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
> On Mon, Feb 17, 2025 at 05:58:01PM +0700, John Naylor wrote:
>> I tried using branching for the runtime check, and this looks like the
>> way to go:
>> - Existing -msse4.2 builders will still call directly, but inside the
>> function there is a length check and only for long input will it do a
>> runtime check for pclmul.
>> - This smooths the way for -msse4.2 (and the equivalent on Arm) to
>> inline calls with short constant input (e.g. WAL insert lock),
>> although I've not done that here.
>> - This can be a simple starting point for consolidating runtime
>> checks, as was proposed for popcount in the AVX-512 CRC thread, but
>> with branching my model was Andres' sketch here: /messages/by-id/20240731023918.ixsfbeuub6e76one@awork3.anarazel.de
> Oh, nifty. IIUC this should help avoid quite a bit of overhead even before
> adding the PCLMUL stuff since it removes the function pointers for the
> runtime-check builds.
I figured the same, but it doesn't seem to help on the microbenchmark.
(I also tried the pg_waldump benchmark you used, but I couldn't get it
working so I'm probably missing a step.)
master:
20
latency average = 3.958 ms
latency average = 3.860 ms
latency average = 3.857 ms
32
latency average = 5.921 ms
latency average = 5.166 ms
latency average = 5.128 ms
48
latency average = 6.384 ms
latency average = 6.022 ms
latency average = 5.991 ms
v6:
20
latency average = 4.084 ms
latency average = 3.879 ms
latency average = 3.896 ms
32
latency average = 5.533 ms
latency average = 5.536 ms
latency average = 5.573 ms
48
latency average = 6.201 ms
latency average = 6.205 ms
latency average = 6.111 ms
> While this needn't block this patch set, I do find the dispatch code to be
> pretty complicated. Maybe we can improve that in the future by using
> macros to auto-generate much of it. My concern here is less about this
> particular patch set and more about the long term maintainability as we add
> more and more stuff like it, each with its own tangled web of build and
> dispatch rules.
I think the runtime dispatch is fairly simple for this case. To my
taste, the worse maintainability hazard is the building. To make that
better, I'd do this:
- Rename the CRC choose*.c files to pg_cpucap_{x86,arm}.c and build
them unconditionally for each platform
- Initialize the runtime info by CPU platform and not other symbols
where possible (I guess anything needing AVX-512 will still be a mess)
- Put all hardware CRC .c files into a single file pg_crc32c_hw.c.
Define PG_CRC_HW8/4/2/1 macros in pg_crc32c.h that tell which
intrinsic to use by platform and have a separate pg_crc32c_hw_impl.h
header that has the simple loop with these macros. That header would
for now only be included into pg_crc32c_hw.c.
The first two could be done as part of this patch or soon after. The
third is a bit more invasive, but seems like a necessary prerequisite
for inlining on small constant input, to keep that from being a mess.
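The third item could look roughly like the sketch below. It is an assumption-laden illustration rather than a proposal for the actual files: the macro names follow the `PG_CRC_HW8/4/2/1` idea (only the 8-byte and 1-byte variants are shown), while `load_le64`, `crc_step1`, `crc_step8` and `crc32c_hw_loop` are hypothetical helpers. A portable fallback is included so that the sketch builds without special compiler flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-platform mapping; this part would live in the common header. */
#if defined(__SSE4_2__) && defined(__x86_64__)
#include <nmmintrin.h>
#define PG_CRC_HW8(crc, v) ((uint32_t) _mm_crc32_u64((crc), (v)))
#define PG_CRC_HW1(crc, v) ((uint32_t) _mm_crc32_u8((crc), (v)))
#elif defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>
#define PG_CRC_HW8(crc, v) (__crc32cd((crc), (v)))
#define PG_CRC_HW1(crc, v) (__crc32cb((crc), (v)))
#else
/* Portable stand-ins (not part of the proposal) so the sketch builds anywhere. */
static uint32_t
crc_step1(uint32_t crc, uint8_t v)
{
	crc ^= v;
	for (int k = 0; k < 8; k++)
		crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	return crc;
}

static uint32_t
crc_step8(uint32_t crc, uint64_t v)
{
	/* Consume the low byte first, as the hardware instructions do. */
	for (int i = 0; i < 8; i++)
		crc = crc_step1(crc, (uint8_t) (v >> (8 * i)));
	return crc;
}
#define PG_CRC_HW8(crc, v) crc_step8((crc), (v))
#define PG_CRC_HW1(crc, v) crc_step1((crc), (v))
#endif

/* Build a 64-bit value whose low byte is p[0], independent of host endianness. */
static uint64_t
load_le64(const unsigned char *p)
{
	uint64_t	v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t) p[i] << (8 * i);
	return v;
}

/* The shared loop; this part would live in the separate "impl" header. */
static uint32_t
crc32c_hw_loop(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	assert(data != NULL || len == 0);

	while (len >= 8)
	{
		crc = PG_CRC_HW8(crc, load_le64(p));
		p += 8;
		len -= 8;
	}
	while (len > 0)
	{
		crc = PG_CRC_HW1(crc, *p);
		p++;
		len--;
	}
	return crc;
}
```

With this split the loop is written once and each platform contributes only the macro definitions, so a later inlined path for small constant-length input could reuse the same macros.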
There's another potential irritation for maintenance too: I spent too
much time only adding pg_cpucap_initialize() to frontend main()
functions that need it. I should have just added them to every binary.
We don't add new programs often, but it's still less automatic than
the function pointer way, so I'm open to other ideas.
Attached v7 to fix CI failures.
--
John Naylor
Amazon Web Services
Attachments:
v7-0002-Add-a-Postgres-SQL-function-for-crc32c-benchmarki.patch (text/x-patch; charset=US-ASCII)
From 3db619000b87aeec3ee7a3c3387f7f64b06b9980 Mon Sep 17 00:00:00 2001
From: Paul Amonson <paul.d.amonson@intel.com>
Date: Mon, 6 May 2024 08:34:17 -0700
Subject: [PATCH v7 2/3] Add a Postgres SQL function for crc32c benchmarking
Add a drive_crc32c() function to use for benchmarking crc32c
computation. The function takes 2 arguments:
(1) count: number of times the CRC32C is computed in a loop.
(2) num: number of bytes in the buffer to calculate the CRC over.
XXX not for commit
Extracted from a patch by Raghuveer Devulapalli
---
contrib/meson.build | 1 +
contrib/test_crc32c/Makefile | 20 +++++++
contrib/test_crc32c/expected/test_crc32c.out | 57 ++++++++++++++++++++
contrib/test_crc32c/meson.build | 34 ++++++++++++
contrib/test_crc32c/sql/test_crc32c.sql | 3 ++
contrib/test_crc32c/test_crc32c--1.0.sql | 1 +
contrib/test_crc32c/test_crc32c.c | 47 ++++++++++++++++
contrib/test_crc32c/test_crc32c.control | 4 ++
8 files changed, 167 insertions(+)
create mode 100644 contrib/test_crc32c/Makefile
create mode 100644 contrib/test_crc32c/expected/test_crc32c.out
create mode 100644 contrib/test_crc32c/meson.build
create mode 100644 contrib/test_crc32c/sql/test_crc32c.sql
create mode 100644 contrib/test_crc32c/test_crc32c--1.0.sql
create mode 100644 contrib/test_crc32c/test_crc32c.c
create mode 100644 contrib/test_crc32c/test_crc32c.control
diff --git a/contrib/meson.build b/contrib/meson.build
index 1ba73ebd67a..06673db0625 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('test_crc32c')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/test_crc32c/Makefile b/contrib/test_crc32c/Makefile
new file mode 100644
index 00000000000..5b747c6184a
--- /dev/null
+++ b/contrib/test_crc32c/Makefile
@@ -0,0 +1,20 @@
+MODULE_big = test_crc32c
+OBJS = test_crc32c.o
+PGFILEDESC = "test"
+EXTENSION = test_crc32c
+DATA = test_crc32c--1.0.sql
+
+first: all
+
+# test_crc32c.o: CFLAGS+=-g
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/test_crc32c
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/test_crc32c/expected/test_crc32c.out b/contrib/test_crc32c/expected/test_crc32c.out
new file mode 100644
index 00000000000..dff6bb3133b
--- /dev/null
+++ b/contrib/test_crc32c/expected/test_crc32c.out
@@ -0,0 +1,57 @@
+CREATE EXTENSION test_crc32c;
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
+ drive_crc32c
+--------------
+ 532139994
+ 2103623867
+ 785984197
+ 2686825890
+ 3213049059
+ 3819630168
+ 1389234603
+ 534072900
+ 2930108140
+ 2496889855
+ 1475239611
+ 136366931
+ 3067402116
+ 2012717871
+ 3682416023
+ 2054270645
+ 1817339875
+ 4100939569
+ 1192727539
+ 3636976218
+ 369764421
+ 3161609879
+ 1067984880
+ 1235066769
+ 3138425899
+ 648132037
+ 4203750233
+ 1330187888
+ 2683521348
+ 1951644495
+ 2574090107
+ 3904902018
+ 3772697795
+ 1644686344
+ 2868962106
+ 3369218491
+ 3902689890
+ 3456411865
+ 141004025
+ 1504497996
+ 3782655204
+ 3544797610
+ 3429174879
+ 2524728016
+ 3935861181
+ 25498897
+ 692684159
+ 345705535
+ 2761600287
+ 2654632420
+ 3945991399
+(51 rows)
+
diff --git a/contrib/test_crc32c/meson.build b/contrib/test_crc32c/meson.build
new file mode 100644
index 00000000000..d7bec4ba1cb
--- /dev/null
+++ b/contrib/test_crc32c/meson.build
@@ -0,0 +1,34 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+test_crc32c_sources = files(
+ 'test_crc32c.c',
+)
+
+if host_system == 'windows'
+ test_crc32c_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_crc32c',
+ '--FILEDESC', 'test_crc32c - test code for crc32c library',])
+endif
+
+test_crc32c = shared_module('test_crc32c',
+ test_crc32c_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += test_crc32c
+
+install_data(
+ 'test_crc32c.control',
+ 'test_crc32c--1.0.sql',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'test_crc32c',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'test_crc32c',
+ ],
+ },
+}
diff --git a/contrib/test_crc32c/sql/test_crc32c.sql b/contrib/test_crc32c/sql/test_crc32c.sql
new file mode 100644
index 00000000000..95c6dfe4488
--- /dev/null
+++ b/contrib/test_crc32c/sql/test_crc32c.sql
@@ -0,0 +1,3 @@
+CREATE EXTENSION test_crc32c;
+
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
diff --git a/contrib/test_crc32c/test_crc32c--1.0.sql b/contrib/test_crc32c/test_crc32c--1.0.sql
new file mode 100644
index 00000000000..52b9772f908
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c--1.0.sql
@@ -0,0 +1 @@
+CREATE FUNCTION drive_crc32c (count int, num int) RETURNS bigint AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
new file mode 100644
index 00000000000..28bc42de314
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -0,0 +1,47 @@
+/* select drive_crc32c(1000000, 1024); */
+
+#include "postgres.h"
+#include "fmgr.h"
+#include "port/pg_crc32c.h"
+#include "common/pg_prng.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * drive_crc32c(count: int, num: int) returns bigint
+ *
+ * count is the number of loops to perform
+ *
+ * num is the number of bytes in the buffer to calculate
+ * crc32c over.
+ */
+PG_FUNCTION_INFO_V1(drive_crc32c);
+Datum
+drive_crc32c(PG_FUNCTION_ARGS)
+{
+ int64 count = PG_GETARG_INT32(0);
+ int64 num = PG_GETARG_INT32(1);
+ char* data = malloc((size_t)num);
+ pg_crc32c crc;
+ pg_prng_state state;
+ uint64 seed = 42;
+ pg_prng_seed(&state, seed);
+ /* set random data */
+ for (uint64 i = 0; i < num; i++)
+ {
+ data[i] = pg_prng_uint32(&state) % 255;
+ }
+
+ INIT_CRC32C(crc);
+
+ while(count--)
+ {
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, data, num);
+ FIN_CRC32C(crc);
+ }
+
+ free((void *)data);
+
+ PG_RETURN_INT64((int64_t)crc);
+}
diff --git a/contrib/test_crc32c/test_crc32c.control b/contrib/test_crc32c/test_crc32c.control
new file mode 100644
index 00000000000..878a077ee18
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.control
@@ -0,0 +1,4 @@
+comment = 'test'
+default_version = '1.0'
+module_pathname = '$libdir/test_crc32c'
+relocatable = true
--
2.48.1
v7-0003-Improve-CRC32C-performance-on-x86_64.patch (text/x-patch; charset=US-ASCII)
From 309d122c770c967445bc9a0f5527ab6794bd5ba3 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 15:27:16 +0700
Subject: [PATCH v7 3/3] Improve CRC32C performance on x86_64
The current SSE4.2 implementation of CRC32C relies on the native
CRC32 instruction, which operates on 8 bytes at a time. We can get a
substantial speedup on longer inputs by using carryless multiplication
on SIMD registers, processing 64 bytes per loop iteration.
The PCLMULQDQ instruction has been widely available since 2011 (almost
as old as SSE 4.2), so this commit now requires that, as well as SSE
4.2, to build pg_crc32c_sse42.c.
The MIT-licensed implementation was generated with the "generate"
program from
https://github.com/corsix/fast-crc32/
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009
Author: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Author: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
---
src/include/port/pg_cpu.h | 3 +-
src/include/port/pg_crc32c.h | 1 +
src/port/pg_cpu.c | 3 +
src/port/pg_crc32c_sse42.c | 101 ++++++++++++++++++++++++++
src/port/pg_crc32c_sse42_choose.c | 16 ++++
src/test/regress/expected/strings.out | 24 ++++++
src/test/regress/sql/strings.sql | 4 +
7 files changed, 151 insertions(+), 1 deletion(-)
diff --git a/src/include/port/pg_cpu.h b/src/include/port/pg_cpu.h
index 45ce9d3c505..445f0031eb2 100644
--- a/src/include/port/pg_cpu.h
+++ b/src/include/port/pg_cpu.h
@@ -16,8 +16,9 @@
#define PGCPUCAP_INIT (1 << 0)
#define PGCPUCAP_CRC32C (1 << 1)
+#define PGCPUCAP_CLMUL (1 << 2)
-extern uint32 pg_cpucap;
+extern PGDLLIMPORT uint32 pg_cpucap;
extern void pg_cpucap_initialize(void);
#endif /* PG_CPU_H */
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index db155d690e3..068a6536059 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -58,6 +58,7 @@ extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len)
#endif
extern bool pg_crc32c_sse42_available(void);
+extern bool pg_crc32c_pclmul_available(void);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
diff --git a/src/port/pg_cpu.c b/src/port/pg_cpu.c
index c9483357436..52944b2d4ed 100644
--- a/src/port/pg_cpu.c
+++ b/src/port/pg_cpu.c
@@ -31,6 +31,9 @@ pg_cpucap_crc32c(void)
if (pg_crc32c_sse42_available())
pg_cpucap |= PGCPUCAP_CRC32C;
+ if (pg_crc32c_pclmul_available())
+ pg_cpucap |= PGCPUCAP_CLMUL;
+
#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
if (pg_crc32c_armv8_available())
pg_cpucap |= PGCPUCAP_CRC32C;
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df31..ce2acf1eec1 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,9 +15,94 @@
#include "c.h"
#include <nmmintrin.h>
+#include <wmmintrin.h>
#include "port/pg_crc32c.h"
+/* WIP: configure checks */
+#ifdef __x86_64__
+#define HAVE_PCLMUL_RUNTIME
+#endif
+
+ /*
+ * WIP: Testing has shown that on Kaby Lake (2016) this algorithm needs two
+ * iterations of the main loop to be faster than using regular CRC
+ * instructions, but Tiger Lake (2020) is fine with a single iteration. Could
+ * use more testing between those years (on AMD as well).
+ */
+#define PCLMUL_THRESHOLD 128
+
+#ifdef HAVE_PCLMUL_RUNTIME
+
+/* Generated by https://github.com/corsix/fast-crc32/ using: */
+/* ./generate -i sse -p crc32c -a v4 */
+/* MIT licensed */
+
+#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
+#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
+
+pg_attribute_target("sse4.2,pclmul")
+static pg_crc32c
+pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
+{
+ /* adjust names to match generated code */
+ pg_crc32c crc0 = crc;
+ size_t len = length;
+ const unsigned char *buf = data;
+
+ if (len >= 64)
+ {
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i *) buf),
+ y0;
+ __m128i x1 = _mm_loadu_si128((const __m128i *) (buf + 16)),
+ y1;
+ __m128i x2 = _mm_loadu_si128((const __m128i *) (buf + 32)),
+ y2;
+ __m128i x3 = _mm_loadu_si128((const __m128i *) (buf + 48)),
+ y3;
+ __m128i k;
+
+ k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 64;
+ len -= 64;
+
+ /* Main loop. */
+ while (len >= 64)
+ {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i *) buf)), x0 = _mm_xor_si128(x0, y0);
+ y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i *) (buf + 16))), x1 = _mm_xor_si128(x1, y1);
+ y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i *) (buf + 32))), x2 = _mm_xor_si128(x2, y2);
+ y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i *) (buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ buf += 64;
+ len -= 64;
+ }
+
+ /* Reduce x0 ... x3 to just x0. */
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
+ y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
+ k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ }
+
+ return crc0;
+}
+
+#endif
+
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
pg_crc32c
@@ -26,6 +111,19 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
const unsigned char *p = data;
const unsigned char *pend = p + len;
+ /* XXX not for commit */
+ const pg_crc32c orig_crc PG_USED_FOR_ASSERTS_ONLY = crc;
+ const size_t orig_len PG_USED_FOR_ASSERTS_ONLY = len;
+
+#ifdef HAVE_PCLMUL_RUNTIME
+ if (len >= PCLMUL_THRESHOLD && (pg_cpucap & PGCPUCAP_CLMUL))
+ {
+ crc = pg_comp_crc32c_pclmul(crc, data, len);
+ len %= 64;
+ p = pend - len;
+ }
+#endif
+
/*
* Process eight bytes of data at a time.
*
@@ -66,5 +164,8 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
p++;
}
+ /* XXX not for commit */
+ Assert(crc == pg_comp_crc32c_sb8(orig_crc, data, orig_len));
+
return crc;
}
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index f4d3215bc55..59a5a31c006 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -40,3 +40,19 @@ pg_crc32c_sse42_available(void)
return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
+
+bool
+pg_crc32c_pclmul_available(void)
+{
+ unsigned int exx[4] = {0, 0, 0, 0};
+
+#if defined(HAVE__GET_CPUID)
+ __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+ __cpuid(exx, 1);
+#else
+#error cpuid instruction not available
+#endif
+
+ return (exx[2] & (1 << 1)) != 0; /* PCLMUL */
+}
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index b65bb2d5368..662bd37ace6 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -2282,6 +2282,30 @@ SELECT crc32c('The quick brown fox jumps over the lazy dog.');
419469235
(1 row)
+SELECT crc32c(repeat('A', 80)::bytea);
+ crc32c
+------------
+ 3799127650
+(1 row)
+
+SELECT crc32c(repeat('A', 127)::bytea);
+ crc32c
+-----------
+ 291820082
+(1 row)
+
+SELECT crc32c(repeat('A', 128)::bytea);
+ crc32c
+-----------
+ 816091258
+(1 row)
+
+SELECT crc32c(repeat('A', 129)::bytea);
+ crc32c
+------------
+ 4213642571
+(1 row)
+
--
-- encode/decode
--
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index 8e0f3a0e75f..26f86dc92e0 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -727,6 +727,10 @@ SELECT crc32('The quick brown fox jumps over the lazy dog.');
SELECT crc32c('');
SELECT crc32c('The quick brown fox jumps over the lazy dog.');
+SELECT crc32c(repeat('A', 80)::bytea);
+SELECT crc32c(repeat('A', 127)::bytea);
+SELECT crc32c(repeat('A', 128)::bytea);
+SELECT crc32c(repeat('A', 129)::bytea);
--
-- encode/decode
--
2.48.1
v7-0001-Dispatch-CRC-computation-by-branching-rather-than.patch (text/x-patch; charset=US-ASCII)
From 1337d15dc73b247d8c35f2d8e22400d0c292ccc0 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Sat, 15 Feb 2025 19:18:16 +0700
Subject: [PATCH v7 1/3] Dispatch CRC computation by branching rather than
indirect calls
---
configure | 2 +-
configure.ac | 2 +-
src/backend/postmaster/postmaster.c | 4 ++
src/bin/pg_basebackup/pg_basebackup.c | 3 +
src/bin/pg_basebackup/pg_createsubscriber.c | 3 +
src/bin/pg_checksums/pg_checksums.c | 3 +
src/bin/pg_combinebackup/pg_combinebackup.c | 3 +
src/bin/pg_controldata/pg_controldata.c | 3 +
src/bin/pg_ctl/pg_ctl.c | 3 +
src/bin/pg_resetwal/pg_resetwal.c | 3 +
src/bin/pg_rewind/pg_rewind.c | 3 +
src/bin/pg_verifybackup/pg_verifybackup.c | 3 +
src/bin/pg_waldump/pg_waldump.c | 3 +
src/bin/pg_walsummary/pg_walsummary.c | 4 ++
src/include/port/pg_cpu.h | 23 ++++++
src/include/port/pg_crc32c.h | 78 +++++++++++++++------
src/port/Makefile | 1 +
src/port/meson.build | 4 ++
src/port/pg_cpu.c | 54 ++++++++++++++
src/port/pg_crc32c_armv8_choose.c | 26 +------
src/port/pg_crc32c_sse42_choose.c | 26 +------
21 files changed, 182 insertions(+), 72 deletions(-)
create mode 100644 src/include/port/pg_cpu.h
create mode 100644 src/port/pg_cpu.c
diff --git a/configure b/configure
index 0ffcaeb4367..41aad7b4d75 100755
--- a/configure
+++ b/configure
@@ -17352,7 +17352,7 @@ if test x"$USE_SSE42_CRC32C" = x"1"; then
$as_echo "#define USE_SSE42_CRC32C 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2" >&5
$as_echo "SSE 4.2" >&6; }
else
diff --git a/configure.ac b/configure.ac
index f56681e0d91..efa8249360a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2110,7 +2110,7 @@ fi
AC_MSG_CHECKING([which CRC-32C implementation to use])
if test x"$USE_SSE42_CRC32C" = x"1"; then
AC_DEFINE(USE_SSE42_CRC32C, 1, [Define to 1 use Intel SSE 4.2 CRC instructions.])
- PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
AC_MSG_RESULT(SSE 4.2)
else
if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index bb22b13adef..c218f15f97e 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -99,6 +99,7 @@
#include "pg_getopt.h"
#include "pgstat.h"
#include "port/pg_bswap.h"
+#include "port/pg_cpu.h"
#include "postmaster/autovacuum.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/pgarch.h"
@@ -1951,6 +1952,9 @@ InitProcessGlobals(void)
#ifndef WIN32
srandom(pg_prng_uint32(&pg_global_prng_state));
#endif
+
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
}
/*
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index dc0c805137a..8d4b3718b6e 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -2405,6 +2405,9 @@ main(int argc, char **argv)
progname = get_progname(argv[0]);
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_basebackup"));
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
if (argc > 1)
{
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index 2d881d54f5b..04e550ef755 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1906,6 +1906,9 @@ main(int argc, char **argv)
progname = get_progname(argv[0]);
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_basebackup"));
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
if (argc > 1)
{
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index e1acb6e933d..eb88aeedb53 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -453,6 +453,9 @@ main(int argc, char *argv[])
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_checksums"));
progname = get_progname(argv[0]);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
if (argc > 1)
{
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
index 5864ec574fb..ee24dba2315 100644
--- a/src/bin/pg_combinebackup/pg_combinebackup.c
+++ b/src/bin/pg_combinebackup/pg_combinebackup.c
@@ -166,6 +166,9 @@ main(int argc, char *argv[])
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_combinebackup"));
handle_help_version_opts(argc, argv, progname, help);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
memset(&opt, 0, sizeof(opt));
opt.manifest_checksums = CHECKSUM_TYPE_CRC32C;
opt.sync_method = DATA_DIR_SYNC_METHOD_FSYNC;
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index cf11ab3f2ee..deb6b16ae65 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -112,6 +112,9 @@ main(int argc, char *argv[])
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_controldata"));
progname = get_progname(argv[0]);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
if (argc > 1)
{
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 8a405ff122c..7dc4da932e5 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -2226,6 +2226,9 @@ main(int argc, char **argv)
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_ctl"));
start_time = time(NULL);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
/*
* save argv[0] so do_start() can look for the postmaster if necessary. we
* don't look for postmaster here because in many cases we won't need it.
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index ed73607a46f..52bcaadf69f 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -123,6 +123,9 @@ main(int argc, char *argv[])
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_resetwal"));
progname = get_progname(argv[0]);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
if (argc > 1)
{
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index cae81cd6cb1..f6c755883c6 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -158,6 +158,9 @@ main(int argc, char **argv)
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_rewind"));
progname = get_progname(argv[0]);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
/* Process command-line arguments */
if (argc > 1)
{
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 7c720ab98bd..d44a87e83ad 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -144,6 +144,9 @@ main(int argc, char **argv)
memset(&context, 0, sizeof(context));
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
if (argc > 1)
{
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 51fb76efc48..10c529a5fac 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -835,6 +835,9 @@ main(int argc, char **argv)
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_waldump"));
progname = get_progname(argv[0]);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
if (argc > 1)
{
if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
diff --git a/src/bin/pg_walsummary/pg_walsummary.c b/src/bin/pg_walsummary/pg_walsummary.c
index cd7a6b147b2..a38565ea6d8 100644
--- a/src/bin/pg_walsummary/pg_walsummary.c
+++ b/src/bin/pg_walsummary/pg_walsummary.c
@@ -20,6 +20,7 @@
#include "common/logging.h"
#include "fe_utils/option_utils.h"
#include "getopt_long.h"
+#include "port/pg_cpu.h"
typedef struct ws_options
{
@@ -69,6 +70,9 @@ main(int argc, char *argv[])
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_walsummary"));
handle_help_version_opts(argc, argv, progname, help);
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
+
/* process command-line options */
while ((c = getopt_long(argc, argv, "iq",
long_options, &optindex)) != -1)
diff --git a/src/include/port/pg_cpu.h b/src/include/port/pg_cpu.h
new file mode 100644
index 00000000000..45ce9d3c505
--- /dev/null
+++ b/src/include/port/pg_cpu.h
@@ -0,0 +1,23 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_cpu.h
+ * Runtime detection of CPU capabilities.
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * src/include/port/pg_cpu.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_CPU_H
+#define PG_CPU_H
+
+#define PGCPUCAP_INIT (1 << 0)
+#define PGCPUCAP_CRC32C (1 << 1)
+
+extern uint32 pg_cpucap;
+extern void pg_cpucap_initialize(void);
+
+#endif /* PG_CPU_H */
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 65ebeacf4b1..db155d690e3 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -34,6 +34,7 @@
#define PG_CRC32C_H
#include "port/pg_bswap.h"
+#include "port/pg_cpu.h"
typedef uint32 pg_crc32c;
@@ -41,52 +42,55 @@ typedef uint32 pg_crc32c;
#define INIT_CRC32C(crc) ((crc) = 0xFFFFFFFF)
#define EQ_CRC32C(c1, c2) ((c1) == (c2))
-#if defined(USE_SSE42_CRC32C)
+#if defined(USE_SSE42_CRC32C) || defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
/* Use Intel SSE4.2 instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_sse42((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#if defined(USE_SSE42_CRC32C)
+#define HAVE_CRC_COMPTIME
+#else
+#define HAVE_CRC_RUNTIME
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif
+
+extern bool pg_crc32c_sse42_available(void);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_ARMV8_CRC32C)
+#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
/* Use ARMv8 CRC Extension instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_armv8((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#if defined(USE_ARMV8_CRC32C)
+#define HAVE_CRC_COMPTIME
+#else
+#define HAVE_CRC_RUNTIME
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif
+
+extern bool pg_crc32c_armv8_available(void);
extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
#elif defined(USE_LOONGARCH_CRC32C)
/* Use LoongArch CRCC instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_loongarch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#define HAVE_CRC_COMPTIME
extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
-
-/*
- * Use Intel SSE 4.2 or ARMv8 instructions, but perform a runtime check first
- * to check that they are available.
- */
-#define COMP_CRC32C(crc, data, len) \
- ((crc) = pg_comp_crc32c((crc), (data), (len)))
-#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
-
-extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
-extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
-
-#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#endif
-#ifdef USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
-#endif
-
#else
/*
* Use slicing-by-8 algorithm.
@@ -105,6 +109,36 @@ extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t le
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif /* end of CPU-specific symbols */
+
+#if defined(HAVE_CRC_COMPTIME) || defined(HAVE_CRC_RUNTIME)
+/*
+ * Check if the CPU we're running on supports special
+ * instructions for CRC-32C computation. Otherwise, fall
+ * back to the pure software implementation (slicing-by-8).
+ */
+static inline pg_crc32c
+pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
+{
+ /*
+ * If this is firing in a frontend program, first look if you forgot a
+ * call to pg_cpucap_initialize() in main(). See for example
+ * src/bin/pg_controldata/pg_controldata.c.
+ */
+ Assert(pg_cpucap & PGCPUCAP_INIT);
+
+ {
+#if defined(HAVE_CRC_COMPTIME)
+ Assert(pg_cpucap & PGCPUCAP_CRC32C);
+ return COMP_CRC32C_HW(crc, data, len);
+#else
+ if (pg_cpucap & PGCPUCAP_CRC32C)
+ return COMP_CRC32C_HW(crc, data, len);
+ else
+ return pg_comp_crc32c_sb8(crc, data, len);
#endif
+ }
+}
+#endif /* HAVE_CRC_COMPTIME || HAVE_CRC_RUNTIME */
#endif /* PG_CRC32C_H */
diff --git a/src/port/Makefile b/src/port/Makefile
index 4c224319512..2ac79ecb0fd 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -44,6 +44,7 @@ OBJS = \
noblock.o \
path.o \
pg_bitutils.o \
+ pg_cpu.o \
pg_popcount_avx512.o \
pg_strong_random.o \
pgcheckdir.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index 7fcfa728d43..02ae2067604 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -7,6 +7,7 @@ pgport_sources = [
'noblock.c',
'path.c',
'pg_bitutils.c',
+ 'pg_cpu.c',
'pg_popcount_avx512.c',
'pg_strong_random.c',
'pgcheckdir.c',
@@ -83,12 +84,15 @@ replace_funcs_pos = [
# x86/x64
['pg_crc32c_sse42', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
+ # WIP sometime we'll need to build these based on host_cpu
+ ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
# arm / aarch64
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C'],
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 'crc'],
+ ['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C'],
['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
diff --git a/src/port/pg_cpu.c b/src/port/pg_cpu.c
new file mode 100644
index 00000000000..c9483357436
--- /dev/null
+++ b/src/port/pg_cpu.c
@@ -0,0 +1,54 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_cpu.c
+ * Runtime detection of CPU capabilities.
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * src/port/pg_cpu.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "c.h"
+
+#include "port/pg_cpu.h"
+#include "port/pg_crc32c.h"
+
+
+/* starts uninitialized so we can detect errors of omission */
+uint32 pg_cpucap = 0;
+
+/*
+ * Check if hardware instructions for CRC computation are available.
+ */
+static void
+pg_cpucap_crc32c(void)
+{
+ /* WIP: It seems like we should use CPU arch symbols instead */
+#if defined(USE_SSE42_CRC32C) || defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
+ if (pg_crc32c_sse42_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+
+#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
+ if (pg_crc32c_armv8_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+
+#elif defined(USE_LOONGARCH_CRC32C)
+ pg_cpucap |= PGCPUCAP_CRC32C;
+#endif
+}
+
+/*
+ * This needs to be called in main() for every
+ * program that calls a function that dispatches
+ * according to CPU features.
+ */
+void
+pg_cpucap_initialize(void)
+{
+ pg_cpucap = PGCPUCAP_INIT;
+
+ pg_cpucap_crc32c();
+}
diff --git a/src/port/pg_crc32c_armv8_choose.c b/src/port/pg_crc32c_armv8_choose.c
index ec12be1bbc3..e3654427c3f 100644
--- a/src/port/pg_crc32c_armv8_choose.c
+++ b/src/port/pg_crc32c_armv8_choose.c
@@ -1,12 +1,7 @@
/*-------------------------------------------------------------------------
*
* pg_crc32c_armv8_choose.c
- * Choose between ARMv8 and software CRC-32C implementation.
- *
- * On first call, checks if the CPU we're running on supports the ARMv8
- * CRC Extension. If it does, use the special instructions for CRC-32C
- * computation. Otherwise, fall back to the pure software implementation
- * (slicing-by-8).
+ * Check if the CPU we're running on supports the ARMv8 CRC Extension.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -40,7 +35,7 @@
#include "port/pg_crc32c.h"
-static bool
+bool
pg_crc32c_armv8_available(void)
{
#if defined(HAVE_ELF_AUX_INFO)
@@ -106,20 +101,3 @@ pg_crc32c_armv8_available(void)
return false;
#endif
}
-
-/*
- * This gets called on the first call. It replaces the function pointer
- * so that subsequent calls are routed directly to the chosen implementation.
- */
-static pg_crc32c
-pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
-{
- if (pg_crc32c_armv8_available())
- pg_comp_crc32c = pg_comp_crc32c_armv8;
- else
- pg_comp_crc32c = pg_comp_crc32c_sb8;
-
- return pg_comp_crc32c(crc, data, len);
-}
-
-pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len) = pg_comp_crc32c_choose;
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 65dbc4d4249..f4d3215bc55 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -1,12 +1,7 @@
/*-------------------------------------------------------------------------
*
* pg_crc32c_sse42_choose.c
- * Choose between Intel SSE 4.2 and software CRC-32C implementation.
- *
- * On first call, checks if the CPU we're running on supports Intel SSE
- * 4.2. If it does, use the special SSE instructions for CRC-32C
- * computation. Otherwise, fall back to the pure software implementation
- * (slicing-by-8).
+ * Check if the CPU we're running on supports SSE4.2.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -30,7 +25,7 @@
#include "port/pg_crc32c.h"
-static bool
+bool
pg_crc32c_sse42_available(void)
{
unsigned int exx[4] = {0, 0, 0, 0};
@@ -45,20 +40,3 @@ pg_crc32c_sse42_available(void)
return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
-
-/*
- * This gets called on the first call. It replaces the function pointer
- * so that subsequent calls are routed directly to the chosen implementation.
- */
-static pg_crc32c
-pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
-{
- if (pg_crc32c_sse42_available())
- pg_comp_crc32c = pg_comp_crc32c_sse42;
- else
- pg_comp_crc32c = pg_comp_crc32c_sb8;
-
- return pg_comp_crc32c(crc, data, len);
-}
-
-pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len) = pg_comp_crc32c_choose;
--
2.48.1
Hi John,
I added dispatch logic for the new pclmul version on top of your v5 (which seems outdated now and so I will refrain from posting a patch here). Could you take a look at the approach here to see if this makes sense? The logic in the pg_comp_crc32c_choose function is the main change and seems a lot cleaner to me.
See https://github.com/r-devulap/postgressql/pull/5/commits/cf80f7375f24d2fb5cd29e95dc73d4988fc6d066/
Raghuveer
-----Original Message-----
From: John Naylor <johncnaylorls@gmail.com>
Sent: Monday, February 17, 2025 10:40 PM
To: Nathan Bossart <nathandbossart@gmail.com>
Cc: Devulapalli, Raghuveer <raghuveer.devulapalli@intel.com>; pgsql-hackers@lists.postgresql.org; Shankaran, Akash <akash.shankaran@intel.com>
Subject: Re: Improve CRC32C performance on SSE4.2

On Tue, Feb 18, 2025 at 12:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote:

On Mon, Feb 17, 2025 at 05:58:01PM +0700, John Naylor wrote:
I tried using branching for the runtime check, and this looks like
the way to go:
- Existing -msse4.2 builders will still call directly, but inside
the function there is a length check and only for long input will it
do a runtime check for pclmul.
- This smooths the way for -msse4.2 (and the equivalent on Arm) to
inline calls with short constant input (e.g. WAL insert lock),
although I've not done that here.
- This can be a simple starting point for consolidating runtime
checks, as was proposed for popcount in the AVX-512 CRC thread, but
with branching my model was Andres' sketch here:
/messages/by-id/20240731023918.ixsfbeuub6e76one%40awork3.anarazel.de

Oh, nifty. IIUC this should help avoid quite a bit of overhead even
before adding the PCLMUL stuff since it removes the function pointers
for the runtime-check builds.

I figured the same, but it doesn't seem to help on the microbenchmark.
(I also tried the pg_waldump benchmark you used, but I couldn't get it working so
I'm probably missing a step.)

master:
20
latency average = 3.958 ms
latency average = 3.860 ms
latency average = 3.857 ms
32
latency average = 5.921 ms
latency average = 5.166 ms
latency average = 5.128 ms
48
latency average = 6.384 ms
latency average = 6.022 ms
latency average = 5.991 ms

v6:
20
latency average = 4.084 ms
latency average = 3.879 ms
latency average = 3.896 ms
32
latency average = 5.533 ms
latency average = 5.536 ms
latency average = 5.573 ms
48
latency average = 6.201 ms
latency average = 6.205 ms
latency average = 6.111 ms

While this needn't block this patch set, I do find the dispatch code
to be pretty complicated. Maybe we can improve that in the future by
using macros to auto-generate much of it. My concern here is less
about this particular patch set and more about the long term
maintainability as we add more and more stuff like it, each with its
own tangled web of build and dispatch rules.

I think the runtime dispatch is fairly simple for this case. To my taste, the worse
maintainability hazard is the building. To make that better, I'd do this:
- Rename the CRC choose*.c files to pg_cpucap_{x86,arm}.c and build them
unconditionally for each platform
- Initialize the runtime info by CPU platform and not other symbols where possible
(I guess anything needing AVX-512 will still be a mess)
- Put all hardware CRC .c files into a single file pg_crc32c_hw.c.
Define PG_CRC_HW8/4/2/1 macros in pg_crc32c.h that tell which intrinsic to use
by platform and have a separate pg_crc32c_hw_impl.h header that has the
simple loop with these macros. That header would for now only be included into
pg_crc32c_hw.c.

The first two could be done as part of this patch or soon after. The third is a bit
more invasive, but seems like a necessary prerequisite for inlining on small
constant input, to keep that from being a mess.

There's another potential irritation for maintenance too: I spent too much time
only adding pg_cpucap_initialize() to frontend main() functions that need it. I
should have just added them to every binary.
We don't add new programs often, but it's still less automatic than the function
pointer way, so I'm open to other ideas.

Attached v7 to fix CI failures.
--
John Naylor
Amazon Web Services
Hi John,
Just a few comments on v7:
pg_cpucap_crc32c
I like the idea of moving all CPU runtime detection to a single module outside of implementations. This makes it easy to extend in the future and keeps the functionalities separate.
- Rename the CRC choose*.c files to pg_cpucap_{x86,arm}.c and build them
unconditionally for each platform
+1. Sounds perfect. We should also move the avx512 runtime detection of popcount here. The runtime detection code could also be appended with function __attribute__((constructor)) so that it gets executed before main.
I think the runtime dispatch is fairly simple for this case.
+ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
+ {
+ ....
+ #ifdef HAVE_PCLMUL_RUNTIME
+ if (len >= PCLMUL_THRESHOLD && (pg_cpucap & PGCPUCAP_CLMUL))
IMO, the pclmul and sse4.2 versions should be dispatched independently and the dispatch logic should be moved out of pg_crc32c.h to its own pg_crc32c_dispatch.c file.
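For readers following along, the length-gated branch quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from v7: every `clmul_demo_*` / `CLMUL_DEMO_*` name is hypothetical, the threshold value of 64 is made up, and both "paths" are stand-ins that call the same bitwise CRC-32C routine so the sketch can run anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits, loosely modelled on the pg_cpucap idea. */
#define CLMUL_DEMO_CAP_INIT		(1u << 0)
#define CLMUL_DEMO_CAP_CLMUL	(1u << 1)
/* Hypothetical cutoff; the real threshold is a tuning question. */
#define CLMUL_DEMO_THRESHOLD	64

/* Assumption: pretend detection already ran and found carry-less multiply. */
static uint32_t clmul_demo_cpucap = CLMUL_DEMO_CAP_INIT | CLMUL_DEMO_CAP_CLMUL;
static int	clmul_demo_last_path;	/* 0 = short path, 1 = long path */

/* Portable bitwise CRC-32C (reflected polynomial 0x82F63B78). */
static uint32_t
clmul_demo_crc_bitwise(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
	{
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Stand-in for the crc32-instruction loop. */
static uint32_t
clmul_demo_crc_short(uint32_t crc, const void *data, size_t len)
{
	clmul_demo_last_path = 0;
	return clmul_demo_crc_bitwise(crc, data, len);
}

/* Stand-in for the pclmul-based loop. */
static uint32_t
clmul_demo_crc_long(uint32_t crc, const void *data, size_t len)
{
	clmul_demo_last_path = 1;
	return clmul_demo_crc_bitwise(crc, data, len);
}

/* Only long inputs pay for the capability check. */
static uint32_t
clmul_demo_comp_crc32c(uint32_t crc, const void *data, size_t len)
{
	assert(clmul_demo_cpucap & CLMUL_DEMO_CAP_INIT);

	if (len >= CLMUL_DEMO_THRESHOLD &&
		(clmul_demo_cpucap & CLMUL_DEMO_CAP_CLMUL))
		return clmul_demo_crc_long(crc, data, len);

	return clmul_demo_crc_short(crc, data, len);
}
```

In this shape short inputs never test the capability bit at all, which is the property that keeps inlining for small constant lengths possible. Moving the branch into a separate dispatch file would give that up.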
Also, thank you for taking the lead on developing this patch. Since the latest patch seems to be in good shape, I'm happy to provide feedback and review your work, or can continue development work based on any feedback. Please let me know which you'd prefer.
Raghuveer
On Wed, Feb 19, 2025 at 1:28 AM Devulapalli, Raghuveer
<raghuveer.devulapalli@intel.com> wrote:
The runtime detection code could also be appended with function __attribute__((constructor)) so that it gets executed before main.
Hmm! I didn't know about that, thanks! It works on old gcc/clang, so
that's good. I can't verify online if Arm etc. executes properly since
the execute button is greyed out, but I don't see why it wouldn't:
https://godbolt.org/z/3as9MGr3G
Then we could have:
#ifdef FRONTEND
pg_attribute_constructor()
#endif
void
pg_cpucap_initialize(void)
{
...
}
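A minimal, self-contained sketch of the constructor idea follows, assuming gcc or clang. The `ctor_demo_*` names are hypothetical and the probe is a stand-in that always reports the feature as present; a real version would run CPUID or read the auxiliary vector.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bits; they mirror, but are not, the patch's pg_cpucap symbols. */
#define CTOR_DEMO_CAP_INIT		(1u << 0)
#define CTOR_DEMO_CAP_CRC32C	(1u << 1)

/* Starts at zero so a missed initialization is detectable. */
static uint32_t ctor_demo_cpucap = 0;

/* Stand-in for a real hardware probe; always answers "available". */
static int
ctor_demo_crc32c_available(void)
{
	return 1;
}

/*
 * With gcc/clang the constructor attribute makes this run at load time,
 * before main().  Other compilers would need an explicit call instead.
 */
#if defined(__GNUC__)
__attribute__((constructor))
#endif
static void
ctor_demo_cpucap_initialize(void)
{
	ctor_demo_cpucap = CTOR_DEMO_CAP_INIT;
	if (ctor_demo_crc32c_available())
		ctor_demo_cpucap |= CTOR_DEMO_CAP_CRC32C;
}

/* Callers can assert that initialization has already happened. */
static int
ctor_demo_have_crc32c(void)
{
	assert(ctor_demo_cpucap & CTOR_DEMO_CAP_INIT);
	return (ctor_demo_cpucap & CTOR_DEMO_CAP_CRC32C) != 0;
}
```

On a compiler without the attribute, nothing calls the initializer automatically, so the assertion in `ctor_demo_have_crc32c()` would fire unless main() calls it by hand.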
Now, there is no equivalent on MSVC and workarounds are fragile [1].
Maybe we could only assert initialization happened for the backend and
for frontend either
- add a couple manual initializations to the frontend programs
where we don't want to lose performance for non-gcc/clang.
- require CRC on x86-64 MSVC since Windows 10 is EOL soon, going by
Thomas M.'s earlier findings on popcount (also SSE4.2) [2]
The first is less risky but less tidy.
I think the runtime dispatch is fairly simple for this case.
+ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
+ {
+ ....
+ #ifdef HAVE_PCLMUL_RUNTIME
+ if (len >= PCLMUL_THRESHOLD && (pg_cpucap & PGCPUCAP_CLMUL))
IMO, the pclmul and sse4.2 versions should be dispatched independently and the dispatch logic should be moved out of pg_crc32c.h to its own pg_crc32c_dispatch.c file.
That makes sense, but it does close the door on future inlining.
Also, thank you for taking the lead on developing this patch. Since the latest patch seems to be in good shape, I'm happy to provide feedback and review your work, or can continue development work based on any feedback. Please let me know which you'd prefer.
Sure. Depending on any feedback on the above constructor technique,
I'll work on the following, since I've prototyped most of the first
and the second is mostly copy-and-paste from your earlier work:
pg_cpucap_crc32c
I like the idea of moving all CPU runtime detection to a single module outside of implementations. This makes it easy to extend in the future and keeps the functionalities separate.
- Rename the CRC choose*.c files to pg_cpucap_{x86,arm}.c and build them
unconditionally for each platform
+1. Sounds perfect. We should also move the avx512 runtime detection of popcount here.
[1]: https://stackoverflow.com/questions/1113409/attribute-constructor-equivalent-in-vc
[2]: /messages/by-id/CA+hUKGKS64zJezV9y9mPcB-J0i+fLGiv3FAdwSH_3SCaVdrjyQ@mail.gmail.com
--
John Naylor
Amazon Web Services
Now, there is no equivalent on MSVC and workarounds are fragile [1].
Maybe we could only assert initialization happened for the backend and for
frontend either
- add a couple manual initializations to the frontend programs where we don't
want to lose performance for non-gcc/clang.
- require CRC on x86-64 MSVC since Windows 10 is EOL soon, going by Thomas
M.'s earlier findings on popcount (also SSE4.2) [2]
The first is less risky but less tidy.
Agree, let me think about this but not sure if I have any useful suggestions here. MSVC is unfortunately not my strong suit :/
Raghuveer
On Fri, Feb 21, 2025 at 1:24 AM Devulapalli, Raghuveer
<raghuveer.devulapalli@intel.com> wrote:
Now, there is no equivalent on MSVC and workarounds are fragile [1].
Maybe we could only assert initialization happened for the backend and for
frontend either
- add a couple manual initializations to the frontend programs where we don't
want to lose performance for non-gcc/clang.
- require CRC on x86-64 MSVC since Windows 10 is EOL soon, going by Thomas
M.'s earlier findings on popcount (also SSE4.2) [2]
The first is less risky but less tidy.
Agree, let me think about this but not sure if I have any useful suggestions here. MSVC is unfortunately not my strong suit :/
Here's another idea to make it more automatic: Give up on initializing
every capability at once. The first time we call CRC, it will be
uninitialized, so this part:
if (pg_cpucap & PGCPUCAP_CRC32C)
return COMP_CRC32C_HW(crc, data, len);
else
return pg_comp_crc32c_sb8(crc, data, len);
...will call the SB8 path. Inside there, do the check:
#if defined(HAVE_CRC_RUNTIME)
// separate init bit for each capability
if (unlikely((pg_cpucap & PGCPUCAP_CRC32C_INIT) == 0))
{
pg_cpucap_crc32c(); // also sets PGCPUCAP_CRC32C_INIT
if (pg_cpucap & PGCPUCAP_CRC32C)
return COMP_CRC32C_HW(crc, data, len);
}
#endif
// ...fallthrough to SB8
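Fleshed out as a runnable sketch, the lazy per-capability check might look like the following. All `lazy_demo_*` / `LAZY_DEMO_*` names are hypothetical, the probe simply pretends the hardware is present, and both the "hardware" and "SB8" paths are stand-ins built on one bitwise CRC-32C routine. Note that the mask test is parenthesized, since `==` binds more tightly than `&` in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bits: one "probed" bit plus one "available" bit. */
#define LAZY_DEMO_CAP_CRC32C_INIT	(1u << 0)
#define LAZY_DEMO_CAP_CRC32C		(1u << 1)

static uint32_t lazy_demo_cpucap = 0;
static int	lazy_demo_probe_calls = 0;	/* counts how often the probe ran */

/* Stand-in probe: records that it ran and pretends the feature exists. */
static void
lazy_demo_probe_crc32c(void)
{
	lazy_demo_probe_calls++;
	lazy_demo_cpucap |= LAZY_DEMO_CAP_CRC32C_INIT | LAZY_DEMO_CAP_CRC32C;
}

/* Portable bitwise CRC-32C (reflected polynomial 0x82F63B78). */
static uint32_t
lazy_demo_crc_bitwise(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
	{
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Stand-in for the hardware-instruction implementation. */
static uint32_t
lazy_demo_crc_hw(uint32_t crc, const void *data, size_t len)
{
	return lazy_demo_crc_bitwise(crc, data, len);
}

/* Software path; the first call through here triggers the probe. */
static uint32_t
lazy_demo_crc_sw(uint32_t crc, const void *data, size_t len)
{
	if ((lazy_demo_cpucap & LAZY_DEMO_CAP_CRC32C_INIT) == 0)
	{
		lazy_demo_probe_crc32c();
		if (lazy_demo_cpucap & LAZY_DEMO_CAP_CRC32C)
			return lazy_demo_crc_hw(crc, data, len);
	}
	return lazy_demo_crc_bitwise(crc, data, len);
}

/* Dispatch: an unprobed capability word falls through to the software path. */
static uint32_t
lazy_demo_comp_crc32c(uint32_t crc, const void *data, size_t len)
{
	if (lazy_demo_cpucap & LAZY_DEMO_CAP_CRC32C)
		return lazy_demo_crc_hw(crc, data, len);
	return lazy_demo_crc_sw(crc, data, len);
}
```

The probe runs once, on the first call, and no main() has to be touched. The cost is that every capability needs its own init bit and its own probe entry point.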
--
John Naylor
Amazon Web Services
Here's another idea to make it more automatic: Give up on initializing every
capability at once.
I'm not sure I like giving this up. Initializing and running the CPUID check with the attribute constructor is very valuable for two reasons: (1) you get everything done at load time before main and (2) you don’t have to run the cpuid check for every feature (popcount, crc32c, or anything else you add in the future) multiple times. It keeps the cpuid functionality in a central place, which makes for a modular design.
On MSVC, we could have the first SIMD feature call pg_cpucap_initialize(), which runs CPUID and stores the CPU features. Any subsequent call can skip it (because it has already been initialized) by using a static variable or some other approach. Does this make sense?
Raghuveer
On Tue, Feb 25, 2025 at 3:17 AM Devulapalli, Raghuveer
<raghuveer.devulapalli@intel.com> wrote:
Here's another idea to make it more automatic: Give up on initializing every
capability at once.

I'm not sure I like giving this up. Initializing and running the CPUID check with the attribute constructor is very valuable for two reasons: (1) you get everything done at load time before main and (2) you don’t have to run the cpuid check for every feature (popcount, crc32c, or anything else you add in the future) multiple times. It keeps the cpuid functionality in a central place, which makes for a modular design.
I agree it would be preferable to make a centralized check work.
On MSVC, we could have the first SIMD feature call pg_cpucap_initialize(), which runs CPUID and stores the CPU features. Any subsequent call can skip it (because it has already been initialized) by using a static variable or some other approach. Does this make sense?
Correct me if I'm misunderstanding, but this sounds like in every
frontend program we'd need to know what the first call was, which
seems less maintainable than just initializing at the start of every
frontend program.
--
John Naylor
Amazon Web Services
On Tue, Feb 18, 2025 at 1:40 PM John Naylor <johncnaylorls@gmail.com> wrote:
On Tue, Feb 18, 2025 at 12:41 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
While this needn't block this patch set, I do find the dispatch code to be
pretty complicated. Maybe we can improve that in the future by using
macros to auto-generate much of it. My concern here is less about this
particular patch set and more about the long term maintainability as we add
more and more stuff like it, each with its own tangled web of build and
dispatch rules.
I had a further thought on this: CRC and non-vector popcount are kind
of special in that recent OSes assume they exist, and it's worth a bit
of effort to take advantage of that. Other things we may add should be
kept as simple as possible.
- Rename the CRC choose*.c files to pg_cpucap_{x86,arm}.c and build
them unconditionally for each platform
- Initialize the runtime info by CPU platform and not other symbols
where possible (I guess anything needing AVX-512 will still be a mess)
I've made a start of this for v8:
0001 is mostly the same as before
0002 (Meson-only for now) changes 0001 per the above, to see how it
looks, but I've not tried to add popcount or anything else. I like it
overall, but some details may need tweaking.
0004 generates the pclmul loop slightly differently to simplify
integrating with our code, but shouldn't make a big difference
Another thing I found in Agner's manuals: AMD Zen, even as recently as
Zen 4, don't have as good a microarchitecture for PCLMUL, so if anyone
with such a machine would like to help test the cutoff, the script is
at
/messages/by-id/CANWCAZahvhE-+htZiUyzPiS5e45ukx5877mD-dHr-KSX6LcdjQ@mail.gmail.com
(needs "CREATE EXTENSION test_crc32c;" to run it)
--
John Naylor
Amazon Web Services
Attachments:
v8-0001-Dispatch-CRC-computation-by-branching-rather-than.patch (text/x-patch; charset=US-ASCII)
From d704f3f76ba555e0c0ad8c3cfc2d953ea4baa162 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Sat, 15 Feb 2025 19:18:16 +0700
Subject: [PATCH v8 1/4] Dispatch CRC computation by branching rather than
indirect calls
---
src/backend/postmaster/postmaster.c | 4 ++
src/include/port/pg_cpucap.h | 25 +++++++++
src/include/port/pg_crc32c.h | 78 +++++++++++++++++++++--------
src/port/Makefile | 1 +
src/port/meson.build | 4 ++
src/port/pg_cpucap.c | 51 +++++++++++++++++++
src/port/pg_crc32c_armv8_choose.c | 26 +---------
src/port/pg_crc32c_sse42_choose.c | 26 +---------
8 files changed, 145 insertions(+), 70 deletions(-)
create mode 100644 src/include/port/pg_cpucap.h
create mode 100644 src/port/pg_cpucap.c
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 5dd3b6a4fd4..43e35f8041f 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -99,6 +99,7 @@
#include "pg_getopt.h"
#include "pgstat.h"
#include "port/pg_bswap.h"
+#include "port/pg_cpucap.h"
#include "postmaster/autovacuum.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/pgarch.h"
@@ -1951,6 +1952,9 @@ InitProcessGlobals(void)
#ifndef WIN32
srandom(pg_prng_uint32(&pg_global_prng_state));
#endif
+
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
}
/*
diff --git a/src/include/port/pg_cpucap.h b/src/include/port/pg_cpucap.h
new file mode 100644
index 00000000000..81edfedce5d
--- /dev/null
+++ b/src/include/port/pg_cpucap.h
@@ -0,0 +1,25 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_cpucap.h
+ * Runtime detection of CPU capabilities.
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * src/include/port/pg_cpucap.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_CPUCAP_H
+#define PG_CPUCAP_H
+
+#define PGCPUCAP_INIT (1 << 0)
+#define PGCPUCAP_POPCNT (1 << 1)
+#define PGCPUCAP_VPOPCNT (1 << 2)
+#define PGCPUCAP_CRC32C (1 << 3)
+
+extern PGDLLIMPORT uint32 pg_cpucap;
+extern void pg_cpucap_initialize(void);
+
+#endif /* PG_CPUCAP_H */
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 65ebeacf4b1..b565a0f2949 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -34,6 +34,7 @@
#define PG_CRC32C_H
#include "port/pg_bswap.h"
+#include "port/pg_cpucap.h"
typedef uint32 pg_crc32c;
@@ -41,52 +42,55 @@ typedef uint32 pg_crc32c;
#define INIT_CRC32C(crc) ((crc) = 0xFFFFFFFF)
#define EQ_CRC32C(c1, c2) ((c1) == (c2))
-#if defined(USE_SSE42_CRC32C)
+#if defined(USE_SSE42_CRC32C) || defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
/* Use Intel SSE4.2 instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_sse42((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#if defined(USE_SSE42_CRC32C)
+#define HAVE_CRC_COMPTIME
+#else
+#define HAVE_CRC_RUNTIME
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif
+
+extern bool pg_crc32c_sse42_available(void);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_ARMV8_CRC32C)
+#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
/* Use ARMv8 CRC Extension instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_armv8((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#if defined(USE_ARMV8_CRC32C)
+#define HAVE_CRC_COMPTIME
+#else
+#define HAVE_CRC_RUNTIME
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif
+
+extern bool pg_crc32c_armv8_available(void);
extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
#elif defined(USE_LOONGARCH_CRC32C)
/* Use LoongArch CRCC instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_loongarch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#define HAVE_CRC_COMPTIME
extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
-
-/*
- * Use Intel SSE 4.2 or ARMv8 instructions, but perform a runtime check first
- * to check that they are available.
- */
-#define COMP_CRC32C(crc, data, len) \
- ((crc) = pg_comp_crc32c((crc), (data), (len)))
-#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
-
-extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
-extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
-
-#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#endif
-#ifdef USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
-#endif
-
#else
/*
* Use slicing-by-8 algorithm.
@@ -105,6 +109,36 @@ extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t le
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif /* end of CPU-specific symbols */
+
+#if defined(HAVE_CRC_COMPTIME) || defined(HAVE_CRC_RUNTIME)
+/*
+ * Check if the CPU we're running on supports special
+ * instructions for CRC-32C computation. Otherwise, fall
+ * back to the pure software implementation (slicing-by-8).
+ */
+static inline pg_crc32c
+pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
+{
+ /*
+ * If this is firing in a frontend program, first look if you forgot a
+ * call to pg_cpucap_initialize() in main(). See for example
+ * src/bin/pg_controldata/pg_controldata.c.
+ */
+ // WIP: how to best initialize in frontend?
+#ifndef FRONTEND
+ Assert(pg_cpucap & PGCPUCAP_INIT);
+#endif
+
+#if defined(HAVE_CRC_COMPTIME)
+ return COMP_CRC32C_HW(crc, data, len);
+#else
+ if (pg_cpucap & PGCPUCAP_CRC32C)
+ return COMP_CRC32C_HW(crc, data, len);
+ else
+ return pg_comp_crc32c_sb8(crc, data, len);
#endif
+}
+#endif /* HAVE_CRC_COMPTIME || HAVE_CRC_RUNTIME */
#endif /* PG_CRC32C_H */
diff --git a/src/port/Makefile b/src/port/Makefile
index 4c224319512..5a05179e926 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -44,6 +44,7 @@ OBJS = \
noblock.o \
path.o \
pg_bitutils.o \
+ pg_cpucap.o \
pg_popcount_avx512.o \
pg_strong_random.o \
pgcheckdir.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index 7fcfa728d43..e1e7ce8fb87 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -7,6 +7,7 @@ pgport_sources = [
'noblock.c',
'path.c',
'pg_bitutils.c',
+ 'pg_cpucap.c',
'pg_popcount_avx512.c',
'pg_strong_random.c',
'pgcheckdir.c',
@@ -83,12 +84,15 @@ replace_funcs_pos = [
# x86/x64
['pg_crc32c_sse42', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
+ # WIP sometime we'll need to build these based on host_cpu
+ ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
# arm / aarch64
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C'],
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 'crc'],
+ ['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C'],
['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
diff --git a/src/port/pg_cpucap.c b/src/port/pg_cpucap.c
new file mode 100644
index 00000000000..eba6e31c63f
--- /dev/null
+++ b/src/port/pg_cpucap.c
@@ -0,0 +1,51 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_cpucap.c
+ * Runtime detection of CPU capabilities.
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * src/port/pg_cpucap.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "c.h"
+
+#include "port/pg_cpucap.h"
+#include "port/pg_crc32c.h"
+
+
+/* starts uninitialized so we can detect errors of omission */
+uint32 pg_cpucap = 0;
+
+/*
+ * Check if hardware instructions for CRC computation are available.
+ */
+static void
+pg_cpucap_crc32c(void)
+{
+ /* WIP: It seems like we should use CPU arch symbols instead */
+#if defined(USE_SSE42_CRC32C) || defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
+ if (pg_crc32c_sse42_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+
+#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
+ if (pg_crc32c_armv8_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+#endif
+}
+
+/*
+ * This needs to be called in main() for every
+ * program that calls a function that dispatches
+ * according to CPU features.
+ */
+void
+pg_cpucap_initialize(void)
+{
+ pg_cpucap = PGCPUCAP_INIT;
+
+ pg_cpucap_crc32c();
+}
diff --git a/src/port/pg_crc32c_armv8_choose.c b/src/port/pg_crc32c_armv8_choose.c
index ec12be1bbc3..e3654427c3f 100644
--- a/src/port/pg_crc32c_armv8_choose.c
+++ b/src/port/pg_crc32c_armv8_choose.c
@@ -1,12 +1,7 @@
/*-------------------------------------------------------------------------
*
* pg_crc32c_armv8_choose.c
- * Choose between ARMv8 and software CRC-32C implementation.
- *
- * On first call, checks if the CPU we're running on supports the ARMv8
- * CRC Extension. If it does, use the special instructions for CRC-32C
- * computation. Otherwise, fall back to the pure software implementation
- * (slicing-by-8).
+ * Check if the CPU we're running on supports the ARMv8 CRC Extension.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -40,7 +35,7 @@
#include "port/pg_crc32c.h"
-static bool
+bool
pg_crc32c_armv8_available(void)
{
#if defined(HAVE_ELF_AUX_INFO)
@@ -106,20 +101,3 @@ pg_crc32c_armv8_available(void)
return false;
#endif
}
-
-/*
- * This gets called on the first call. It replaces the function pointer
- * so that subsequent calls are routed directly to the chosen implementation.
- */
-static pg_crc32c
-pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
-{
- if (pg_crc32c_armv8_available())
- pg_comp_crc32c = pg_comp_crc32c_armv8;
- else
- pg_comp_crc32c = pg_comp_crc32c_sb8;
-
- return pg_comp_crc32c(crc, data, len);
-}
-
-pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len) = pg_comp_crc32c_choose;
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 65dbc4d4249..f4d3215bc55 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -1,12 +1,7 @@
/*-------------------------------------------------------------------------
*
* pg_crc32c_sse42_choose.c
- * Choose between Intel SSE 4.2 and software CRC-32C implementation.
- *
- * On first call, checks if the CPU we're running on supports Intel SSE
- * 4.2. If it does, use the special SSE instructions for CRC-32C
- * computation. Otherwise, fall back to the pure software implementation
- * (slicing-by-8).
+ * Check if the CPU we're running on supports SSE4.2.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -30,7 +25,7 @@
#include "port/pg_crc32c.h"
-static bool
+bool
pg_crc32c_sse42_available(void)
{
unsigned int exx[4] = {0, 0, 0, 0};
@@ -45,20 +40,3 @@ pg_crc32c_sse42_available(void)
return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
-
-/*
- * This gets called on the first call. It replaces the function pointer
- * so that subsequent calls are routed directly to the chosen implementation.
- */
-static pg_crc32c
-pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
-{
- if (pg_crc32c_sse42_available())
- pg_comp_crc32c = pg_comp_crc32c_sse42;
- else
- pg_comp_crc32c = pg_comp_crc32c_sb8;
-
- return pg_comp_crc32c(crc, data, len);
-}
-
-pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len) = pg_comp_crc32c_choose;
--
2.48.1
v8-0003-Add-a-Postgres-SQL-function-for-crc32c-benchmarki.patch (text/x-patch; charset=US-ASCII)
From 49d1cec52cf2f167e20d916330bb7d37b2f8af34 Mon Sep 17 00:00:00 2001
From: Paul Amonson <paul.d.amonson@intel.com>
Date: Mon, 6 May 2024 08:34:17 -0700
Subject: [PATCH v8 3/4] Add a Postgres SQL function for crc32c benchmarking
Add a drive_crc32c() function to use for benchmarking crc32c
computation. The function takes 2 arguments:
(1) count: num of times CRC32C is computed in a loop.
(2) num: #bytes in the buffer to calculate crc over.
XXX not for commit
Extracted from a patch by Raghuveer Devulapalli
---
contrib/meson.build | 1 +
contrib/test_crc32c/Makefile | 20 +++++++
contrib/test_crc32c/expected/test_crc32c.out | 57 ++++++++++++++++++++
contrib/test_crc32c/meson.build | 34 ++++++++++++
contrib/test_crc32c/sql/test_crc32c.sql | 3 ++
contrib/test_crc32c/test_crc32c--1.0.sql | 1 +
contrib/test_crc32c/test_crc32c.c | 47 ++++++++++++++++
contrib/test_crc32c/test_crc32c.control | 4 ++
8 files changed, 167 insertions(+)
create mode 100644 contrib/test_crc32c/Makefile
create mode 100644 contrib/test_crc32c/expected/test_crc32c.out
create mode 100644 contrib/test_crc32c/meson.build
create mode 100644 contrib/test_crc32c/sql/test_crc32c.sql
create mode 100644 contrib/test_crc32c/test_crc32c--1.0.sql
create mode 100644 contrib/test_crc32c/test_crc32c.c
create mode 100644 contrib/test_crc32c/test_crc32c.control
diff --git a/contrib/meson.build b/contrib/meson.build
index 1ba73ebd67a..06673db0625 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('test_crc32c')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/test_crc32c/Makefile b/contrib/test_crc32c/Makefile
new file mode 100644
index 00000000000..5b747c6184a
--- /dev/null
+++ b/contrib/test_crc32c/Makefile
@@ -0,0 +1,20 @@
+MODULE_big = test_crc32c
+OBJS = test_crc32c.o
+PGFILEDESC = "test"
+EXTENSION = test_crc32c
+DATA = test_crc32c--1.0.sql
+
+first: all
+
+# test_crc32c.o: CFLAGS+=-g
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/test_crc32c
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/test_crc32c/expected/test_crc32c.out b/contrib/test_crc32c/expected/test_crc32c.out
new file mode 100644
index 00000000000..dff6bb3133b
--- /dev/null
+++ b/contrib/test_crc32c/expected/test_crc32c.out
@@ -0,0 +1,57 @@
+CREATE EXTENSION test_crc32c;
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
+ drive_crc32c
+--------------
+ 532139994
+ 2103623867
+ 785984197
+ 2686825890
+ 3213049059
+ 3819630168
+ 1389234603
+ 534072900
+ 2930108140
+ 2496889855
+ 1475239611
+ 136366931
+ 3067402116
+ 2012717871
+ 3682416023
+ 2054270645
+ 1817339875
+ 4100939569
+ 1192727539
+ 3636976218
+ 369764421
+ 3161609879
+ 1067984880
+ 1235066769
+ 3138425899
+ 648132037
+ 4203750233
+ 1330187888
+ 2683521348
+ 1951644495
+ 2574090107
+ 3904902018
+ 3772697795
+ 1644686344
+ 2868962106
+ 3369218491
+ 3902689890
+ 3456411865
+ 141004025
+ 1504497996
+ 3782655204
+ 3544797610
+ 3429174879
+ 2524728016
+ 3935861181
+ 25498897
+ 692684159
+ 345705535
+ 2761600287
+ 2654632420
+ 3945991399
+(51 rows)
+
diff --git a/contrib/test_crc32c/meson.build b/contrib/test_crc32c/meson.build
new file mode 100644
index 00000000000..d7bec4ba1cb
--- /dev/null
+++ b/contrib/test_crc32c/meson.build
@@ -0,0 +1,34 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+test_crc32c_sources = files(
+ 'test_crc32c.c',
+)
+
+if host_system == 'windows'
+ test_crc32c_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_crc32c',
+ '--FILEDESC', 'test_crc32c - test code for crc32c library',])
+endif
+
+test_crc32c = shared_module('test_crc32c',
+ test_crc32c_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += test_crc32c
+
+install_data(
+ 'test_crc32c.control',
+ 'test_crc32c--1.0.sql',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'test_crc32c',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'test_crc32c',
+ ],
+ },
+}
diff --git a/contrib/test_crc32c/sql/test_crc32c.sql b/contrib/test_crc32c/sql/test_crc32c.sql
new file mode 100644
index 00000000000..95c6dfe4488
--- /dev/null
+++ b/contrib/test_crc32c/sql/test_crc32c.sql
@@ -0,0 +1,3 @@
+CREATE EXTENSION test_crc32c;
+
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
diff --git a/contrib/test_crc32c/test_crc32c--1.0.sql b/contrib/test_crc32c/test_crc32c--1.0.sql
new file mode 100644
index 00000000000..52b9772f908
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c--1.0.sql
@@ -0,0 +1 @@
+CREATE FUNCTION drive_crc32c (count int, num int) RETURNS bigint AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
new file mode 100644
index 00000000000..28bc42de314
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -0,0 +1,47 @@
+/* select drive_crc32c(1000000, 1024); */
+
+#include "postgres.h"
+#include "fmgr.h"
+#include "port/pg_crc32c.h"
+#include "common/pg_prng.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * drive_crc32c(count: int, num: int) returns bigint
+ *
+ * count is the number of loops to perform
+ *
+ * num is the number of bytes in the buffer to calculate
+ * crc32c over.
+ */
+PG_FUNCTION_INFO_V1(drive_crc32c);
+Datum
+drive_crc32c(PG_FUNCTION_ARGS)
+{
+ int64 count = PG_GETARG_INT32(0);
+ int64 num = PG_GETARG_INT32(1);
+ char* data = malloc((size_t)num);
+ pg_crc32c crc;
+ pg_prng_state state;
+ uint64 seed = 42;
+ pg_prng_seed(&state, seed);
+ /* set random data */
+ for (uint64 i = 0; i < num; i++)
+ {
+ data[i] = pg_prng_uint32(&state) % 255;
+ }
+
+ INIT_CRC32C(crc);
+
+ while(count--)
+ {
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, data, num);
+ FIN_CRC32C(crc);
+ }
+
+ free((void *)data);
+
+ PG_RETURN_INT64((int64_t)crc);
+}
diff --git a/contrib/test_crc32c/test_crc32c.control b/contrib/test_crc32c/test_crc32c.control
new file mode 100644
index 00000000000..878a077ee18
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.control
@@ -0,0 +1,4 @@
+comment = 'test'
+default_version = '1.0'
+module_pathname = '$libdir/test_crc32c'
+relocatable = true
--
2.48.1
v8-0004-Improve-CRC32C-performance-on-x86_64.patch (text/x-patch; charset=US-ASCII)
From d5ff7ff575cb9b005ff559903a0e1ffe0e023cf4 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 15:27:16 +0700
Subject: [PATCH v8 4/4] Improve CRC32C performance on x86_64
The current SSE4.2 implementation of CRC32C relies on the native
CRC32 instruction, which operates on 8 bytes at a time. We can get a
substantial speedup on longer inputs by using carryless multiplication
on SIMD registers, processing 64 bytes per loop iteration.
The PCLMULQDQ instruction has been widely available since 2011 (almost
as old as SSE 4.2), so this commit now requires that, as well as SSE
4.2, to build pg_crc32c_sse42.c.
The MIT-licensed implementation was generated with the "generate"
program from
https://github.com/corsix/fast-crc32/
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009
Author: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Author: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
---
src/include/port/pg_cpucap.h | 2 +
src/port/pg_cpucap.c | 1 +
src/port/pg_cpucap_arm.c | 6 ++
src/port/pg_cpucap_x86.c | 23 +++++
src/port/pg_crc32c_sse42.c | 123 ++++++++++++++++++++++++++
src/test/regress/expected/strings.out | 24 +++++
src/test/regress/sql/strings.sql | 4 +
7 files changed, 183 insertions(+)
diff --git a/src/include/port/pg_cpucap.h b/src/include/port/pg_cpucap.h
index 5e04213b211..af3fabfcffb 100644
--- a/src/include/port/pg_cpucap.h
+++ b/src/include/port/pg_cpucap.h
@@ -18,11 +18,13 @@
#define PGCPUCAP_POPCNT (1 << 1)
#define PGCPUCAP_VPOPCNT (1 << 2)
#define PGCPUCAP_CRC32C (1 << 3)
+#define PGCPUCAP_CLMUL (1 << 4)
extern PGDLLIMPORT uint32 pg_cpucap;
extern void pg_cpucap_initialize(void);
/* arch-specific functions private to src/port */
extern void pg_cpucap_crc32c(void);
+extern void pg_cpucap_clmul(void);
#endif /* PG_CPUCAP_H */
diff --git a/src/port/pg_cpucap.c b/src/port/pg_cpucap.c
index 88d75827022..301bd9fc2c7 100644
--- a/src/port/pg_cpucap.c
+++ b/src/port/pg_cpucap.c
@@ -30,4 +30,5 @@ pg_cpucap_initialize(void)
pg_cpucap = PGCPUCAP_INIT;
pg_cpucap_crc32c();
+ pg_cpucap_clmul();
}
diff --git a/src/port/pg_cpucap_arm.c b/src/port/pg_cpucap_arm.c
index 19e052fecf6..e080a5a931f 100644
--- a/src/port/pg_cpucap_arm.c
+++ b/src/port/pg_cpucap_arm.c
@@ -111,3 +111,9 @@ pg_cpucap_crc32c(void)
if (pg_crc32c_armv8_available())
pg_cpucap |= PGCPUCAP_CRC32C;
}
+
+void
+pg_cpucap_clmul(void)
+{
+ // WIP: does this even make sense?
+}
diff --git a/src/port/pg_cpucap_x86.c b/src/port/pg_cpucap_x86.c
index 07462bd1d2a..3a62a3a582f 100644
--- a/src/port/pg_cpucap_x86.c
+++ b/src/port/pg_cpucap_x86.c
@@ -41,6 +41,22 @@ pg_sse42_available(void)
return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
+static bool
+pg_pclmul_available(void)
+{
+ unsigned int exx[4] = {0, 0, 0, 0};
+
+#if defined(HAVE__GET_CPUID)
+ __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+ __cpuid(exx, 1);
+#else
+#error cpuid instruction not available
+#endif
+
+ return (exx[2] & (1 << 1)) != 0; /* PCLMUL */
+}
+
/*
* Check if hardware instructions for CRC computation are available.
*/
@@ -50,3 +66,10 @@ pg_cpucap_crc32c(void)
if (pg_sse42_available())
pg_cpucap |= PGCPUCAP_CRC32C;
}
+
+void
+pg_cpucap_clmul(void)
+{
+ if (pg_pclmul_available())
+ pg_cpucap |= PGCPUCAP_CLMUL;
+}
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df31..fc3cf0d0882 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,9 +15,118 @@
#include "c.h"
#include <nmmintrin.h>
+#include <wmmintrin.h>
#include "port/pg_crc32c.h"
+/* WIP: configure checks */
+#ifdef __x86_64__
+#define HAVE_PCLMUL_RUNTIME
+#endif
+
+ /*
+ * WIP: Testing has shown that on Kaby Lake (2016) this algorithm needs two
+ * iterations of the main loop to be faster than using regular CRC
+ * instructions, but Tiger Lake (2020) is fine with a single iteration. Could
+ * use more testing between those years (on AMD as well).
+ */
+#define PCLMUL_THRESHOLD 128
+
+#ifdef HAVE_PCLMUL_RUNTIME
+
+/* Generated by https://github.com/corsix/fast-crc32/ using: */
+/* ./generate -i sse -p crc32c -a v4e */
+/* MIT licensed */
+
+#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
+#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
+
+pg_attribute_target("sse4.2,pclmul")
+static pg_crc32c
+pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
+{
+ /* adjust names to match generated code */
+ pg_crc32c crc0 = crc;
+ size_t len = length;
+ const char *buf = data;
+
+ // This prolog is trying to avoid loads straddling
+ // cache lines, but it doesn't seem worth it if
+ // we're trying to be fast on small inputs as well
+#if 0
+ for (; len && ((uintptr_t) buf & 7); --len)
+ {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+ if (((uintptr_t) buf & 8) && len >= 8)
+ {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t *) buf);
+ buf += 8;
+ len -= 8;
+ }
+#endif
+ if (len >= 64)
+ {
+ const char *end = buf + len;
+ const char *limit = buf + len - 64;
+
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i *) buf),
+ y0;
+ __m128i x1 = _mm_loadu_si128((const __m128i *) (buf + 16)),
+ y1;
+ __m128i x2 = _mm_loadu_si128((const __m128i *) (buf + 32)),
+ y2;
+ __m128i x3 = _mm_loadu_si128((const __m128i *) (buf + 48)),
+ y3;
+ __m128i k;
+
+ k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 64;
+ /* Main loop. */
+ while (buf <= limit)
+ {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i *) buf)), x0 = _mm_xor_si128(x0, y0);
+ y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i *) (buf + 16))), x1 = _mm_xor_si128(x1, y1);
+ y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i *) (buf + 32))), x2 = _mm_xor_si128(x2, y2);
+ y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i *) (buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ buf += 64;
+ }
+
+ /* Reduce x0 ... x3 to just x0. */
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
+ y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
+ k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ len = end - buf;
+ }
+ for (; len >= 8; buf += 8, len -= 8)
+ {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t *) buf);
+ }
+ for (; len; --len)
+ {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+
+ return crc0;
+}
+
+#endif
+
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
pg_crc32c
@@ -26,6 +135,17 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
const unsigned char *p = data;
const unsigned char *pend = p + len;
+ /* XXX not for commit */
+ const pg_crc32c orig_crc PG_USED_FOR_ASSERTS_ONLY = crc;
+ const size_t orig_len PG_USED_FOR_ASSERTS_ONLY = len;
+
+#ifdef HAVE_PCLMUL_RUNTIME
+ if (len >= PCLMUL_THRESHOLD && (pg_cpucap & PGCPUCAP_CLMUL))
+ {
+ return pg_comp_crc32c_pclmul(crc, data, len);
+ }
+#endif
+
/*
* Process eight bytes of data at a time.
*
@@ -66,5 +186,8 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
p++;
}
+ /* XXX not for commit */
+ Assert(crc == pg_comp_crc32c_sb8(orig_crc, data, orig_len));
+
return crc;
}
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index b65bb2d5368..662bd37ace6 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -2282,6 +2282,30 @@ SELECT crc32c('The quick brown fox jumps over the lazy dog.');
419469235
(1 row)
+SELECT crc32c(repeat('A', 80)::bytea);
+ crc32c
+------------
+ 3799127650
+(1 row)
+
+SELECT crc32c(repeat('A', 127)::bytea);
+ crc32c
+-----------
+ 291820082
+(1 row)
+
+SELECT crc32c(repeat('A', 128)::bytea);
+ crc32c
+-----------
+ 816091258
+(1 row)
+
+SELECT crc32c(repeat('A', 129)::bytea);
+ crc32c
+------------
+ 4213642571
+(1 row)
+
--
-- encode/decode
--
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index 8e0f3a0e75f..26f86dc92e0 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -727,6 +727,10 @@ SELECT crc32('The quick brown fox jumps over the lazy dog.');
SELECT crc32c('');
SELECT crc32c('The quick brown fox jumps over the lazy dog.');
+SELECT crc32c(repeat('A', 80)::bytea);
+SELECT crc32c(repeat('A', 127)::bytea);
+SELECT crc32c(repeat('A', 128)::bytea);
+SELECT crc32c(repeat('A', 129)::bytea);
--
-- encode/decode
--
2.48.1
v8-0002-Rename-CRC-choose-files-to-cpucap-files.patch (text/x-patch; charset=US-ASCII)
From 6ab9d5854cdd0247507d2f20cb8a21783547a798 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Tue, 25 Feb 2025 13:59:21 +0700
Subject: [PATCH v8 2/4] Rename CRC *choose files to cpucap* files
On Meson, build them unconditionally on the relevant arch.
FIXME autoconf builds are broken
---
configure | 4 ++--
configure.ac | 4 ++--
src/include/port/pg_cpucap.h | 3 +++
src/include/port/pg_crc32c.h | 2 --
src/port/Makefile | 2 ++
src/port/meson.build | 17 +++++++++++-----
src/port/pg_cpucap.c | 18 -----------------
..._crc32c_armv8_choose.c => pg_cpucap_arm.c} | 16 ++++++++++++---
..._crc32c_sse42_choose.c => pg_cpucap_x86.c} | 20 ++++++++++++++-----
9 files changed, 49 insertions(+), 37 deletions(-)
rename src/port/{pg_crc32c_armv8_choose.c => pg_cpucap_arm.c} (92%)
rename src/port/{pg_crc32c_sse42_choose.c => pg_cpucap_x86.c} (73%)
diff --git a/configure b/configure
index 93fddd69981..5e686793c16 100755
--- a/configure
+++ b/configure
@@ -17692,7 +17692,7 @@ else
$as_echo "#define USE_SSE42_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_sse42_choose.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2 with runtime check" >&5
$as_echo "SSE 4.2 with runtime check" >&6; }
else
@@ -17708,7 +17708,7 @@ $as_echo "ARMv8 CRC instructions" >&6; }
$as_echo "#define USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o pg_crc32c_armv8_choose.o"
+ PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: ARMv8 CRC instructions with runtime check" >&5
$as_echo "ARMv8 CRC instructions with runtime check" >&6; }
else
diff --git a/configure.ac b/configure.ac
index b6d02f5ecc7..056b406f117 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2156,7 +2156,7 @@ if test x"$USE_SSE42_CRC32C" = x"1"; then
else
if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
AC_DEFINE(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use Intel SSE 4.2 CRC instructions with a runtime check.])
- PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_sse42_choose.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o
AC_MSG_RESULT(SSE 4.2 with runtime check)
else
if test x"$USE_ARMV8_CRC32C" = x"1"; then
@@ -2166,7 +2166,7 @@ else
else
if test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
AC_DEFINE(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use ARMv8 CRC Extension with a runtime check.])
- PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o pg_crc32c_armv8_choose.o"
+ PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o
AC_MSG_RESULT(ARMv8 CRC instructions with runtime check)
else
if test x"$USE_LOONGARCH_CRC32C" = x"1"; then
diff --git a/src/include/port/pg_cpucap.h b/src/include/port/pg_cpucap.h
index 81edfedce5d..5e04213b211 100644
--- a/src/include/port/pg_cpucap.h
+++ b/src/include/port/pg_cpucap.h
@@ -22,4 +22,7 @@
extern PGDLLIMPORT uint32 pg_cpucap;
extern void pg_cpucap_initialize(void);
+/* arch-specific functions private to src/port */
+extern void pg_cpucap_crc32c(void);
+
#endif /* PG_CPUCAP_H */
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index b565a0f2949..4f0ebb9923c 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -57,7 +57,6 @@ typedef uint32 pg_crc32c;
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
#endif
-extern bool pg_crc32c_sse42_available(void);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
@@ -76,7 +75,6 @@ extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t le
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
#endif
-extern bool pg_crc32c_armv8_available(void);
extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
#elif defined(USE_LOONGARCH_CRC32C)
diff --git a/src/port/Makefile b/src/port/Makefile
index 5a05179e926..1fc03713b31 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -45,6 +45,8 @@ OBJS = \
path.o \
pg_bitutils.o \
pg_cpucap.o \
+ pg_cpucap_x86.o \
+ pg_cpucap_arm.o \
pg_popcount_avx512.o \
pg_strong_random.o \
pgcheckdir.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index e1e7ce8fb87..baa8e16200d 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -78,22 +78,29 @@ if host_system != 'windows'
replace_funcs_neg += [['pthread_barrier_wait']]
endif
+# arch-specific runtime checks
+if host_cpu == 'x86' or host_cpu == 'x86_64'
+ pgport_sources += files(
+ 'pg_cpucap_x86.c'
+ )
+
+elif host_cpu == 'arm' or host_cpu == 'aarch64'
+ pgport_sources += files(
+ 'pg_cpucap_arm.c'
+ )
+endif
+
# Replacement functionality to be built if corresponding configure symbol
# is true
replace_funcs_pos = [
# x86/x64
['pg_crc32c_sse42', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
- # WIP sometime we'll need to build these based on host_cpu
- ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C'],
- ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
# arm / aarch64
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C'],
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 'crc'],
- ['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C'],
- ['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
# loongarch
diff --git a/src/port/pg_cpucap.c b/src/port/pg_cpucap.c
index eba6e31c63f..88d75827022 100644
--- a/src/port/pg_cpucap.c
+++ b/src/port/pg_cpucap.c
@@ -14,29 +14,11 @@
#include "c.h"
#include "port/pg_cpucap.h"
-#include "port/pg_crc32c.h"
/* starts uninitialized so we can detect errors of omission */
uint32 pg_cpucap = 0;
-/*
- * Check if hardware instructions for CRC computation are available.
- */
-static void
-pg_cpucap_crc32c(void)
-{
- /* WIP: It seems like we should use CPU arch symbols instead */
-#if defined(USE_SSE42_CRC32C) || defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
- if (pg_crc32c_sse42_available())
- pg_cpucap |= PGCPUCAP_CRC32C;
-
-#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
- if (pg_crc32c_armv8_available())
- pg_cpucap |= PGCPUCAP_CRC32C;
-#endif
-}
-
/*
* This needs to be called in main() for every
* program that calls a function that dispatches
diff --git a/src/port/pg_crc32c_armv8_choose.c b/src/port/pg_cpucap_arm.c
similarity index 92%
rename from src/port/pg_crc32c_armv8_choose.c
rename to src/port/pg_cpucap_arm.c
index e3654427c3f..19e052fecf6 100644
--- a/src/port/pg_crc32c_armv8_choose.c
+++ b/src/port/pg_cpucap_arm.c
@@ -1,6 +1,6 @@
/*-------------------------------------------------------------------------
*
- * pg_crc32c_armv8_choose.c
+ * pg_cpucap_arm.c
* Check if the CPU we're running on supports the ARMv8 CRC Extension.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
- * src/port/pg_crc32c_armv8_choose.c
+ * src/port/pg_cpucap_arm.c
*
*-------------------------------------------------------------------------
*/
@@ -35,7 +35,7 @@
#include "port/pg_crc32c.h"
-bool
+static bool
pg_crc32c_armv8_available(void)
{
#if defined(HAVE_ELF_AUX_INFO)
@@ -101,3 +101,13 @@ pg_crc32c_armv8_available(void)
return false;
#endif
}
+
+/*
+ * Check if hardware instructions for CRC computation are available.
+ */
+void
+pg_cpucap_crc32c(void)
+{
+ if (pg_crc32c_armv8_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+}
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_cpucap_x86.c
similarity index 73%
rename from src/port/pg_crc32c_sse42_choose.c
rename to src/port/pg_cpucap_x86.c
index f4d3215bc55..07462bd1d2a 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_cpucap_x86.c
@@ -1,6 +1,6 @@
/*-------------------------------------------------------------------------
*
- * pg_crc32c_sse42_choose.c
+ * pg_cpucap_x86.c
* Check if the CPU we're running on supports SSE4.2.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
- * src/port/pg_crc32c_sse42_choose.c
+ * src/port/pg_cpucap_x86.c
*
*-------------------------------------------------------------------------
*/
@@ -23,10 +23,10 @@
#include <intrin.h>
#endif
-#include "port/pg_crc32c.h"
+#include "port/pg_cpucap.h"
-bool
-pg_crc32c_sse42_available(void)
+static bool
+pg_sse42_available(void)
{
unsigned int exx[4] = {0, 0, 0, 0};
@@ -40,3 +40,13 @@ pg_crc32c_sse42_available(void)
return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
+
+/*
+ * Check if hardware instructions for CRC computation are available.
+ */
+void
+pg_cpucap_crc32c(void)
+{
+ if (pg_sse42_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+}
--
2.48.1
I agree it would be preferable to make a centralized check work.
Here is my first stab at it. v9 is the same as v8 plus a commit that moves all cpuid checks into a single place, including the AVX512 popcount code. Any new feature that requires CPUID information can access it through pg_cpu_have[FEATURE], defined in pg_cpucap.h and initialized in pg_cpucap.c. v8 also had a typo in the configure files that caused a build failure; it's fixed in v9.
Pretty sure the ARM code paths need some correction. Let me know what you think.
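To make the "one single place" idea concrete, here is a minimal standalone sketch, not the code from the attached patches. All demo_* and DEMO_* names are hypothetical, and it assumes GCC or Clang, whose <cpuid.h> provides __get_cpuid(). A single CPUID leaf 1 query fills in every feature bit, and callers only test bits afterwards. The ECX bit positions (20 for SSE 4.2, 1 for PCLMULQDQ) are the same ones the v8 patches test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>				/* GCC/Clang wrapper around the CPUID instruction */
#endif

/* Hypothetical names, for illustration only. */
#define DEMO_CPUCAP_INIT	(1u << 0)
#define DEMO_CPUCAP_SSE42	(1u << 1)
#define DEMO_CPUCAP_PCLMUL	(1u << 2)

/* Zero means "not initialized yet", so a forgotten init call is detectable. */
uint32_t	demo_cpucap = 0;

/* Query the CPU once and record every feature we care about. */
void
demo_cpucap_initialize(void)
{
	uint32_t	caps = DEMO_CPUCAP_INIT;

#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax = 0,
				ebx = 0,
				ecx = 0,
				edx = 0;

	/* CPUID leaf 1 answers both feature questions in one query. */
	if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
	{
		if (ecx & (1u << 20))	/* SSE 4.2 */
			caps |= DEMO_CPUCAP_SSE42;
		if (ecx & (1u << 1))	/* PCLMULQDQ */
			caps |= DEMO_CPUCAP_PCLMUL;
	}
#endif

	demo_cpucap = caps;
}

/* Feature test used by callers; trips the assertion if init was skipped. */
bool
demo_cpu_has(uint32_t cap)
{
	assert(demo_cpucap & DEMO_CPUCAP_INIT);
	return (demo_cpucap & cap) != 0;
}
```

A caller would run demo_cpucap_initialize() once at startup and then branch on demo_cpu_has(DEMO_CPUCAP_PCLMUL). On non-x86 builds the sketch simply reports no features, which is where an ARM-specific probe would have to go.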
Correct me if I'm misunderstanding, but this sounds like in every frontend
program we'd need to know what the first call was, which seems less
maintainable than just initializing at the start of every frontend program.
No, sorry for the confusion, but that is not what I meant. Let's ignore the attribute constructor for now. We can probably revisit this at a later point.
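For reference, the explicit-initialization approach discussed above can be sketched roughly as follows. This is an illustration under assumptions, not the patch itself: the demo_* names are made up, feature detection uses the GCC/Clang builtin __builtin_cpu_supports() rather than the cpuid code in the patches, and the software fallback is a plain bitwise CRC-32C instead of slicing-by-8. The point is only the shape: initialize once, assert that it happened, then dispatch with a branch instead of a function pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <nmmintrin.h>
#define DEMO_HAVE_SSE42_PATH 1
#endif

/* Hypothetical names, for illustration only. */
#define DEMO_CAP_INIT	(1u << 0)
#define DEMO_CAP_CRC32C	(1u << 1)

uint32_t	demo_caps = 0;		/* zero until demo_caps_initialize() runs */

/* Meant to be called once, early in main(). */
void
demo_caps_initialize(void)
{
	demo_caps = DEMO_CAP_INIT;
#ifdef DEMO_HAVE_SSE42_PATH
	if (__builtin_cpu_supports("sse4.2"))
		demo_caps |= DEMO_CAP_CRC32C;
#endif
}

/* Portable bitwise CRC-32C (reflected polynomial 0x82F63B78). */
uint32_t
demo_crc32c_sw(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
	{
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

#ifdef DEMO_HAVE_SSE42_PATH
/* Hardware path: one CRC32 instruction per input byte. */
__attribute__((target("sse4.2")))
uint32_t
demo_crc32c_hw(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
		crc = _mm_crc32_u8(crc, *p++);
	return crc;
}
#endif

/* Branch-based dispatch; the assertion catches a missing initialize call. */
uint32_t
demo_crc32c(const void *data, size_t len)
{
	uint32_t	crc = 0xFFFFFFFFu;

	assert(demo_caps & DEMO_CAP_INIT);
#ifdef DEMO_HAVE_SSE42_PATH
	if (demo_caps & DEMO_CAP_CRC32C)
		crc = demo_crc32c_hw(crc, data, len);
	else
#endif
		crc = demo_crc32c_sw(crc, data, len);

	return crc ^ 0xFFFFFFFFu;
}
```

After demo_caps_initialize(), demo_crc32c("123456789", 9) should return the standard CRC-32C check value 0xE3069283 on either path. Calling it without the initialize step fails the assertion in an assert-enabled build, which is the "error of omission" the PGCPUCAP_INIT bit is meant to expose.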
Raghuveer
Attachments:
v9-0001-Dispatch-CRC-computation-by-branching-rather-than.patch (application/octet-stream)
From 2a8a44c7fe8cfed6c7298533d633688cd2efd0b3 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Sat, 15 Feb 2025 19:18:16 +0700
Subject: [PATCH v9 1/5] Dispatch CRC computation by branching rather than
indirect calls
Signed-off-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
---
src/backend/postmaster/postmaster.c | 4 ++
src/include/port/pg_cpucap.h | 25 +++++++++
src/include/port/pg_crc32c.h | 78 +++++++++++++++++++++--------
src/port/Makefile | 1 +
src/port/meson.build | 4 ++
src/port/pg_cpucap.c | 51 +++++++++++++++++++
src/port/pg_crc32c_armv8_choose.c | 26 +---------
src/port/pg_crc32c_sse42_choose.c | 26 +---------
8 files changed, 145 insertions(+), 70 deletions(-)
create mode 100644 src/include/port/pg_cpucap.h
create mode 100644 src/port/pg_cpucap.c
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index bb22b13ade..4fa95f1d2c 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -99,6 +99,7 @@
#include "pg_getopt.h"
#include "pgstat.h"
#include "port/pg_bswap.h"
+#include "port/pg_cpucap.h"
#include "postmaster/autovacuum.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/pgarch.h"
@@ -1951,6 +1952,9 @@ InitProcessGlobals(void)
#ifndef WIN32
srandom(pg_prng_uint32(&pg_global_prng_state));
#endif
+
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
}
/*
diff --git a/src/include/port/pg_cpucap.h b/src/include/port/pg_cpucap.h
new file mode 100644
index 0000000000..81edfedce5
--- /dev/null
+++ b/src/include/port/pg_cpucap.h
@@ -0,0 +1,25 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_cpucap.h
+ * Runtime detection of CPU capabilities.
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * src/include/port/pg_cpucap.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_CPUCAP_H
+#define PG_CPUCAP_H
+
+#define PGCPUCAP_INIT (1 << 0)
+#define PGCPUCAP_POPCNT (1 << 1)
+#define PGCPUCAP_VPOPCNT (1 << 2)
+#define PGCPUCAP_CRC32C (1 << 3)
+
+extern PGDLLIMPORT uint32 pg_cpucap;
+extern void pg_cpucap_initialize(void);
+
+#endif /* PG_CPUCAP_H */
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 65ebeacf4b..b565a0f294 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -34,6 +34,7 @@
#define PG_CRC32C_H
#include "port/pg_bswap.h"
+#include "port/pg_cpucap.h"
typedef uint32 pg_crc32c;
@@ -41,52 +42,55 @@ typedef uint32 pg_crc32c;
#define INIT_CRC32C(crc) ((crc) = 0xFFFFFFFF)
#define EQ_CRC32C(c1, c2) ((c1) == (c2))
-#if defined(USE_SSE42_CRC32C)
+#if defined(USE_SSE42_CRC32C) || defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
/* Use Intel SSE4.2 instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_sse42((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#if defined(USE_SSE42_CRC32C)
+#define HAVE_CRC_COMPTIME
+#else
+#define HAVE_CRC_RUNTIME
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif
+
+extern bool pg_crc32c_sse42_available(void);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_ARMV8_CRC32C)
+#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
/* Use ARMv8 CRC Extension instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_armv8((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#if defined(USE_ARMV8_CRC32C)
+#define HAVE_CRC_COMPTIME
+#else
+#define HAVE_CRC_RUNTIME
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif
+
+extern bool pg_crc32c_armv8_available(void);
extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
#elif defined(USE_LOONGARCH_CRC32C)
/* Use LoongArch CRCC instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_loongarch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#define HAVE_CRC_COMPTIME
extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
-
-/*
- * Use Intel SSE 4.2 or ARMv8 instructions, but perform a runtime check first
- * to check that they are available.
- */
-#define COMP_CRC32C(crc, data, len) \
- ((crc) = pg_comp_crc32c((crc), (data), (len)))
-#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
-
-extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
-extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
-
-#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#endif
-#ifdef USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
-#endif
-
#else
/*
* Use slicing-by-8 algorithm.
@@ -105,6 +109,36 @@ extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t le
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif /* end of CPU-specific symbols */
+
+#if defined(HAVE_CRC_COMPTIME) || defined(HAVE_CRC_RUNTIME)
+/*
+ * Check if the CPU we're running on supports special
+ * instructions for CRC-32C computation. Otherwise, fall
+ * back to the pure software implementation (slicing-by-8).
+ */
+static inline pg_crc32c
+pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
+{
+ /*
+ * If this is firing in a frontend program, first look if you forgot a
+ * call to pg_cpucap_initialize() in main(). See for example
+ * src/bin/pg_controldata/pg_controldata.c.
+ */
+ // WIP: how to best initialize in frontend?
+#ifndef FRONTEND
+ Assert(pg_cpucap & PGCPUCAP_INIT);
+#endif
+
+#if defined(HAVE_CRC_COMPTIME)
+ return COMP_CRC32C_HW(crc, data, len);
+#else
+ if (pg_cpucap & PGCPUCAP_CRC32C)
+ return COMP_CRC32C_HW(crc, data, len);
+ else
+ return pg_comp_crc32c_sb8(crc, data, len);
#endif
+}
+#endif /* HAVE_CRC_COMPTIME || HAVE_CRC_RUNTIME */
#endif /* PG_CRC32C_H */
diff --git a/src/port/Makefile b/src/port/Makefile
index 4c22431951..5a05179e92 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -44,6 +44,7 @@ OBJS = \
noblock.o \
path.o \
pg_bitutils.o \
+ pg_cpucap.o \
pg_popcount_avx512.o \
pg_strong_random.o \
pgcheckdir.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index 7fcfa728d4..e1e7ce8fb8 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -7,6 +7,7 @@ pgport_sources = [
'noblock.c',
'path.c',
'pg_bitutils.c',
+ 'pg_cpucap.c',
'pg_popcount_avx512.c',
'pg_strong_random.c',
'pgcheckdir.c',
@@ -83,12 +84,15 @@ replace_funcs_pos = [
# x86/x64
['pg_crc32c_sse42', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
+ # WIP sometime we'll need to build these based on host_cpu
+ ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
# arm / aarch64
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C'],
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 'crc'],
+ ['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C'],
['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
diff --git a/src/port/pg_cpucap.c b/src/port/pg_cpucap.c
new file mode 100644
index 0000000000..eba6e31c63
--- /dev/null
+++ b/src/port/pg_cpucap.c
@@ -0,0 +1,51 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_cpucap.c
+ * Runtime detection of CPU capabilities.
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * src/port/pg_cpucap.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "c.h"
+
+#include "port/pg_cpucap.h"
+#include "port/pg_crc32c.h"
+
+
+/* starts uninitialized so we can detect errors of omission */
+uint32 pg_cpucap = 0;
+
+/*
+ * Check if hardware instructions for CRC computation are available.
+ */
+static void
+pg_cpucap_crc32c(void)
+{
+ /* WIP: It seems like we should use CPU arch symbols instead */
+#if defined(USE_SSE42_CRC32C) || defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
+ if (pg_crc32c_sse42_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+
+#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
+ if (pg_crc32c_armv8_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+#endif
+}
+
+/*
+ * This needs to be called in main() for every
+ * program that calls a function that dispatches
+ * according to CPU features.
+ */
+void
+pg_cpucap_initialize(void)
+{
+ pg_cpucap = PGCPUCAP_INIT;
+
+ pg_cpucap_crc32c();
+}
diff --git a/src/port/pg_crc32c_armv8_choose.c b/src/port/pg_crc32c_armv8_choose.c
index ec12be1bbc..e3654427c3 100644
--- a/src/port/pg_crc32c_armv8_choose.c
+++ b/src/port/pg_crc32c_armv8_choose.c
@@ -1,12 +1,7 @@
/*-------------------------------------------------------------------------
*
* pg_crc32c_armv8_choose.c
- * Choose between ARMv8 and software CRC-32C implementation.
- *
- * On first call, checks if the CPU we're running on supports the ARMv8
- * CRC Extension. If it does, use the special instructions for CRC-32C
- * computation. Otherwise, fall back to the pure software implementation
- * (slicing-by-8).
+ * Check if the CPU we're running on supports the ARMv8 CRC Extension.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -40,7 +35,7 @@
#include "port/pg_crc32c.h"
-static bool
+bool
pg_crc32c_armv8_available(void)
{
#if defined(HAVE_ELF_AUX_INFO)
@@ -106,20 +101,3 @@ pg_crc32c_armv8_available(void)
return false;
#endif
}
-
-/*
- * This gets called on the first call. It replaces the function pointer
- * so that subsequent calls are routed directly to the chosen implementation.
- */
-static pg_crc32c
-pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
-{
- if (pg_crc32c_armv8_available())
- pg_comp_crc32c = pg_comp_crc32c_armv8;
- else
- pg_comp_crc32c = pg_comp_crc32c_sb8;
-
- return pg_comp_crc32c(crc, data, len);
-}
-
-pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len) = pg_comp_crc32c_choose;
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 65dbc4d424..f4d3215bc5 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -1,12 +1,7 @@
/*-------------------------------------------------------------------------
*
* pg_crc32c_sse42_choose.c
- * Choose between Intel SSE 4.2 and software CRC-32C implementation.
- *
- * On first call, checks if the CPU we're running on supports Intel SSE
- * 4.2. If it does, use the special SSE instructions for CRC-32C
- * computation. Otherwise, fall back to the pure software implementation
- * (slicing-by-8).
+ * Check if the CPU we're running on supports SSE4.2.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -30,7 +25,7 @@
#include "port/pg_crc32c.h"
-static bool
+bool
pg_crc32c_sse42_available(void)
{
unsigned int exx[4] = {0, 0, 0, 0};
@@ -45,20 +40,3 @@ pg_crc32c_sse42_available(void)
return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
-
-/*
- * This gets called on the first call. It replaces the function pointer
- * so that subsequent calls are routed directly to the chosen implementation.
- */
-static pg_crc32c
-pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
-{
- if (pg_crc32c_sse42_available())
- pg_comp_crc32c = pg_comp_crc32c_sse42;
- else
- pg_comp_crc32c = pg_comp_crc32c_sb8;
-
- return pg_comp_crc32c(crc, data, len);
-}
-
-pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len) = pg_comp_crc32c_choose;
--
2.43.0
v9-0002-Rename-CRC-choose-files-to-cpucap-files.patch (application/octet-stream)
From bf13997ab601d8a91bc80e0e8cd11159f9c1eb25 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Tue, 25 Feb 2025 13:59:21 +0700
Subject: [PATCH v9 2/5] Rename CRC *choose files to cpucap* files
On Meson, build them unconditionally on the relevant arch.
FIXME autoconf builds are broken
Signed-off-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
---
configure | 4 ++--
configure.ac | 4 ++--
src/include/port/pg_cpucap.h | 3 +++
src/include/port/pg_crc32c.h | 2 --
src/port/Makefile | 2 ++
src/port/meson.build | 17 +++++++++++-----
src/port/pg_cpucap.c | 18 -----------------
..._crc32c_armv8_choose.c => pg_cpucap_arm.c} | 16 ++++++++++++---
..._crc32c_sse42_choose.c => pg_cpucap_x86.c} | 20 ++++++++++++++-----
9 files changed, 49 insertions(+), 37 deletions(-)
rename src/port/{pg_crc32c_armv8_choose.c => pg_cpucap_arm.c} (92%)
rename src/port/{pg_crc32c_sse42_choose.c => pg_cpucap_x86.c} (73%)
diff --git a/configure b/configure
index 0ffcaeb436..0d31e6a236 100755
--- a/configure
+++ b/configure
@@ -17360,7 +17360,7 @@ else
$as_echo "#define USE_SSE42_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_sse42_choose.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2 with runtime check" >&5
$as_echo "SSE 4.2 with runtime check" >&6; }
else
@@ -17376,7 +17376,7 @@ $as_echo "ARMv8 CRC instructions" >&6; }
$as_echo "#define USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o pg_crc32c_armv8_choose.o"
+ PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: ARMv8 CRC instructions with runtime check" >&5
$as_echo "ARMv8 CRC instructions with runtime check" >&6; }
else
diff --git a/configure.ac b/configure.ac
index f56681e0d9..60d30f855d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2115,7 +2115,7 @@ if test x"$USE_SSE42_CRC32C" = x"1"; then
else
if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
AC_DEFINE(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use Intel SSE 4.2 CRC instructions with a runtime check.])
- PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_sse42_choose.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o"
AC_MSG_RESULT(SSE 4.2 with runtime check)
else
if test x"$USE_ARMV8_CRC32C" = x"1"; then
@@ -2125,7 +2125,7 @@ else
else
if test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
AC_DEFINE(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use ARMv8 CRC Extension with a runtime check.])
- PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o pg_crc32c_armv8_choose.o"
+ PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o"
AC_MSG_RESULT(ARMv8 CRC instructions with runtime check)
else
if test x"$USE_LOONGARCH_CRC32C" = x"1"; then
diff --git a/src/include/port/pg_cpucap.h b/src/include/port/pg_cpucap.h
index 81edfedce5..5e04213b21 100644
--- a/src/include/port/pg_cpucap.h
+++ b/src/include/port/pg_cpucap.h
@@ -22,4 +22,7 @@
extern PGDLLIMPORT uint32 pg_cpucap;
extern void pg_cpucap_initialize(void);
+/* arch-specific functions private to src/port */
+extern void pg_cpucap_crc32c(void);
+
#endif /* PG_CPUCAP_H */
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index b565a0f294..4f0ebb9923 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -57,7 +57,6 @@ typedef uint32 pg_crc32c;
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
#endif
-extern bool pg_crc32c_sse42_available(void);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
@@ -76,7 +75,6 @@ extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t le
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
#endif
-extern bool pg_crc32c_armv8_available(void);
extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
#elif defined(USE_LOONGARCH_CRC32C)
diff --git a/src/port/Makefile b/src/port/Makefile
index 5a05179e92..1fc03713b3 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -45,6 +45,8 @@ OBJS = \
path.o \
pg_bitutils.o \
pg_cpucap.o \
+ pg_cpucap_x86.o \
+ pg_cpucap_arm.o \
pg_popcount_avx512.o \
pg_strong_random.o \
pgcheckdir.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index e1e7ce8fb8..baa8e16200 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -78,22 +78,29 @@ if host_system != 'windows'
replace_funcs_neg += [['pthread_barrier_wait']]
endif
+# arch-specific runtime checks
+if host_cpu == 'x86' or host_cpu == 'x86_64'
+ pgport_sources += files(
+ 'pg_cpucap_x86.c'
+ )
+
+elif host_cpu == 'arm' or host_cpu == 'aarch64'
+ pgport_sources += files(
+ 'pg_cpucap_arm.c'
+ )
+endif
+
# Replacement functionality to be built if corresponding configure symbol
# is true
replace_funcs_pos = [
# x86/x64
['pg_crc32c_sse42', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
- # WIP sometime we'll need to build these based on host_cpu
- ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C'],
- ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
# arm / aarch64
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C'],
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 'crc'],
- ['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C'],
- ['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
# loongarch
diff --git a/src/port/pg_cpucap.c b/src/port/pg_cpucap.c
index eba6e31c63..88d7582702 100644
--- a/src/port/pg_cpucap.c
+++ b/src/port/pg_cpucap.c
@@ -14,29 +14,11 @@
#include "c.h"
#include "port/pg_cpucap.h"
-#include "port/pg_crc32c.h"
/* starts uninitialized so we can detect errors of omission */
uint32 pg_cpucap = 0;
-/*
- * Check if hardware instructions for CRC computation are available.
- */
-static void
-pg_cpucap_crc32c(void)
-{
- /* WIP: It seems like we should use CPU arch symbols instead */
-#if defined(USE_SSE42_CRC32C) || defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
- if (pg_crc32c_sse42_available())
- pg_cpucap |= PGCPUCAP_CRC32C;
-
-#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
- if (pg_crc32c_armv8_available())
- pg_cpucap |= PGCPUCAP_CRC32C;
-#endif
-}
-
/*
* This needs to be called in main() for every
* program that calls a function that dispatches
diff --git a/src/port/pg_crc32c_armv8_choose.c b/src/port/pg_cpucap_arm.c
similarity index 92%
rename from src/port/pg_crc32c_armv8_choose.c
rename to src/port/pg_cpucap_arm.c
index e3654427c3..19e052fecf 100644
--- a/src/port/pg_crc32c_armv8_choose.c
+++ b/src/port/pg_cpucap_arm.c
@@ -1,6 +1,6 @@
/*-------------------------------------------------------------------------
*
- * pg_crc32c_armv8_choose.c
+ * pg_cpucap_arm.c
* Check if the CPU we're running on supports the ARMv8 CRC Extension.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
- * src/port/pg_crc32c_armv8_choose.c
+ * src/port/pg_cpucap_arm.c
*
*-------------------------------------------------------------------------
*/
@@ -35,7 +35,7 @@
#include "port/pg_crc32c.h"
-bool
+static bool
pg_crc32c_armv8_available(void)
{
#if defined(HAVE_ELF_AUX_INFO)
@@ -101,3 +101,13 @@ pg_crc32c_armv8_available(void)
return false;
#endif
}
+
+/*
+ * Check if hardware instructions for CRC computation are available.
+ */
+void
+pg_cpucap_crc32c(void)
+{
+ if (pg_crc32c_armv8_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+}
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_cpucap_x86.c
similarity index 73%
rename from src/port/pg_crc32c_sse42_choose.c
rename to src/port/pg_cpucap_x86.c
index f4d3215bc5..07462bd1d2 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_cpucap_x86.c
@@ -1,6 +1,6 @@
/*-------------------------------------------------------------------------
*
- * pg_crc32c_sse42_choose.c
+ * pg_cpucap_x86.c
* Check if the CPU we're running on supports SSE4.2.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
- * src/port/pg_crc32c_sse42_choose.c
+ * src/port/pg_cpucap_x86.c
*
*-------------------------------------------------------------------------
*/
@@ -23,10 +23,10 @@
#include <intrin.h>
#endif
-#include "port/pg_crc32c.h"
+#include "port/pg_cpucap.h"
-bool
-pg_crc32c_sse42_available(void)
+static bool
+pg_sse42_available(void)
{
unsigned int exx[4] = {0, 0, 0, 0};
@@ -40,3 +40,13 @@ pg_crc32c_sse42_available(void)
return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
+
+/*
+ * Check if hardware instructions for CRC computation are available.
+ */
+void
+pg_cpucap_crc32c(void)
+{
+ if (pg_sse42_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+}
--
2.43.0
v9-0003-Add-a-Postgres-SQL-function-for-crc32c-benchmarki.patch (application/octet-stream)
From a2cce7a4f4b1d6064e65ae47942d0609e758d706 Mon Sep 17 00:00:00 2001
From: Paul Amonson <paul.d.amonson@intel.com>
Date: Mon, 6 May 2024 08:34:17 -0700
Subject: [PATCH v9 3/5] Add a Postgres SQL function for crc32c benchmarking
Add a drive_crc32c() function to use for benchmarking crc32c
computation. The function takes 2 arguments:
(1) count: number of times CRC32C is computed in a loop.
(2) num: number of bytes in the buffer to calculate the CRC over.
XXX not for commit
Extracted from a patch by Raghuveer Devulapalli
Signed-off-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
---
contrib/meson.build | 1 +
contrib/test_crc32c/Makefile | 20 +++++++
contrib/test_crc32c/expected/test_crc32c.out | 57 ++++++++++++++++++++
contrib/test_crc32c/meson.build | 34 ++++++++++++
contrib/test_crc32c/sql/test_crc32c.sql | 3 ++
contrib/test_crc32c/test_crc32c--1.0.sql | 1 +
contrib/test_crc32c/test_crc32c.c | 47 ++++++++++++++++
contrib/test_crc32c/test_crc32c.control | 4 ++
8 files changed, 167 insertions(+)
create mode 100644 contrib/test_crc32c/Makefile
create mode 100644 contrib/test_crc32c/expected/test_crc32c.out
create mode 100644 contrib/test_crc32c/meson.build
create mode 100644 contrib/test_crc32c/sql/test_crc32c.sql
create mode 100644 contrib/test_crc32c/test_crc32c--1.0.sql
create mode 100644 contrib/test_crc32c/test_crc32c.c
create mode 100644 contrib/test_crc32c/test_crc32c.control
diff --git a/contrib/meson.build b/contrib/meson.build
index 1ba73ebd67..06673db062 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('test_crc32c')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/test_crc32c/Makefile b/contrib/test_crc32c/Makefile
new file mode 100644
index 0000000000..5b747c6184
--- /dev/null
+++ b/contrib/test_crc32c/Makefile
@@ -0,0 +1,20 @@
+MODULE_big = test_crc32c
+OBJS = test_crc32c.o
+PGFILEDESC = "test"
+EXTENSION = test_crc32c
+DATA = test_crc32c--1.0.sql
+
+first: all
+
+# test_crc32c.o: CFLAGS+=-g
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/test_crc32c
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/test_crc32c/expected/test_crc32c.out b/contrib/test_crc32c/expected/test_crc32c.out
new file mode 100644
index 0000000000..dff6bb3133
--- /dev/null
+++ b/contrib/test_crc32c/expected/test_crc32c.out
@@ -0,0 +1,57 @@
+CREATE EXTENSION test_crc32c;
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
+ drive_crc32c
+--------------
+ 532139994
+ 2103623867
+ 785984197
+ 2686825890
+ 3213049059
+ 3819630168
+ 1389234603
+ 534072900
+ 2930108140
+ 2496889855
+ 1475239611
+ 136366931
+ 3067402116
+ 2012717871
+ 3682416023
+ 2054270645
+ 1817339875
+ 4100939569
+ 1192727539
+ 3636976218
+ 369764421
+ 3161609879
+ 1067984880
+ 1235066769
+ 3138425899
+ 648132037
+ 4203750233
+ 1330187888
+ 2683521348
+ 1951644495
+ 2574090107
+ 3904902018
+ 3772697795
+ 1644686344
+ 2868962106
+ 3369218491
+ 3902689890
+ 3456411865
+ 141004025
+ 1504497996
+ 3782655204
+ 3544797610
+ 3429174879
+ 2524728016
+ 3935861181
+ 25498897
+ 692684159
+ 345705535
+ 2761600287
+ 2654632420
+ 3945991399
+(51 rows)
+
diff --git a/contrib/test_crc32c/meson.build b/contrib/test_crc32c/meson.build
new file mode 100644
index 0000000000..d7bec4ba1c
--- /dev/null
+++ b/contrib/test_crc32c/meson.build
@@ -0,0 +1,34 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+test_crc32c_sources = files(
+ 'test_crc32c.c',
+)
+
+if host_system == 'windows'
+ test_crc32c_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_crc32c',
+ '--FILEDESC', 'test_crc32c - test code for crc32c library',])
+endif
+
+test_crc32c = shared_module('test_crc32c',
+ test_crc32c_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += test_crc32c
+
+install_data(
+ 'test_crc32c.control',
+ 'test_crc32c--1.0.sql',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'test_crc32c',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'test_crc32c',
+ ],
+ },
+}
diff --git a/contrib/test_crc32c/sql/test_crc32c.sql b/contrib/test_crc32c/sql/test_crc32c.sql
new file mode 100644
index 0000000000..95c6dfe448
--- /dev/null
+++ b/contrib/test_crc32c/sql/test_crc32c.sql
@@ -0,0 +1,3 @@
+CREATE EXTENSION test_crc32c;
+
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
diff --git a/contrib/test_crc32c/test_crc32c--1.0.sql b/contrib/test_crc32c/test_crc32c--1.0.sql
new file mode 100644
index 0000000000..52b9772f90
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c--1.0.sql
@@ -0,0 +1 @@
+CREATE FUNCTION drive_crc32c (count int, num int) RETURNS bigint AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
new file mode 100644
index 0000000000..28bc42de31
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -0,0 +1,47 @@
+/* select drive_crc32c(1000000, 1024); */
+
+#include "postgres.h"
+#include "fmgr.h"
+#include "port/pg_crc32c.h"
+#include "common/pg_prng.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * drive_crc32c(count: int, num: int) returns bigint
+ *
+ * count is the number of loops to perform
+ *
+ * num is the number of bytes in the buffer to calculate
+ * crc32c over.
+ */
+PG_FUNCTION_INFO_V1(drive_crc32c);
+Datum
+drive_crc32c(PG_FUNCTION_ARGS)
+{
+ int64 count = PG_GETARG_INT32(0);
+ int64 num = PG_GETARG_INT32(1);
+ char* data = malloc((size_t)num);
+ pg_crc32c crc;
+ pg_prng_state state;
+ uint64 seed = 42;
+ pg_prng_seed(&state, seed);
+ /* set random data */
+ for (uint64 i = 0; i < num; i++)
+ {
+ data[i] = pg_prng_uint32(&state) % 255;
+ }
+
+ INIT_CRC32C(crc);
+
+ while(count--)
+ {
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, data, num);
+ FIN_CRC32C(crc);
+ }
+
+ free((void *)data);
+
+ PG_RETURN_INT64((int64_t)crc);
+}
diff --git a/contrib/test_crc32c/test_crc32c.control b/contrib/test_crc32c/test_crc32c.control
new file mode 100644
index 0000000000..878a077ee1
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.control
@@ -0,0 +1,4 @@
+comment = 'test'
+default_version = '1.0'
+module_pathname = '$libdir/test_crc32c'
+relocatable = true
--
2.43.0
v9-0004-Improve-CRC32C-performance-on-x86_64.patch (application/octet-stream)
From ea97c1e30605473cd73e5f3ffe2ac966b9f0c180 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 15:27:16 +0700
Subject: [PATCH v9 4/5] Improve CRC32C performance on x86_64
The current SSE4.2 implementation of CRC32C relies on the native
CRC32 instruction, which operates on 8 bytes at a time. We can get a
substantial speedup on longer inputs by using carryless multiplication
on SIMD registers, processing 64 bytes per loop iteration.
The PCLMULQDQ instruction has been widely available since 2011 (almost
as old as SSE 4.2), so this commit now requires that, as well as SSE
4.2, to build pg_crc32c_sse42.c.
The MIT-licensed implementation was generated with the "generate"
program from
https://github.com/corsix/fast-crc32/
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009
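When checking a vectorized implementation such as this one, a slow bit-at-a-time reference is useful to compare against. The following is a minimal sketch and not part of the patch: the function names are hypothetical, and it assumes the standard reflected CRC-32C polynomial 0x82F63B78 together with the same 0xFFFFFFFF initial value and final XOR that INIT_CRC32C and FIN_CRC32C use.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference routine: update a running CRC-32C one bit at a
 * time. The caller supplies the running value, so a buffer can be fed in
 * several pieces, like COMP_CRC32C.
 */
static uint32_t
crc32c_bitwise(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
	{
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
		{
			/* XOR in the polynomial only when the low bit is set. */
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
		}
	}
	return crc;
}

/* One-shot helper: init, compute, finalize. */
static uint32_t
crc32c_reference(const void *data, size_t len)
{
	return crc32c_bitwise(0xFFFFFFFFu, data, len) ^ 0xFFFFFFFFu;
}
```

Comparing the output of the PCLMUL path against such a reference across many lengths and alignments, especially around the 64-byte and threshold boundaries, is one way to gain confidence in the generated constants.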
Author: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Author: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
Signed-off-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
---
src/include/port/pg_cpucap.h | 2 +
src/port/pg_cpucap.c | 1 +
src/port/pg_cpucap_arm.c | 6 ++
src/port/pg_cpucap_x86.c | 23 +++++
src/port/pg_crc32c_sse42.c | 123 ++++++++++++++++++++++++++
src/test/regress/expected/strings.out | 24 +++++
src/test/regress/sql/strings.sql | 4 +
7 files changed, 183 insertions(+)
diff --git a/src/include/port/pg_cpucap.h b/src/include/port/pg_cpucap.h
index 5e04213b21..af3fabfcff 100644
--- a/src/include/port/pg_cpucap.h
+++ b/src/include/port/pg_cpucap.h
@@ -18,11 +18,13 @@
#define PGCPUCAP_POPCNT (1 << 1)
#define PGCPUCAP_VPOPCNT (1 << 2)
#define PGCPUCAP_CRC32C (1 << 3)
+#define PGCPUCAP_CLMUL (1 << 4)
extern PGDLLIMPORT uint32 pg_cpucap;
extern void pg_cpucap_initialize(void);
/* arch-specific functions private to src/port */
extern void pg_cpucap_crc32c(void);
+extern void pg_cpucap_clmul(void);
#endif /* PG_CPUCAP_H */
diff --git a/src/port/pg_cpucap.c b/src/port/pg_cpucap.c
index 88d7582702..301bd9fc2c 100644
--- a/src/port/pg_cpucap.c
+++ b/src/port/pg_cpucap.c
@@ -30,4 +30,5 @@ pg_cpucap_initialize(void)
pg_cpucap = PGCPUCAP_INIT;
pg_cpucap_crc32c();
+ pg_cpucap_clmul();
}
diff --git a/src/port/pg_cpucap_arm.c b/src/port/pg_cpucap_arm.c
index 19e052fecf..e080a5a931 100644
--- a/src/port/pg_cpucap_arm.c
+++ b/src/port/pg_cpucap_arm.c
@@ -111,3 +111,9 @@ pg_cpucap_crc32c(void)
if (pg_crc32c_armv8_available())
pg_cpucap |= PGCPUCAP_CRC32C;
}
+
+void
+pg_cpucap_clmul(void)
+{
+ // WIP: does this even make sense?
+}
diff --git a/src/port/pg_cpucap_x86.c b/src/port/pg_cpucap_x86.c
index 07462bd1d2..3a62a3a582 100644
--- a/src/port/pg_cpucap_x86.c
+++ b/src/port/pg_cpucap_x86.c
@@ -41,6 +41,22 @@ pg_sse42_available(void)
return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
+static bool
+pg_pclmul_available(void)
+{
+ unsigned int exx[4] = {0, 0, 0, 0};
+
+#if defined(HAVE__GET_CPUID)
+ __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+ __cpuid(exx, 1);
+#else
+#error cpuid instruction not available
+#endif
+
+ return (exx[2] & (1 << 1)) != 0; /* PCLMUL */
+}
+
/*
* Check if hardware instructions for CRC computation are available.
*/
@@ -50,3 +66,10 @@ pg_cpucap_crc32c(void)
if (pg_sse42_available())
pg_cpucap |= PGCPUCAP_CRC32C;
}
+
+void
+pg_cpucap_clmul(void)
+{
+ if (pg_pclmul_available())
+ pg_cpucap |= PGCPUCAP_CLMUL;
+}
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df3..fc3cf0d088 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,9 +15,118 @@
#include "c.h"
#include <nmmintrin.h>
+#include <wmmintrin.h>
#include "port/pg_crc32c.h"
+/* WIP: configure checks */
+#ifdef __x86_64__
+#define HAVE_PCLMUL_RUNTIME
+#endif
+
+ /*
+ * WIP: Testing has shown that on Kaby Lake (2016) this algorithm needs two
+ * iterations of the main loop to be faster than using regular CRC
+ * instructions, but Tiger Lake (2020) is fine with a single iteration. Could
+ * use more testing between those years (on AMD as well).
+ */
+#define PCLMUL_THRESHOLD 128
+
+#ifdef HAVE_PCLMUL_RUNTIME
+
+/* Generated by https://github.com/corsix/fast-crc32/ using: */
+/* ./generate -i sse -p crc32c -a v4e */
+/* MIT licensed */
+
+#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
+#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
+
+pg_attribute_target("sse4.2,pclmul")
+static pg_crc32c
+pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
+{
+ /* adjust names to match generated code */
+ pg_crc32c crc0 = crc;
+ size_t len = length;
+ const char *buf = data;
+
+ // This prolog is trying to avoid loads straddling
+ // cache lines, but it doesn't seem worth it if
+ // we're trying to be fast on small inputs as well
+#if 0
+ for (; len && ((uintptr_t) buf & 7); --len)
+ {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+ if (((uintptr_t) buf & 8) && len >= 8)
+ {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t *) buf);
+ buf += 8;
+ len -= 8;
+ }
+#endif
+ if (len >= 64)
+ {
+ const char *end = buf + len;
+ const char *limit = buf + len - 64;
+
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i *) buf),
+ y0;
+ __m128i x1 = _mm_loadu_si128((const __m128i *) (buf + 16)),
+ y1;
+ __m128i x2 = _mm_loadu_si128((const __m128i *) (buf + 32)),
+ y2;
+ __m128i x3 = _mm_loadu_si128((const __m128i *) (buf + 48)),
+ y3;
+ __m128i k;
+
+ k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 64;
+ /* Main loop. */
+ while (buf <= limit)
+ {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i *) buf)), x0 = _mm_xor_si128(x0, y0);
+ y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i *) (buf + 16))), x1 = _mm_xor_si128(x1, y1);
+ y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i *) (buf + 32))), x2 = _mm_xor_si128(x2, y2);
+ y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i *) (buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ buf += 64;
+ }
+
+ /* Reduce x0 ... x3 to just x0. */
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
+ y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
+ k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ len = end - buf;
+ }
+ for (; len >= 8; buf += 8, len -= 8)
+ {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t *) buf);
+ }
+ for (; len; --len)
+ {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+
+ return crc0;
+}
+
+#endif
+
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
pg_crc32c
@@ -26,6 +135,17 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
const unsigned char *p = data;
const unsigned char *pend = p + len;
+ /* XXX not for commit */
+ const pg_crc32c orig_crc PG_USED_FOR_ASSERTS_ONLY = crc;
+ const size_t orig_len PG_USED_FOR_ASSERTS_ONLY = len;
+
+#ifdef HAVE_PCLMUL_RUNTIME
+ if (len >= PCLMUL_THRESHOLD && (pg_cpucap & PGCPUCAP_CLMUL))
+ {
+ return pg_comp_crc32c_pclmul(crc, data, len);
+ }
+#endif
+
/*
* Process eight bytes of data at a time.
*
@@ -66,5 +186,8 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
p++;
}
+ /* XXX not for commit */
+ Assert(crc == pg_comp_crc32c_sb8(orig_crc, data, orig_len));
+
return crc;
}
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index b65bb2d536..662bd37ace 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -2282,6 +2282,30 @@ SELECT crc32c('The quick brown fox jumps over the lazy dog.');
419469235
(1 row)
+SELECT crc32c(repeat('A', 80)::bytea);
+ crc32c
+------------
+ 3799127650
+(1 row)
+
+SELECT crc32c(repeat('A', 127)::bytea);
+ crc32c
+-----------
+ 291820082
+(1 row)
+
+SELECT crc32c(repeat('A', 128)::bytea);
+ crc32c
+-----------
+ 816091258
+(1 row)
+
+SELECT crc32c(repeat('A', 129)::bytea);
+ crc32c
+------------
+ 4213642571
+(1 row)
+
--
-- encode/decode
--
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index 8e0f3a0e75..26f86dc92e 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -727,6 +727,10 @@ SELECT crc32('The quick brown fox jumps over the lazy dog.');
SELECT crc32c('');
SELECT crc32c('The quick brown fox jumps over the lazy dog.');
+SELECT crc32c(repeat('A', 80)::bytea);
+SELECT crc32c(repeat('A', 127)::bytea);
+SELECT crc32c(repeat('A', 128)::bytea);
+SELECT crc32c(repeat('A', 129)::bytea);
--
-- encode/decode
--
2.43.0
v9-0005-Move-all-cpuid-checks-to-one-location.patch
From aa515c0aed54d753e710d381a2254679cc4410e4 Mon Sep 17 00:00:00 2001
From: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Date: Tue, 25 Feb 2025 12:54:43 -0800
Subject: [PATCH v9 5/5] Move all cpuid checks to one location
---
configure | 4 +-
configure.ac | 4 +-
src/include/port/pg_cpucap.h | 32 +++++--
src/include/port/pg_crc32c.h | 6 +-
src/port/Makefile | 2 -
src/port/meson.build | 12 ---
src/port/pg_bitutils.c | 20 +---
src/port/pg_cpucap.c | 173 ++++++++++++++++++++++++++++++++--
src/port/pg_cpucap_arm.c | 119 -----------------------
src/port/pg_cpucap_x86.c | 75 ---------------
src/port/pg_crc32c_sse42.c | 3 +-
src/port/pg_popcount_avx512.c | 71 +-------------
12 files changed, 203 insertions(+), 318 deletions(-)
delete mode 100644 src/port/pg_cpucap_arm.c
delete mode 100644 src/port/pg_cpucap_x86.c
diff --git a/configure b/configure
index 0d31e6a236..172c93896c 100755
--- a/configure
+++ b/configure
@@ -17360,7 +17360,7 @@ else
$as_echo "#define USE_SSE42_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2 with runtime check" >&5
$as_echo "SSE 4.2 with runtime check" >&6; }
else
@@ -17376,7 +17376,7 @@ $as_echo "ARMv8 CRC instructions" >&6; }
$as_echo "#define USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o
+ PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: ARMv8 CRC instructions with runtime check" >&5
$as_echo "ARMv8 CRC instructions with runtime check" >&6; }
else
diff --git a/configure.ac b/configure.ac
index 60d30f855d..c7bfad98ac 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2115,7 +2115,7 @@ if test x"$USE_SSE42_CRC32C" = x"1"; then
else
if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
AC_DEFINE(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use Intel SSE 4.2 CRC instructions with a runtime check.])
- PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o"
AC_MSG_RESULT(SSE 4.2 with runtime check)
else
if test x"$USE_ARMV8_CRC32C" = x"1"; then
@@ -2125,7 +2125,7 @@ else
else
if test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
AC_DEFINE(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use ARMv8 CRC Extension with a runtime check.])
- PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o
+ PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o"
AC_MSG_RESULT(ARMv8 CRC instructions with runtime check)
else
if test x"$USE_LOONGARCH_CRC32C" = x"1"; then
diff --git a/src/include/port/pg_cpucap.h b/src/include/port/pg_cpucap.h
index af3fabfcff..d623db43e4 100644
--- a/src/include/port/pg_cpucap.h
+++ b/src/include/port/pg_cpucap.h
@@ -14,17 +14,29 @@
#ifndef PG_CPUCAP_H
#define PG_CPUCAP_H
-#define PGCPUCAP_INIT (1 << 0)
-#define PGCPUCAP_POPCNT (1 << 1)
-#define PGCPUCAP_VPOPCNT (1 << 2)
-#define PGCPUCAP_CRC32C (1 << 3)
-#define PGCPUCAP_CLMUL (1 << 4)
+enum pg_cpucap__
+{
+ PG_CPU_FEATURE_INIT = 0,
+ // X86
+ PG_CPU_FEATURE_SSE42 = 1,
+ PG_CPU_FEATURE_POPCNT = 2,
+ PG_CPU_FEATURE_PCLMUL = 3,
-extern PGDLLIMPORT uint32 pg_cpucap;
-extern void pg_cpucap_initialize(void);
+ /* SKX: */
+ PG_CPU_FEATURE_AVX512F = 30,
+ PG_CPU_FEATURE_AVX512BW = 31,
+
+ /* ICX */
+ PG_CPU_FEATURE_AVX512VPOPCNTDQ = 40,
+
+ // ARM
+ PG_CPU_FEATURE_ARMV8_CRC32C = 100,
-/* arch-specific functions private to src/port */
-extern void pg_cpucap_crc32c(void);
-extern void pg_cpucap_clmul(void);
+ PG_CPU_FEATURE_MAX
+};
+
+
+extern void pg_cpucap_initialize(void);
+bool pg_cpu_have(int feature_id);
#endif /* PG_CPUCAP_H */
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 4f0ebb9923..41b648bea6 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -48,6 +48,7 @@ typedef uint32 pg_crc32c;
((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_sse42((crc), (data), (len)))
+#define PGCPUCAP_CRC32C pg_cpu_have(PG_CPU_FEATURE_SSE42)
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
#if defined(USE_SSE42_CRC32C)
@@ -66,6 +67,7 @@ extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t le
((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_armv8((crc), (data), (len)))
+#define PGCPUCAP_CRC32C pg_cpu_have(PG_CPU_FEATURE_ARMV8_CRC32C)
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
#if defined(USE_ARMV8_CRC32C)
@@ -125,13 +127,13 @@ pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
*/
// WIP: how to best intialize in frontend?
#ifndef FRONTEND
- Assert(pg_cpucap & PGCPUCAP_INIT);
+ Assert(pg_cpu_have(PG_CPU_FEATURE) == 1);
#endif
#if defined(HAVE_CRC_COMPTIME)
return COMP_CRC32C_HW(crc, data, len);
#else
- if (pg_cpucap & PGCPUCAP_CRC32C)
+ if (PGCPUCAP_CRC32C)
return COMP_CRC32C_HW(crc, data, len);
else
return pg_comp_crc32c_sb8(crc, data, len);
diff --git a/src/port/Makefile b/src/port/Makefile
index 1fc03713b3..5a05179e92 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -45,8 +45,6 @@ OBJS = \
path.o \
pg_bitutils.o \
pg_cpucap.o \
- pg_cpucap_x86.o \
- pg_cpucap_arm.o \
pg_popcount_avx512.o \
pg_strong_random.o \
pgcheckdir.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index baa8e16200..922ab1ad73 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -78,18 +78,6 @@ if host_system != 'windows'
replace_funcs_neg += [['pthread_barrier_wait']]
endif
-# arch-specific runtime checks
-if host_cpu == 'x86' or host_cpu == 'x86_64'
- pgport_sources += files(
- 'pg_cpucap_x86.c'
- )
-
-elif host_cpu == 'arm' or host_cpu == 'aarch64'
- pgport_sources += files(
- 'pg_cpucap_arm.c'
- )
-endif
-
# Replacement functionality to be built if corresponding configure symbol
# is true
replace_funcs_pos = [
diff --git a/src/port/pg_bitutils.c b/src/port/pg_bitutils.c
index 5677525693..4cf05f55a6 100644
--- a/src/port/pg_bitutils.c
+++ b/src/port/pg_bitutils.c
@@ -12,14 +12,8 @@
*/
#include "c.h"
-#ifdef HAVE__GET_CPUID
-#include <cpuid.h>
-#endif
-#ifdef HAVE__CPUID
-#include <intrin.h>
-#endif
-
#include "port/pg_bitutils.h"
+#include "port/pg_cpucap.h"
/*
@@ -133,17 +127,7 @@ uint64 (*pg_popcount_masked_optimized) (const char *buf, int bytes, bits8 mask)
static bool
pg_popcount_available(void)
{
- unsigned int exx[4] = {0, 0, 0, 0};
-
-#if defined(HAVE__GET_CPUID)
- __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
-#elif defined(HAVE__CPUID)
- __cpuid(exx, 1);
-#else
-#error cpuid instruction not available
-#endif
-
- return (exx[2] & (1 << 23)) != 0; /* POPCNT */
+ return pg_cpu_have(PG_CPU_FEATURE_POPCNT);
}
/*
diff --git a/src/port/pg_cpucap.c b/src/port/pg_cpucap.c
index 301bd9fc2c..b1e38065c7 100644
--- a/src/port/pg_cpucap.c
+++ b/src/port/pg_cpucap.c
@@ -15,20 +15,179 @@
#include "port/pg_cpucap.h"
+#ifdef HAVE__GET_CPUID
+#include <cpuid.h>
+#endif
-/* starts uninitialized so we can detect errors of omission */
-uint32 pg_cpucap = 0;
+#ifdef HAVE__CPUID
+#include <intrin.h>
+#endif
+
+static unsigned char pg_cpucap[PG_CPU_FEATURE_MAX];
+
+#ifdef __x86_64__
+// for _xgetbv
+#include <immintrin.h>
+/*
+ * Does XGETBV say the ZMM registers are enabled?
+ *
+ * NB: Caller is responsible for verifying that xsave_available() returns true
+ * before calling this.
+ */
+#ifdef HAVE_XSAVE_INTRINSICS
+pg_attribute_target("xsave")
+#endif
+static inline bool
+zmm_regs_available(void)
+{
+#ifdef HAVE_XSAVE_INTRINSICS
+ return (_xgetbv(0) & 0xe6) == 0xe6;
+#else
+ return false;
+#endif
+}
+
+static void pg_cpuid(int leaf, int subleaf, unsigned int* exx)
+{
+#if defined(HAVE__GET_CPUID_COUNT)
+ __get_cpuid_count(leaf, subleaf, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUIDEX)
+ __cpuidex(exx, leaf, subleaf);
+#else
+#error cpuid instruction not available
+#endif
+}
+
+static void
+pg_cpucap_x86(void)
+{
+ unsigned int exx[4] = {0, 0, 0, 0};
+ pg_cpuid(1, 0, exx);
+
+ pg_cpucap[PG_CPU_FEATURE_SSE42] = (exx[2] & (1 << 20)) != 0;
+ pg_cpucap[PG_CPU_FEATURE_PCLMUL] = (exx[2] & (1 << 1)) != 0;
+ pg_cpucap[PG_CPU_FEATURE_POPCNT] = (exx[2] & (1 << 23)) != 0;
+ /* osxsave */
+ if ((exx[2] & (1 << 27)) == 0) {
+ return;
+ }
+ /* avx512 os support */
+ if (zmm_regs_available()) {
+ return;
+ }
+ /* second cpuid call on leaf 7 to check extended avx512 support */
+ pg_cpuid(7, 0, exx);
+
+ pg_cpucap[PG_CPU_FEATURE_AVX512F] = (exx[1] & (1 << 16)) != 0;
+ pg_cpucap[PG_CPU_FEATURE_AVX512BW] = (exx[1] & (1 << 30)) != 0;
+ pg_cpucap[PG_CPU_FEATURE_AVX512VPOPCNTDQ] = (exx[2] & (1 << 14)) != 0;
+
+}
+#else // ARM
+static bool
+pg_crc32c_armv8_available(void)
+{
+#if defined(HAVE_ELF_AUX_INFO)
+ unsigned long value;
+
+#ifdef __aarch64__
+ return elf_aux_info(AT_HWCAP, &value, sizeof(value)) == 0 &&
+ (value & HWCAP_CRC32) != 0;
+#else
+ return elf_aux_info(AT_HWCAP2, &value, sizeof(value)) == 0 &&
+ (value & HWCAP2_CRC32) != 0;
+#endif
+#elif defined(HAVE_GETAUXVAL)
+#ifdef __aarch64__
+ return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
+#else
+ return (getauxval(AT_HWCAP2) & HWCAP2_CRC32) != 0;
+#endif
+#elif defined(__NetBSD__)
+ /*
+ * On NetBSD we can read the Instruction Set Attribute Registers via
+ * sysctl. For doubtless-historical reasons the sysctl interface is
+ * completely different on 64-bit than 32-bit, but the underlying
+ * registers contain the same fields.
+ */
+#define ISAR0_CRC32_BITPOS 16
+#define ISAR0_CRC32_BITWIDTH 4
+#define WIDTHMASK(w) ((1 << (w)) - 1)
+#define SYSCTL_CPU_ID_MAXSIZE 64
+
+ size_t len;
+ uint64 sysctlbuf[SYSCTL_CPU_ID_MAXSIZE];
+#if defined(__aarch64__)
+ /* We assume cpu0 is representative of all the machine's CPUs. */
+ const char *path = "machdep.cpu0.cpu_id";
+ size_t expected_len = sizeof(struct aarch64_sysctl_cpu_id);
+#define ISAR0 ((struct aarch64_sysctl_cpu_id *) sysctlbuf)->ac_aa64isar0
+#else
+ const char *path = "machdep.id_isar";
+ size_t expected_len = 6 * sizeof(int);
+#define ISAR0 ((int *) sysctlbuf)[5]
+#endif
+ uint64 fld;
+
+ /* Fetch the appropriate set of register values. */
+ len = sizeof(sysctlbuf);
+ memset(sysctlbuf, 0, len);
+ if (sysctlbyname(path, sysctlbuf, &len, NULL, 0) != 0)
+ return false; /* perhaps kernel is 64-bit and we aren't? */
+ if (len != expected_len)
+ return false; /* kernel API change? */
+
+ /* Fetch the CRC32 field from ISAR0. */
+ fld = (ISAR0 >> ISAR0_CRC32_BITPOS) & WIDTHMASK(ISAR0_CRC32_BITWIDTH);
+
+ /*
+ * Current documentation defines only the field values 0 (No CRC32) and 1
+ * (CRC32B/CRC32H/CRC32W/CRC32X/CRC32CB/CRC32CH/CRC32CW/CRC32CX). Assume
+ * that any future nonzero value will be a superset of 1.
+ */
+ return (fld != 0);
+#else
+ return false;
+#endif
+}
+
+static void
+pg_cpucap_arm(void)
+{
+ if (pg_crc32c_armv8_available()) {
+ pg_cpucap[PG_CPU_FEATURE_ARMV8_CRC32C] = 1;
+ }
+}
+#endif
+
+
+static void
+pg_cpucap_arch()
+{
+ /* WIP: configure checks */
+#ifdef __x86_64__
+ pg_cpucap_x86();
+#else // ARM:
+ pg_cpucap_arm();
+#endif
+}
/*
* This needs to be called in main() for every
* program that calls a function that dispatches
* according to CPU features.
*/
-void
-pg_cpucap_initialize(void)
+void pg_cpucap_initialize(void)
{
- pg_cpucap = PGCPUCAP_INIT;
+ /* Initialize everything to zero */
+ memset(pg_cpucap, 0, sizeof(pg_cpucap[0]) * PG_CPU_FEATURE_MAX);
+ pg_cpucap[PG_CPU_FEATURE_INIT] = 1;
+
+ pg_cpucap_arch();
+}
- pg_cpucap_crc32c();
- pg_cpucap_clmul();
+/* Access to pg_cpucap for modules that need runtime CPUID information */
+bool pg_cpu_have(int feature_id)
+{
+ return pg_cpucap[feature_id];
}
diff --git a/src/port/pg_cpucap_arm.c b/src/port/pg_cpucap_arm.c
deleted file mode 100644
index e080a5a931..0000000000
--- a/src/port/pg_cpucap_arm.c
+++ /dev/null
@@ -1,119 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * pg_cpucap_arm.c
- * Check if the CPU we're running on supports the ARMv8 CRC Extension.
- *
- * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- *
- * IDENTIFICATION
- * src/port/pg_cpucap_arm.c
- *
- *-------------------------------------------------------------------------
- */
-
-#ifndef FRONTEND
-#include "postgres.h"
-#else
-#include "postgres_fe.h"
-#endif
-
-#if defined(HAVE_ELF_AUX_INFO) || defined(HAVE_GETAUXVAL)
-#include <sys/auxv.h>
-#if defined(__linux__) && !defined(__aarch64__) && !defined(HWCAP2_CRC32)
-#include <asm/hwcap.h>
-#endif
-#endif
-
-#if defined(__NetBSD__)
-#include <sys/sysctl.h>
-#if defined(__aarch64__)
-#include <aarch64/armreg.h>
-#endif
-#endif
-
-#include "port/pg_crc32c.h"
-
-static bool
-pg_crc32c_armv8_available(void)
-{
-#if defined(HAVE_ELF_AUX_INFO)
- unsigned long value;
-
-#ifdef __aarch64__
- return elf_aux_info(AT_HWCAP, &value, sizeof(value)) == 0 &&
- (value & HWCAP_CRC32) != 0;
-#else
- return elf_aux_info(AT_HWCAP2, &value, sizeof(value)) == 0 &&
- (value & HWCAP2_CRC32) != 0;
-#endif
-#elif defined(HAVE_GETAUXVAL)
-#ifdef __aarch64__
- return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
-#else
- return (getauxval(AT_HWCAP2) & HWCAP2_CRC32) != 0;
-#endif
-#elif defined(__NetBSD__)
- /*
- * On NetBSD we can read the Instruction Set Attribute Registers via
- * sysctl. For doubtless-historical reasons the sysctl interface is
- * completely different on 64-bit than 32-bit, but the underlying
- * registers contain the same fields.
- */
-#define ISAR0_CRC32_BITPOS 16
-#define ISAR0_CRC32_BITWIDTH 4
-#define WIDTHMASK(w) ((1 << (w)) - 1)
-#define SYSCTL_CPU_ID_MAXSIZE 64
-
- size_t len;
- uint64 sysctlbuf[SYSCTL_CPU_ID_MAXSIZE];
-#if defined(__aarch64__)
- /* We assume cpu0 is representative of all the machine's CPUs. */
- const char *path = "machdep.cpu0.cpu_id";
- size_t expected_len = sizeof(struct aarch64_sysctl_cpu_id);
-#define ISAR0 ((struct aarch64_sysctl_cpu_id *) sysctlbuf)->ac_aa64isar0
-#else
- const char *path = "machdep.id_isar";
- size_t expected_len = 6 * sizeof(int);
-#define ISAR0 ((int *) sysctlbuf)[5]
-#endif
- uint64 fld;
-
- /* Fetch the appropriate set of register values. */
- len = sizeof(sysctlbuf);
- memset(sysctlbuf, 0, len);
- if (sysctlbyname(path, sysctlbuf, &len, NULL, 0) != 0)
- return false; /* perhaps kernel is 64-bit and we aren't? */
- if (len != expected_len)
- return false; /* kernel API change? */
-
- /* Fetch the CRC32 field from ISAR0. */
- fld = (ISAR0 >> ISAR0_CRC32_BITPOS) & WIDTHMASK(ISAR0_CRC32_BITWIDTH);
-
- /*
- * Current documentation defines only the field values 0 (No CRC32) and 1
- * (CRC32B/CRC32H/CRC32W/CRC32X/CRC32CB/CRC32CH/CRC32CW/CRC32CX). Assume
- * that any future nonzero value will be a superset of 1.
- */
- return (fld != 0);
-#else
- return false;
-#endif
-}
-
-/*
- * Check if hardware instructions for CRC computation are available.
- */
-void
-pg_cpucap_crc32c(void)
-{
- if (pg_crc32c_armv8_available())
- pg_cpucap |= PGCPUCAP_CRC32C;
-}
-
-void
-pg_cpucap_clmul(void)
-{
- // WIP: does this even make sense?
-}
diff --git a/src/port/pg_cpucap_x86.c b/src/port/pg_cpucap_x86.c
deleted file mode 100644
index 3a62a3a582..0000000000
--- a/src/port/pg_cpucap_x86.c
+++ /dev/null
@@ -1,75 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * pg_cpucap_x86.c
- * Check if the CPU we're running on supports SSE4.2.
- *
- * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- *
- * IDENTIFICATION
- * src/port/pg_cpucap_x86.c
- *
- *-------------------------------------------------------------------------
- */
-
-#include "c.h"
-
-#ifdef HAVE__GET_CPUID
-#include <cpuid.h>
-#endif
-
-#ifdef HAVE__CPUID
-#include <intrin.h>
-#endif
-
-#include "port/pg_cpucap.h"
-
-static bool
-pg_sse42_available(void)
-{
- unsigned int exx[4] = {0, 0, 0, 0};
-
-#if defined(HAVE__GET_CPUID)
- __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
-#elif defined(HAVE__CPUID)
- __cpuid(exx, 1);
-#else
-#error cpuid instruction not available
-#endif
-
- return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
-}
-
-static bool
-pg_pclmul_available(void)
-{
- unsigned int exx[4] = {0, 0, 0, 0};
-
-#if defined(HAVE__GET_CPUID)
- __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
-#elif defined(HAVE__CPUID)
- __cpuid(exx, 1);
-#else
-#error cpuid instruction not available
-#endif
-
- return (exx[2] & (1 << 1)) != 0; /* PCLMUL */
-}
-
-/*
- * Check if hardware instructions for CRC computation are available.
- */
-void
-pg_cpucap_crc32c(void)
-{
- if (pg_sse42_available())
- pg_cpucap |= PGCPUCAP_CRC32C;
-}
-
-void
-pg_cpucap_clmul(void)
-{
- if (pg_pclmul_available())
- pg_cpucap |= PGCPUCAP_CLMUL;
-}
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index fc3cf0d088..7131b4a326 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -18,6 +18,7 @@
#include <wmmintrin.h>
#include "port/pg_crc32c.h"
+#include "port/pg_cpucap.h"
/* WIP: configure checks */
#ifdef __x86_64__
@@ -140,7 +141,7 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
const size_t orig_len PG_USED_FOR_ASSERTS_ONLY = len;
#ifdef HAVE_PCLMUL_RUNTIME
- if (len >= PCLMUL_THRESHOLD && (pg_cpucap & PGCPUCAP_CLMUL))
+ if (len >= PCLMUL_THRESHOLD && (pg_cpu_have(PG_CPU_FEATURE_PCLMUL)))
{
return pg_comp_crc32c_pclmul(crc, data, len);
}
diff --git a/src/port/pg_popcount_avx512.c b/src/port/pg_popcount_avx512.c
index dac895a0fc..7f5846b1bd 100644
--- a/src/port/pg_popcount_avx512.c
+++ b/src/port/pg_popcount_avx512.c
@@ -14,17 +14,10 @@
#ifdef USE_AVX512_POPCNT_WITH_RUNTIME_CHECK
-#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT)
-#include <cpuid.h>
-#endif
-
#include <immintrin.h>
-#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX)
-#include <intrin.h>
-#endif
-
#include "port/pg_bitutils.h"
+#include "port/pg_cpucap.h"
/*
* It's probably unlikely that TRY_POPCNT_FAST won't be set if we are able to
@@ -33,63 +26,6 @@
*/
#ifdef TRY_POPCNT_FAST
-/*
- * Does CPUID say there's support for XSAVE instructions?
- */
-static inline bool
-xsave_available(void)
-{
- unsigned int exx[4] = {0, 0, 0, 0};
-
-#if defined(HAVE__GET_CPUID)
- __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
-#elif defined(HAVE__CPUID)
- __cpuid(exx, 1);
-#else
-#error cpuid instruction not available
-#endif
- return (exx[2] & (1 << 27)) != 0; /* osxsave */
-}
-
-/*
- * Does XGETBV say the ZMM registers are enabled?
- *
- * NB: Caller is responsible for verifying that xsave_available() returns true
- * before calling this.
- */
-#ifdef HAVE_XSAVE_INTRINSICS
-pg_attribute_target("xsave")
-#endif
-static inline bool
-zmm_regs_available(void)
-{
-#ifdef HAVE_XSAVE_INTRINSICS
- return (_xgetbv(0) & 0xe6) == 0xe6;
-#else
- return false;
-#endif
-}
-
-/*
- * Does CPUID say there's support for AVX-512 popcount and byte-and-word
- * instructions?
- */
-static inline bool
-avx512_popcnt_available(void)
-{
- unsigned int exx[4] = {0, 0, 0, 0};
-
-#if defined(HAVE__GET_CPUID_COUNT)
- __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
-#elif defined(HAVE__CPUIDEX)
- __cpuidex(exx, 7, 0);
-#else
-#error cpuid instruction not available
-#endif
- return (exx[2] & (1 << 14)) != 0 && /* avx512-vpopcntdq */
- (exx[1] & (1 << 30)) != 0; /* avx512-bw */
-}
-
/*
* Returns true if the CPU supports the instructions required for the AVX-512
* pg_popcount() implementation.
@@ -97,9 +33,8 @@ avx512_popcnt_available(void)
bool
pg_popcount_avx512_available(void)
{
- return xsave_available() &&
- zmm_regs_available() &&
- avx512_popcnt_available();
+ return pg_cpu_have(PG_CPU_FEATURE_AVX512VPOPCNTDQ) &&
+ pg_cpu_have(PG_CPU_FEATURE_AVX512BW);
}
/*
--
2.43.0
On Wed, Feb 26, 2025 at 7:21 AM Devulapalli, Raghuveer
<raghuveer.devulapalli@intel.com> wrote:
I agree it would be preferable to make a centralized check work.
Here is my first stab at it. v9 is the same as v8 plus a commit that moves all cpuid checks into a single place, including the AVX512 popcount code. Any new feature that requires CPUID information can access it with pg_cpu_have(FEATURE), declared in pg_cpucap.h and defined in pg_cpucap.c. v8 also had a typo in the configure files which caused a build failure. It's fixed in v9.
Pretty sure the ARM code paths need some correction. Let me know what you think.
Thanks, I think this is a good direction. Some comments:
+static void pg_cpuid(int leaf, int subleaf, unsigned int* exx)
+{
+#if defined(HAVE__GET_CPUID_COUNT)
+ __get_cpuid_count(leaf, subleaf, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUIDEX)
+ __cpuidex(exx, leaf, subleaf);
+#else
+#error cpuid instruction not available
+#endif
+}
Our current configure still looks for __get_cpuid and __cpuid. We
committed checking these new ones fairly recently, and they were
further gated by USE_AVX512_POPCNT_WITH_RUNTIME_CHECK. It seems like
here we should do something like the following, where "+" lines are
from the patch and other lines are mine:
+static void
+pg_cpucap_x86(void)
+{
+ unsigned int exx[4] = {0, 0, 0, 0};
#if defined(HAVE__GET_CPUID)
__get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
#elif defined(HAVE__CPUID)
__cpuid(exx, 1);
#endif
+ pg_cpucap[PG_CPU_FEATURE_SSE42] = (exx[2] & (1 << 20)) != 0;
+ pg_cpucap[PG_CPU_FEATURE_PCLMUL] = (exx[2] & (1 << 1)) != 0;
+ pg_cpucap[PG_CPU_FEATURE_POPCNT] = (exx[2] & (1 << 23)) != 0;
+ /* osxsave */
+ if ((exx[2] & (1 << 27)) == 0) {
+ return;
+ }
+ /* avx512 os support */
+ if (zmm_regs_available()) {
+ return;
+ }
// BTW, I do like the gating here that reduces the number of places
that have to know about zmm and xsave. (Side note: shouldn't that be
"if !(zmm_regs_available())"?)
+ /* reset for second cpuid call on leaf 7 to check extended avx512
support */
exx[4] = {0, 0, 0, 0};
#if defined(HAVE__GET_CPUID_COUNT)
__get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
#elif defined(HAVE__CPUIDEX)
__cpuidex(exx, 7, 0);
#endif
+ pg_cpucap[PG_CPU_FEATURE_AVX512F] = (exx[1] & (1 << 16)) != 0;
+ pg_cpucap[PG_CPU_FEATURE_AVX512BW] = (exx[1] & (1 << 30)) != 0;
+ pg_cpucap[PG_CPU_FEATURE_AVX512VPOPCNTDQ] = (exx[2] & (1 << 14)) != 0;
+
+}
What do you think?
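To make the gating discussed above concrete, here is a minimal, self-contained sketch of the intended order of checks, with the inverted ZMM test corrected. It is not code from the patch: the function name detect_x86_caps, the CAP_* slots, and the idea of passing the cpuid register values and the XGETBV result in as parameters are all assumptions made so that the logic can be exercised without executing cpuid.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical feature slots, for illustration only. */
enum
{
	CAP_SSE42,
	CAP_PCLMUL,
	CAP_POPCNT,
	CAP_AVX512F,
	CAP_AVX512BW,
	CAP_AVX512VPOPCNTDQ,
	CAP_MAX
};

/*
 * leaf1 and leaf7 hold EAX, EBX, ECX, EDX as cpuid would return them for
 * leaf 1 and for leaf 7 / subleaf 0.  zmm_enabled stands in for the XGETBV
 * check; all three are parameters only so the sketch is testable.
 */
static void
detect_x86_caps(const uint32_t leaf1[4], const uint32_t leaf7[4],
				bool zmm_enabled, unsigned char caps[CAP_MAX])
{
	memset(caps, 0, CAP_MAX);

	caps[CAP_SSE42] = (leaf1[2] & (1u << 20)) != 0;
	caps[CAP_PCLMUL] = (leaf1[2] & (1u << 1)) != 0;
	caps[CAP_POPCNT] = (leaf1[2] & (1u << 23)) != 0;

	/* osxsave: without it, XGETBV must not be consulted */
	if ((leaf1[2] & (1u << 27)) == 0)
		return;

	/* stop here when the OS has NOT enabled the ZMM registers */
	if (!zmm_enabled)
		return;

	caps[CAP_AVX512F] = (leaf7[1] & (1u << 16)) != 0;
	caps[CAP_AVX512BW] = (leaf7[1] & (1u << 30)) != 0;
	caps[CAP_AVX512VPOPCNTDQ] = (leaf7[2] & (1u << 14)) != 0;
}
```

With this ordering the AVX-512 slots stay zero unless both OSXSAVE and the ZMM state check pass, while the leaf 1 features are recorded regardless.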
-#define PGCPUCAP_INIT (1 << 0)
-#define PGCPUCAP_POPCNT (1 << 1)
-#define PGCPUCAP_VPOPCNT (1 << 2)
-#define PGCPUCAP_CRC32C (1 << 3)
-#define PGCPUCAP_CLMUL (1 << 4)
+enum pg_cpucap__
+{
+ PG_CPU_FEATURE_INIT = 0,
+ // X86
+ PG_CPU_FEATURE_SSE42 = 1,
+ PG_CPU_FEATURE_POPCNT = 2,
+ PG_CPU_FEATURE_PCLMUL = 3,
[...]
+ PG_CPU_FEATURE_ARMV8_CRC32C = 100,
I'm not a fan of exposing these architecture-specific details to
places that consult the capabilities. That requires things like
+#define PGCPUCAP_CRC32C pg_cpu_have(PG_CPU_FEATURE_SSE42)
[...]
+#define PGCPUCAP_CRC32C pg_cpu_have(PG_CPU_FEATURE_ARMV8_CRC32C)
I'd prefer to have 1 capability <-> one place to check at runtime for
architectures that need that, and to keep architecture details private
to the initialization step. Even for things that test for which
function pointer to use, I think it's a cleaner interface to look at
one thing.
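As a rough illustration of the "one capability, one place to check" idea, the sketch below folds architecture-specific facts into architecture-neutral bits during initialization, so callers only ever test a neutral flag. The names RawCpuFacts, caps_from_raw, cap_have and the CAP_* flags are hypothetical and do not appear in any version of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architecture-neutral capability bits (hypothetical names). */
#define CAP_INIT	(1u << 0)
#define CAP_CRC32C	(1u << 1)
#define CAP_POPCNT	(1u << 2)

/* Raw facts that an architecture-specific probe might collect. */
typedef struct RawCpuFacts
{
	bool		x86_sse42;
	bool		x86_popcnt;
	bool		arm_crc32;
} RawCpuFacts;

/* Only this function knows which ISA feature backs which capability. */
static uint32_t
caps_from_raw(const RawCpuFacts *raw)
{
	uint32_t	caps = CAP_INIT;

	/* either ISA's CRC instructions satisfy the same neutral capability */
	if (raw->x86_sse42 || raw->arm_crc32)
		caps |= CAP_CRC32C;
	if (raw->x86_popcnt)
		caps |= CAP_POPCNT;
	return caps;
}

/* Callers test a neutral flag and never mention SSE 4.2 or ARMv8. */
static bool
cap_have(uint32_t caps, uint32_t flag)
{
	return (caps & flag) != 0;
}
```

Under this layout a dispatch site would test CAP_CRC32C on every platform, and no per-architecture #define of the check would be needed in the shared header.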
+static void
+pg_cpucap_arch()
+{
+ /* WIP: configure checks */
+#ifdef __x86_64__
+ pg_cpucap_x86();
+#else // ARM:
+ pg_cpucap_arm();
+#endif
+}
If we're going to have a single file for the init step, we don't need
this -- we'd just have a different definition of
pg_cpucap_initialize() in each part, with a default that only adds the
"init" slot:
#if defined( __i386__ ) || defined(i386) || defined(_M_IX86) ||
defined(__x86_64__) || defined(__x86_64) || defined(_M_AMD64)
<cpuid checks>
#elif defined(__arm__) || defined(__arm) ||
defined(__aarch64__) || defined(_M_ARM64)
<for now only pg_crc32c_armv8_available()>
#else
<only init>
#endif
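The layout outlined above could look roughly like the following sketch, in which each architecture section supplies its own definition of the initializer and the fallback sets only the "init" bit. The names cpucap, cpucap_initialize and the CAP_* flags are placeholders, the x86 branch is additionally restricted to GCC-compatible compilers so that it can use __get_cpuid from <cpuid.h>, and the ARM probe is deliberately left out.

```c
#include <stdint.h>

#define CAP_INIT	(1u << 0)
#define CAP_CRC32C	(1u << 1)

static uint32_t cpucap;			/* 0 means "never initialized" */

#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)

#include <cpuid.h>

static void
cpucap_initialize(void)
{
	unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

	cpucap = CAP_INIT;
	/* __get_cpuid returns 0 if the requested leaf is unsupported */
	if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && (ecx & (1u << 20)))
		cpucap |= CAP_CRC32C;	/* SSE 4.2 */
}

#elif defined(__arm__) || defined(__aarch64__)

static void
cpucap_initialize(void)
{
	cpucap = CAP_INIT;
	/* an ARM build would run its CRC probe here; omitted in this sketch */
}

#else

static void
cpucap_initialize(void)
{
	cpucap = CAP_INIT;			/* only the "init" slot */
}

#endif
```

Because every branch defines the same function, no separate per-architecture wrapper is needed, and an unrecognized platform still passes the "was it initialized" assertion.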
--
John Naylor
Amazon Web Services
Attached v10 with minor changes based on the feedback.
Thanks, I think this is a good direction. Some comments:
#if defined(HAVE__GET_CPUID)
__get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
#elif defined(HAVE__CPUID)
__cpuid(exx, 1);
#endif
Oh yeah good catch. Fixed in v10.
+ }
+ /* avx512 os support */
+ if (zmm_regs_available()) {
+ return;
+ }
// BTW, I do like the gating here that reduces the number of places
that have to know about zmm and xsave. (Side note: shouldn't that be
"if !(zmm_regs_available())"?)
Yup, good catch again 😊
What do you think?
Yup, those look correct to me. Fixed them in v10.
+ PG_CPU_FEATURE_PCLMUL = 3,
[...]
+ PG_CPU_FEATURE_ARMV8_CRC32C = 100,
I'm not a fan of exposing these architecture-specific details to
places that consult the capabilities. That requires things like
+#define PGCPUCAP_CRC32C pg_cpu_have(PG_CPU_FEATURE_SSE42)
[...]
+#define PGCPUCAP_CRC32C pg_cpu_have(PG_CPU_FEATURE_ARMV8_CRC32C)
I'd prefer to have 1 capability <-> one place to check at runtime for
architectures that need that, and to keep architecture details private
to the initialization step. Even for things that test for which
function pointer to use, I think it's a cleaner interface to look at
one thing.
Isn't that one thing currently pg_cpu_have(FEATURE)? The reason we have the additional PGCPUCAP_CRC32C is that the dispatch for CRC32C is currently defined in the header file common to both ARM and SSE42. Ideally we would have separate dispatch for ARM and x86 in their own files (the way it is in the main branch). That would also make it easier to add more implementations in the future without having to make the dispatch function work for both ARM and x86.
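For illustration, an x86-only dispatch kept in its own file might choose between implementations along these lines. This is a sketch under assumptions: CrcPath, choose_crc_path_x86 and the CAP_* flags are invented names, and the 128-byte cutoff simply mirrors the WIP PCLMUL_THRESHOLD value from the earlier patch in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability flags and threshold, for illustration only. */
#define CAP_CRC32C		(1u << 0)	/* crc32 instruction (SSE 4.2) */
#define CAP_CLMUL		(1u << 1)	/* pclmulqdq */
#define CLMUL_THRESHOLD	128

typedef enum CrcPath
{
	CRC_PATH_PORTABLE,			/* slicing-by-8 fallback */
	CRC_PATH_HW,				/* crc32 instruction loop */
	CRC_PATH_CLMUL				/* carryless-multiply folding */
} CrcPath;

/*
 * Pick an implementation for an input of the given length.  The folding
 * path also finishes with crc32 instructions, so it requires both bits.
 */
static CrcPath
choose_crc_path_x86(uint32_t caps, size_t len)
{
	if (!(caps & CAP_CRC32C))
		return CRC_PATH_PORTABLE;
	if ((caps & CAP_CLMUL) && len >= CLMUL_THRESHOLD)
		return CRC_PATH_CLMUL;
	return CRC_PATH_HW;
}
```

An ARM file could then carry its own, simpler chooser without the shared header having to accommodate both.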
If we're going to have a single file for the init step, we don't need
this -- we'd just have a different definition of
pg_cpucap_initialize() in each part, with a default that only adds the
"init" slot:
#if defined( __i386__ ) || defined(i386) || defined(_M_IX86) ||
defined(__x86_64__) || defined(__x86_64) || defined(_M_AMD64)
<cpuid checks>
#elif defined(__arm__) || defined(__arm) ||
defined(__aarch64__) || defined(_M_ARM64)
<for now only pg_crc32c_armv8_available()>
#else
<only init>
#endif
Makes sense. Got rid of it in v10.
Raghuveer
Attachments:
v10-0001-Dispatch-CRC-computation-by-branching-rather-tha.patch
From 2a8a44c7fe8cfed6c7298533d633688cd2efd0b3 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Sat, 15 Feb 2025 19:18:16 +0700
Subject: [PATCH v10 1/5] Dispatch CRC computation by branching rather than
indirect calls
Signed-off-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
---
src/backend/postmaster/postmaster.c | 4 ++
src/include/port/pg_cpucap.h | 25 +++++++++
src/include/port/pg_crc32c.h | 78 +++++++++++++++++++++--------
src/port/Makefile | 1 +
src/port/meson.build | 4 ++
src/port/pg_cpucap.c | 51 +++++++++++++++++++
src/port/pg_crc32c_armv8_choose.c | 26 +---------
src/port/pg_crc32c_sse42_choose.c | 26 +---------
8 files changed, 145 insertions(+), 70 deletions(-)
create mode 100644 src/include/port/pg_cpucap.h
create mode 100644 src/port/pg_cpucap.c
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index bb22b13ade..4fa95f1d2c 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -99,6 +99,7 @@
#include "pg_getopt.h"
#include "pgstat.h"
#include "port/pg_bswap.h"
+#include "port/pg_cpucap.h"
#include "postmaster/autovacuum.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/pgarch.h"
@@ -1951,6 +1952,9 @@ InitProcessGlobals(void)
#ifndef WIN32
srandom(pg_prng_uint32(&pg_global_prng_state));
#endif
+
+ /* detect CPU capabilities */
+ pg_cpucap_initialize();
}
/*
diff --git a/src/include/port/pg_cpucap.h b/src/include/port/pg_cpucap.h
new file mode 100644
index 0000000000..81edfedce5
--- /dev/null
+++ b/src/include/port/pg_cpucap.h
@@ -0,0 +1,25 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_cpucap.h
+ * Runtime detection of CPU capabilities.
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * src/include/port/pg_cpucap.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_CPUCAP_H
+#define PG_CPUCAP_H
+
+#define PGCPUCAP_INIT (1 << 0)
+#define PGCPUCAP_POPCNT (1 << 1)
+#define PGCPUCAP_VPOPCNT (1 << 2)
+#define PGCPUCAP_CRC32C (1 << 3)
+
+extern PGDLLIMPORT uint32 pg_cpucap;
+extern void pg_cpucap_initialize(void);
+
+#endif /* PG_CPUCAP_H */
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 65ebeacf4b..b565a0f294 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -34,6 +34,7 @@
#define PG_CRC32C_H
#include "port/pg_bswap.h"
+#include "port/pg_cpucap.h"
typedef uint32 pg_crc32c;
@@ -41,52 +42,55 @@ typedef uint32 pg_crc32c;
#define INIT_CRC32C(crc) ((crc) = 0xFFFFFFFF)
#define EQ_CRC32C(c1, c2) ((c1) == (c2))
-#if defined(USE_SSE42_CRC32C)
+#if defined(USE_SSE42_CRC32C) || defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
/* Use Intel SSE4.2 instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_sse42((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#if defined(USE_SSE42_CRC32C)
+#define HAVE_CRC_COMPTIME
+#else
+#define HAVE_CRC_RUNTIME
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif
+
+extern bool pg_crc32c_sse42_available(void);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_ARMV8_CRC32C)
+#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
/* Use ARMv8 CRC Extension instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_armv8((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#if defined(USE_ARMV8_CRC32C)
+#define HAVE_CRC_COMPTIME
+#else
+#define HAVE_CRC_RUNTIME
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif
+
+extern bool pg_crc32c_armv8_available(void);
extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
#elif defined(USE_LOONGARCH_CRC32C)
/* Use LoongArch CRCC instructions. */
#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
+#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_loongarch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+#define HAVE_CRC_COMPTIME
extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
-
-/*
- * Use Intel SSE 4.2 or ARMv8 instructions, but perform a runtime check first
- * to check that they are available.
- */
-#define COMP_CRC32C(crc, data, len) \
- ((crc) = pg_comp_crc32c((crc), (data), (len)))
-#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
-
-extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
-extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
-
-#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#endif
-#ifdef USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
-#endif
-
#else
/*
* Use slicing-by-8 algorithm.
@@ -105,6 +109,36 @@ extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t le
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+#endif /* end of CPU-specific symbols */
+
+#if defined(HAVE_CRC_COMPTIME) || defined(HAVE_CRC_RUNTIME)
+/*
+ * Check if the CPU we're running on supports special
+ * instructions for CRC-32C computation. Otherwise, fall
+ * back to the pure software implementation (slicing-by-8).
+ */
+static inline pg_crc32c
+pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
+{
+ /*
+ * If this is firing in a frontend program, first look if you forgot a
+ * call to pg_cpucap_initialize() in main(). See for example
+ * src/bin/pg_controldata/pg_controldata.c.
+ */
+ // WIP: how to best initialize in frontend?
+#ifndef FRONTEND
+ Assert(pg_cpucap & PGCPUCAP_INIT);
+#endif
+
+#if defined(HAVE_CRC_COMPTIME)
+ return COMP_CRC32C_HW(crc, data, len);
+#else
+ if (pg_cpucap & PGCPUCAP_CRC32C)
+ return COMP_CRC32C_HW(crc, data, len);
+ else
+ return pg_comp_crc32c_sb8(crc, data, len);
#endif
+}
+#endif /* HAVE_CRC_COMPTIME || HAVE_CRC_RUNTIME */
#endif /* PG_CRC32C_H */
diff --git a/src/port/Makefile b/src/port/Makefile
index 4c22431951..5a05179e92 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -44,6 +44,7 @@ OBJS = \
noblock.o \
path.o \
pg_bitutils.o \
+ pg_cpucap.o \
pg_popcount_avx512.o \
pg_strong_random.o \
pgcheckdir.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index 7fcfa728d4..e1e7ce8fb8 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -7,6 +7,7 @@ pgport_sources = [
'noblock.c',
'path.c',
'pg_bitutils.c',
+ 'pg_cpucap.c',
'pg_popcount_avx512.c',
'pg_strong_random.c',
'pgcheckdir.c',
@@ -83,12 +84,15 @@ replace_funcs_pos = [
# x86/x64
['pg_crc32c_sse42', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
+ # WIP sometime we'll need to build these based on host_cpu
+ ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
# arm / aarch64
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C'],
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 'crc'],
+ ['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C'],
['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
diff --git a/src/port/pg_cpucap.c b/src/port/pg_cpucap.c
new file mode 100644
index 0000000000..eba6e31c63
--- /dev/null
+++ b/src/port/pg_cpucap.c
@@ -0,0 +1,51 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_cpucap.c
+ * Runtime detection of CPU capabilities.
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * src/port/pg_cpucap.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "c.h"
+
+#include "port/pg_cpucap.h"
+#include "port/pg_crc32c.h"
+
+
+/* starts uninitialized so we can detect errors of omission */
+uint32 pg_cpucap = 0;
+
+/*
+ * Check if hardware instructions for CRC computation are available.
+ */
+static void
+pg_cpucap_crc32c(void)
+{
+ /* WIP: It seems like we should use CPU arch symbols instead */
+#if defined(USE_SSE42_CRC32C) || defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
+ if (pg_crc32c_sse42_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+
+#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
+ if (pg_crc32c_armv8_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+#endif
+}
+
+/*
+ * This needs to be called in main() for every
+ * program that calls a function that dispatches
+ * according to CPU features.
+ */
+void
+pg_cpucap_initialize(void)
+{
+ pg_cpucap = PGCPUCAP_INIT;
+
+ pg_cpucap_crc32c();
+}
diff --git a/src/port/pg_crc32c_armv8_choose.c b/src/port/pg_crc32c_armv8_choose.c
index ec12be1bbc..e3654427c3 100644
--- a/src/port/pg_crc32c_armv8_choose.c
+++ b/src/port/pg_crc32c_armv8_choose.c
@@ -1,12 +1,7 @@
/*-------------------------------------------------------------------------
*
* pg_crc32c_armv8_choose.c
- * Choose between ARMv8 and software CRC-32C implementation.
- *
- * On first call, checks if the CPU we're running on supports the ARMv8
- * CRC Extension. If it does, use the special instructions for CRC-32C
- * computation. Otherwise, fall back to the pure software implementation
- * (slicing-by-8).
+ * Check if the CPU we're running on supports the ARMv8 CRC Extension.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -40,7 +35,7 @@
#include "port/pg_crc32c.h"
-static bool
+bool
pg_crc32c_armv8_available(void)
{
#if defined(HAVE_ELF_AUX_INFO)
@@ -106,20 +101,3 @@ pg_crc32c_armv8_available(void)
return false;
#endif
}
-
-/*
- * This gets called on the first call. It replaces the function pointer
- * so that subsequent calls are routed directly to the chosen implementation.
- */
-static pg_crc32c
-pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
-{
- if (pg_crc32c_armv8_available())
- pg_comp_crc32c = pg_comp_crc32c_armv8;
- else
- pg_comp_crc32c = pg_comp_crc32c_sb8;
-
- return pg_comp_crc32c(crc, data, len);
-}
-
-pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len) = pg_comp_crc32c_choose;
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 65dbc4d424..f4d3215bc5 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -1,12 +1,7 @@
/*-------------------------------------------------------------------------
*
* pg_crc32c_sse42_choose.c
- * Choose between Intel SSE 4.2 and software CRC-32C implementation.
- *
- * On first call, checks if the CPU we're running on supports Intel SSE
- * 4.2. If it does, use the special SSE instructions for CRC-32C
- * computation. Otherwise, fall back to the pure software implementation
- * (slicing-by-8).
+ * Check if the CPU we're running on supports SSE4.2.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -30,7 +25,7 @@
#include "port/pg_crc32c.h"
-static bool
+bool
pg_crc32c_sse42_available(void)
{
unsigned int exx[4] = {0, 0, 0, 0};
@@ -45,20 +40,3 @@ pg_crc32c_sse42_available(void)
return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
-
-/*
- * This gets called on the first call. It replaces the function pointer
- * so that subsequent calls are routed directly to the chosen implementation.
- */
-static pg_crc32c
-pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
-{
- if (pg_crc32c_sse42_available())
- pg_comp_crc32c = pg_comp_crc32c_sse42;
- else
- pg_comp_crc32c = pg_comp_crc32c_sb8;
-
- return pg_comp_crc32c(crc, data, len);
-}
-
-pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len) = pg_comp_crc32c_choose;
--
2.43.0
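To make the change in 0001 concrete, here is a minimal sketch of dispatching on a capability bitmask with an inlinable branch, in place of the function pointer that used to be swapped on first call. All names here (`cpucap`, `CAP_*`, `crc32c_bitwise`, `crc32c_fast_standin`, `crc32c_dispatch`) are hypothetical. The "fast" routine is only a stand-in that calls the bitwise one, so the example stays portable; it is not the hardware path from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define CAP_INIT   (1u << 0)
#define CAP_CRC32C (1u << 3)

/* Filled in once at startup in a real program; set by hand here. */
static uint32_t cpucap = 0;

/* Reference bit-at-a-time CRC-32C (reflected polynomial 0x82F63B78). */
static uint32_t
crc32c_bitwise(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
	{
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Stand-in for a hardware-accelerated routine (assumption for the sketch). */
static uint32_t
crc32c_fast_standin(uint32_t crc, const void *data, size_t len)
{
	return crc32c_bitwise(crc, data, len);
}

/* A predictable branch on a global bitmask, with no indirect call. */
static inline uint32_t
crc32c_dispatch(uint32_t crc, const void *data, size_t len)
{
	if (cpucap & CAP_CRC32C)
		return crc32c_fast_standin(crc, data, len);
	else
		return crc32c_bitwise(crc, data, len);
}
```

Because the dispatcher is `static inline`, a later patch can add a length test in front of the branch and have the compiler fold it away when the length is a compile-time constant, which an indirect call would not allow.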
v10-0002-Rename-CRC-choose-files-to-cpucap-files.patch
From bf13997ab601d8a91bc80e0e8cd11159f9c1eb25 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Tue, 25 Feb 2025 13:59:21 +0700
Subject: [PATCH v10 2/5] Rename CRC *choose files to cpucap* files
On Meson, build them unconditionally on the relevant arch.
FIXME autoconf builds are broken
Signed-off-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
---
configure | 4 ++--
configure.ac | 4 ++--
src/include/port/pg_cpucap.h | 3 +++
src/include/port/pg_crc32c.h | 2 --
src/port/Makefile | 2 ++
src/port/meson.build | 17 +++++++++++-----
src/port/pg_cpucap.c | 18 -----------------
..._crc32c_armv8_choose.c => pg_cpucap_arm.c} | 16 ++++++++++++---
..._crc32c_sse42_choose.c => pg_cpucap_x86.c} | 20 ++++++++++++++-----
9 files changed, 49 insertions(+), 37 deletions(-)
rename src/port/{pg_crc32c_armv8_choose.c => pg_cpucap_arm.c} (92%)
rename src/port/{pg_crc32c_sse42_choose.c => pg_cpucap_x86.c} (73%)
diff --git a/configure b/configure
index 0ffcaeb436..0d31e6a236 100755
--- a/configure
+++ b/configure
@@ -17360,7 +17360,7 @@ else
$as_echo "#define USE_SSE42_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_sse42_choose.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2 with runtime check" >&5
$as_echo "SSE 4.2 with runtime check" >&6; }
else
@@ -17376,7 +17376,7 @@ $as_echo "ARMv8 CRC instructions" >&6; }
$as_echo "#define USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o pg_crc32c_armv8_choose.o"
+ PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: ARMv8 CRC instructions with runtime check" >&5
$as_echo "ARMv8 CRC instructions with runtime check" >&6; }
else
diff --git a/configure.ac b/configure.ac
index f56681e0d9..60d30f855d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2115,7 +2115,7 @@ if test x"$USE_SSE42_CRC32C" = x"1"; then
else
if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
AC_DEFINE(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use Intel SSE 4.2 CRC instructions with a runtime check.])
- PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_sse42_choose.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o"
AC_MSG_RESULT(SSE 4.2 with runtime check)
else
if test x"$USE_ARMV8_CRC32C" = x"1"; then
@@ -2125,7 +2125,7 @@ else
else
if test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
AC_DEFINE(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use ARMv8 CRC Extension with a runtime check.])
- PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o pg_crc32c_armv8_choose.o"
+ PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o"
AC_MSG_RESULT(ARMv8 CRC instructions with runtime check)
else
if test x"$USE_LOONGARCH_CRC32C" = x"1"; then
diff --git a/src/include/port/pg_cpucap.h b/src/include/port/pg_cpucap.h
index 81edfedce5..5e04213b21 100644
--- a/src/include/port/pg_cpucap.h
+++ b/src/include/port/pg_cpucap.h
@@ -22,4 +22,7 @@
extern PGDLLIMPORT uint32 pg_cpucap;
extern void pg_cpucap_initialize(void);
+/* arch-specific functions private to src/port */
+extern void pg_cpucap_crc32c(void);
+
#endif /* PG_CPUCAP_H */
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index b565a0f294..4f0ebb9923 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -57,7 +57,6 @@ typedef uint32 pg_crc32c;
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
#endif
-extern bool pg_crc32c_sse42_available(void);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
@@ -76,7 +75,6 @@ extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t le
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
#endif
-extern bool pg_crc32c_armv8_available(void);
extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
#elif defined(USE_LOONGARCH_CRC32C)
diff --git a/src/port/Makefile b/src/port/Makefile
index 5a05179e92..1fc03713b3 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -45,6 +45,8 @@ OBJS = \
path.o \
pg_bitutils.o \
pg_cpucap.o \
+ pg_cpucap_x86.o \
+ pg_cpucap_arm.o \
pg_popcount_avx512.o \
pg_strong_random.o \
pgcheckdir.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index e1e7ce8fb8..baa8e16200 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -78,22 +78,29 @@ if host_system != 'windows'
replace_funcs_neg += [['pthread_barrier_wait']]
endif
+# arch-specific runtime checks
+if host_cpu == 'x86' or host_cpu == 'x86_64'
+ pgport_sources += files(
+ 'pg_cpucap_x86.c'
+ )
+
+elif host_cpu == 'arm' or host_cpu == 'aarch64'
+ pgport_sources += files(
+ 'pg_cpucap_arm.c'
+ )
+endif
+
# Replacement functionality to be built if corresponding configure symbol
# is true
replace_funcs_pos = [
# x86/x64
['pg_crc32c_sse42', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
- # WIP sometime we'll need to build these based on host_cpu
- ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C'],
- ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
# arm / aarch64
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C'],
['pg_crc32c_armv8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 'crc'],
- ['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C'],
- ['pg_crc32c_armv8_choose', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
# loongarch
diff --git a/src/port/pg_cpucap.c b/src/port/pg_cpucap.c
index eba6e31c63..88d7582702 100644
--- a/src/port/pg_cpucap.c
+++ b/src/port/pg_cpucap.c
@@ -14,29 +14,11 @@
#include "c.h"
#include "port/pg_cpucap.h"
-#include "port/pg_crc32c.h"
/* starts uninitialized so we can detect errors of omission */
uint32 pg_cpucap = 0;
-/*
- * Check if hardware instructions for CRC computation are available.
- */
-static void
-pg_cpucap_crc32c(void)
-{
- /* WIP: It seems like we should use CPU arch symbols instead */
-#if defined(USE_SSE42_CRC32C) || defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
- if (pg_crc32c_sse42_available())
- pg_cpucap |= PGCPUCAP_CRC32C;
-
-#elif defined(USE_ARMV8_CRC32C) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
- if (pg_crc32c_armv8_available())
- pg_cpucap |= PGCPUCAP_CRC32C;
-#endif
-}
-
/*
* This needs to be called in main() for every
* program that calls a function that dispatches
diff --git a/src/port/pg_crc32c_armv8_choose.c b/src/port/pg_cpucap_arm.c
similarity index 92%
rename from src/port/pg_crc32c_armv8_choose.c
rename to src/port/pg_cpucap_arm.c
index e3654427c3..19e052fecf 100644
--- a/src/port/pg_crc32c_armv8_choose.c
+++ b/src/port/pg_cpucap_arm.c
@@ -1,6 +1,6 @@
/*-------------------------------------------------------------------------
*
- * pg_crc32c_armv8_choose.c
+ * pg_cpucap_arm.c
* Check if the CPU we're running on supports the ARMv8 CRC Extension.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
- * src/port/pg_crc32c_armv8_choose.c
+ * src/port/pg_cpucap_arm.c
*
*-------------------------------------------------------------------------
*/
@@ -35,7 +35,7 @@
#include "port/pg_crc32c.h"
-bool
+static bool
pg_crc32c_armv8_available(void)
{
#if defined(HAVE_ELF_AUX_INFO)
@@ -101,3 +101,13 @@ pg_crc32c_armv8_available(void)
return false;
#endif
}
+
+/*
+ * Check if hardware instructions for CRC computation are available.
+ */
+void
+pg_cpucap_crc32c(void)
+{
+ if (pg_crc32c_armv8_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+}
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_cpucap_x86.c
similarity index 73%
rename from src/port/pg_crc32c_sse42_choose.c
rename to src/port/pg_cpucap_x86.c
index f4d3215bc5..07462bd1d2 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_cpucap_x86.c
@@ -1,6 +1,6 @@
/*-------------------------------------------------------------------------
*
- * pg_crc32c_sse42_choose.c
+ * pg_cpucap_x86.c
* Check if the CPU we're running on supports SSE4.2.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
- * src/port/pg_crc32c_sse42_choose.c
+ * src/port/pg_cpucap_x86.c
*
*-------------------------------------------------------------------------
*/
@@ -23,10 +23,10 @@
#include <intrin.h>
#endif
-#include "port/pg_crc32c.h"
+#include "port/pg_cpucap.h"
-bool
-pg_crc32c_sse42_available(void)
+static bool
+pg_sse42_available(void)
{
unsigned int exx[4] = {0, 0, 0, 0};
@@ -40,3 +40,13 @@ pg_crc32c_sse42_available(void)
return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
+
+/*
+ * Check if hardware instructions for CRC computation are available.
+ */
+void
+pg_cpucap_crc32c(void)
+{
+ if (pg_sse42_available())
+ pg_cpucap |= PGCPUCAP_CRC32C;
+}
--
2.43.0
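The x86 checks in this series all read feature bits from ECX of CPUID leaf 1: bit 20 for SSE 4.2 here, and bit 1 for PCLMUL in 0004. As a rough sketch, the mapping from that register to capability bits can be written as a pure function, which makes it easy to test without the actual hardware. `caps_from_cpuid_ecx()` and the `CAP_*` / `CPUID_ECX_*` names are hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Illustrative capability bits, shaped like the PGCPUCAP_* flags. */
#define CAP_CRC32C (1u << 3)
#define CAP_CLMUL  (1u << 4)

/* Feature bits in ECX after executing CPUID with EAX = 1. */
#define CPUID_ECX_PCLMULQDQ (1u << 1)
#define CPUID_ECX_SSE42     (1u << 20)

/* Translate a raw ECX value into the capability bitmask. */
static uint32_t
caps_from_cpuid_ecx(uint32_t ecx)
{
	uint32_t	caps = 0;

	if (ecx & CPUID_ECX_SSE42)
		caps |= CAP_CRC32C;
	if (ecx & CPUID_ECX_PCLMULQDQ)
		caps |= CAP_CLMUL;
	return caps;
}
```

One possible benefit of this shape is that CPUID is executed once and decoded for every feature, rather than once per `pg_*_available()` function.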
v10-0003-Add-a-Postgres-SQL-function-for-crc32c-benchmark.patch
From a2cce7a4f4b1d6064e65ae47942d0609e758d706 Mon Sep 17 00:00:00 2001
From: Paul Amonson <paul.d.amonson@intel.com>
Date: Mon, 6 May 2024 08:34:17 -0700
Subject: [PATCH v10 3/5] Add a Postgres SQL function for crc32c benchmarking
Add a drive_crc32c() function to use for benchmarking crc32c
computation. The function takes 2 arguments:
(1) count: num of times CRC32C is computed in a loop.
(2) num: #bytes in the buffer to calculate crc over.
XXX not for commit
Extracted from a patch by Raghuveer Devulapalli
Signed-off-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
---
contrib/meson.build | 1 +
contrib/test_crc32c/Makefile | 20 +++++++
contrib/test_crc32c/expected/test_crc32c.out | 57 ++++++++++++++++++++
contrib/test_crc32c/meson.build | 34 ++++++++++++
contrib/test_crc32c/sql/test_crc32c.sql | 3 ++
contrib/test_crc32c/test_crc32c--1.0.sql | 1 +
contrib/test_crc32c/test_crc32c.c | 47 ++++++++++++++++
contrib/test_crc32c/test_crc32c.control | 4 ++
8 files changed, 167 insertions(+)
create mode 100644 contrib/test_crc32c/Makefile
create mode 100644 contrib/test_crc32c/expected/test_crc32c.out
create mode 100644 contrib/test_crc32c/meson.build
create mode 100644 contrib/test_crc32c/sql/test_crc32c.sql
create mode 100644 contrib/test_crc32c/test_crc32c--1.0.sql
create mode 100644 contrib/test_crc32c/test_crc32c.c
create mode 100644 contrib/test_crc32c/test_crc32c.control
diff --git a/contrib/meson.build b/contrib/meson.build
index 1ba73ebd67..06673db062 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('test_crc32c')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/test_crc32c/Makefile b/contrib/test_crc32c/Makefile
new file mode 100644
index 0000000000..5b747c6184
--- /dev/null
+++ b/contrib/test_crc32c/Makefile
@@ -0,0 +1,20 @@
+MODULE_big = test_crc32c
+OBJS = test_crc32c.o
+PGFILEDESC = "test"
+EXTENSION = test_crc32c
+DATA = test_crc32c--1.0.sql
+
+first: all
+
+# test_crc32c.o: CFLAGS+=-g
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/test_crc32c
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/test_crc32c/expected/test_crc32c.out b/contrib/test_crc32c/expected/test_crc32c.out
new file mode 100644
index 0000000000..dff6bb3133
--- /dev/null
+++ b/contrib/test_crc32c/expected/test_crc32c.out
@@ -0,0 +1,57 @@
+CREATE EXTENSION test_crc32c;
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
+ drive_crc32c
+--------------
+ 532139994
+ 2103623867
+ 785984197
+ 2686825890
+ 3213049059
+ 3819630168
+ 1389234603
+ 534072900
+ 2930108140
+ 2496889855
+ 1475239611
+ 136366931
+ 3067402116
+ 2012717871
+ 3682416023
+ 2054270645
+ 1817339875
+ 4100939569
+ 1192727539
+ 3636976218
+ 369764421
+ 3161609879
+ 1067984880
+ 1235066769
+ 3138425899
+ 648132037
+ 4203750233
+ 1330187888
+ 2683521348
+ 1951644495
+ 2574090107
+ 3904902018
+ 3772697795
+ 1644686344
+ 2868962106
+ 3369218491
+ 3902689890
+ 3456411865
+ 141004025
+ 1504497996
+ 3782655204
+ 3544797610
+ 3429174879
+ 2524728016
+ 3935861181
+ 25498897
+ 692684159
+ 345705535
+ 2761600287
+ 2654632420
+ 3945991399
+(51 rows)
+
diff --git a/contrib/test_crc32c/meson.build b/contrib/test_crc32c/meson.build
new file mode 100644
index 0000000000..d7bec4ba1c
--- /dev/null
+++ b/contrib/test_crc32c/meson.build
@@ -0,0 +1,34 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+test_crc32c_sources = files(
+ 'test_crc32c.c',
+)
+
+if host_system == 'windows'
+ test_crc32c_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_crc32c',
+ '--FILEDESC', 'test_crc32c - test code for crc32c library',])
+endif
+
+test_crc32c = shared_module('test_crc32c',
+ test_crc32c_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += test_crc32c
+
+install_data(
+ 'test_crc32c.control',
+ 'test_crc32c--1.0.sql',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'test_crc32c',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'test_crc32c',
+ ],
+ },
+}
diff --git a/contrib/test_crc32c/sql/test_crc32c.sql b/contrib/test_crc32c/sql/test_crc32c.sql
new file mode 100644
index 0000000000..95c6dfe448
--- /dev/null
+++ b/contrib/test_crc32c/sql/test_crc32c.sql
@@ -0,0 +1,3 @@
+CREATE EXTENSION test_crc32c;
+
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
diff --git a/contrib/test_crc32c/test_crc32c--1.0.sql b/contrib/test_crc32c/test_crc32c--1.0.sql
new file mode 100644
index 0000000000..52b9772f90
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c--1.0.sql
@@ -0,0 +1 @@
+CREATE FUNCTION drive_crc32c (count int, num int) RETURNS bigint AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
new file mode 100644
index 0000000000..28bc42de31
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -0,0 +1,47 @@
+/* select drive_crc32c(1000000, 1024); */
+
+#include "postgres.h"
+#include "fmgr.h"
+#include "port/pg_crc32c.h"
+#include "common/pg_prng.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * drive_crc32c(count: int, num: int) returns bigint
+ *
+ * count is the number of loops to perform
+ *
+ * num is the number of bytes in the buffer to calculate
+ * crc32c over.
+ */
+PG_FUNCTION_INFO_V1(drive_crc32c);
+Datum
+drive_crc32c(PG_FUNCTION_ARGS)
+{
+ int64 count = PG_GETARG_INT32(0);
+ int64 num = PG_GETARG_INT32(1);
+ char* data = malloc((size_t)num);
+ pg_crc32c crc;
+ pg_prng_state state;
+ uint64 seed = 42;
+ pg_prng_seed(&state, seed);
+ /* set random data */
+ for (uint64 i = 0; i < num; i++)
+ {
+ data[i] = pg_prng_uint32(&state) % 255;
+ }
+
+ INIT_CRC32C(crc);
+
+ while(count--)
+ {
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, data, num);
+ FIN_CRC32C(crc);
+ }
+
+ free((void *)data);
+
+ PG_RETURN_INT64((int64_t)crc);
+}
diff --git a/contrib/test_crc32c/test_crc32c.control b/contrib/test_crc32c/test_crc32c.control
new file mode 100644
index 0000000000..878a077ee1
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.control
@@ -0,0 +1,4 @@
+comment = 'test'
+default_version = '1.0'
+module_pathname = '$libdir/test_crc32c'
+relocatable = true
--
2.43.0
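For anyone who wants to try the same kind of loop outside the server, below is a minimal standalone sketch of what drive_crc32c() does: fill a buffer with deterministic pseudo-random bytes, then compute the CRC `count` times, restarting from the initial value on every pass. It is an approximation under stated assumptions: `xorshift64()` is a hypothetical stand-in for pg_prng, `crc32c_bitwise()` is a slow reference routine, not the dispatched implementation, and the returned values will not match the expected output in the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Reference bit-at-a-time CRC-32C (reflected polynomial 0x82F63B78). */
static uint32_t
crc32c_bitwise(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
	{
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Tiny deterministic generator; a stand-in for pg_prng, not equivalent. */
static uint64_t
xorshift64(uint64_t *state)
{
	uint64_t	x = *state;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	return *state = x;
}

/* Returns the CRC of the last pass, or -1 on bad arguments or OOM. */
static int64_t
drive_crc32c_sketch(int count, int num)
{
	unsigned char *data;
	uint64_t	state = 42;
	uint32_t	crc = 0;

	if (count <= 0 || num <= 0)
		return -1;
	data = malloc((size_t) num);
	if (data == NULL)
		return -1;

	for (int i = 0; i < num; i++)
		data[i] = (unsigned char) xorshift64(&state);

	while (count-- > 0)
	{
		crc = 0xFFFFFFFFu;				/* like INIT_CRC32C */
		crc = crc32c_bitwise(crc, data, (size_t) num);
		crc ^= 0xFFFFFFFFu;				/* like FIN_CRC32C */
	}

	free(data);
	return (int64_t) crc;
}
```

Since every pass starts from the same initial value over the same data, the result depends only on `num`, which is why the regression test can vary the length and keep `count` at 1.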
v10-0004-Improve-CRC32C-performance-on-x86_64.patch
From ea97c1e30605473cd73e5f3ffe2ac966b9f0c180 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 15:27:16 +0700
Subject: [PATCH v10 4/5] Improve CRC32C performance on x86_64
The current SSE4.2 implementation of CRC32C relies on the native
CRC32 instruction, which operates on 8 bytes at a time. We can get a
substantial speedup on longer inputs by using carryless multiplication
on SIMD registers, processing 64 bytes per loop iteration.
The PCLMULQDQ instruction has been widely available since 2011 (almost
as old as SSE 4.2), so this commit now requires that, as well as SSE
4.2, to build pg_crc32c_sse42.c.
The MIT-licensed implementation was generated with the "generate"
program from
https://github.com/corsix/fast-crc32/
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009
Author: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Author: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
Signed-off-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
---
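As background for the commit message above, here is a small software model of carryless multiplication, the operation PCLMULQDQ performs on two 64-bit operands. It is a sketch for intuition only, far too slow for real use, and `clmul128` / `clmul64_soft()` are hypothetical names. In the patch, the `clmul_lo` and `clmul_hi` macros pass immediates 0 and 17 (0x11) to `_mm_clmulepi64_si128`, which select the low or the high 64-bit halves of both inputs before multiplying.

```c
#include <stdint.h>

/* 128-bit product held as two 64-bit halves. */
typedef struct
{
	uint64_t	lo;
	uint64_t	hi;
} clmul128;

/*
 * Multiply a and b as polynomials over GF(2): like schoolbook
 * multiplication, but partial products are combined with XOR, so
 * no carries propagate between bit positions.
 */
static clmul128
clmul64_soft(uint64_t a, uint64_t b)
{
	clmul128	r = {0, 0};

	for (int i = 0; i < 64; i++)
	{
		if ((b >> i) & 1)
		{
			r.lo ^= a << i;
			if (i > 0)
				r.hi ^= a >> (64 - i);	/* bits shifted past bit 63 */
		}
	}
	return r;
}
```

For example, `clmul64_soft(3, 3)` gives 5, since (x + 1)(x + 1) = x^2 + 1 when coefficients are reduced modulo 2, whereas ordinary multiplication would give 9.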
src/include/port/pg_cpucap.h | 2 +
src/port/pg_cpucap.c | 1 +
src/port/pg_cpucap_arm.c | 6 ++
src/port/pg_cpucap_x86.c | 23 +++++
src/port/pg_crc32c_sse42.c | 123 ++++++++++++++++++++++++++
src/test/regress/expected/strings.out | 24 +++++
src/test/regress/sql/strings.sql | 4 +
7 files changed, 183 insertions(+)
diff --git a/src/include/port/pg_cpucap.h b/src/include/port/pg_cpucap.h
index 5e04213b21..af3fabfcff 100644
--- a/src/include/port/pg_cpucap.h
+++ b/src/include/port/pg_cpucap.h
@@ -18,11 +18,13 @@
#define PGCPUCAP_POPCNT (1 << 1)
#define PGCPUCAP_VPOPCNT (1 << 2)
#define PGCPUCAP_CRC32C (1 << 3)
+#define PGCPUCAP_CLMUL (1 << 4)
extern PGDLLIMPORT uint32 pg_cpucap;
extern void pg_cpucap_initialize(void);
/* arch-specific functions private to src/port */
extern void pg_cpucap_crc32c(void);
+extern void pg_cpucap_clmul(void);
#endif /* PG_CPUCAP_H */
diff --git a/src/port/pg_cpucap.c b/src/port/pg_cpucap.c
index 88d7582702..301bd9fc2c 100644
--- a/src/port/pg_cpucap.c
+++ b/src/port/pg_cpucap.c
@@ -30,4 +30,5 @@ pg_cpucap_initialize(void)
pg_cpucap = PGCPUCAP_INIT;
pg_cpucap_crc32c();
+ pg_cpucap_clmul();
}
diff --git a/src/port/pg_cpucap_arm.c b/src/port/pg_cpucap_arm.c
index 19e052fecf..e080a5a931 100644
--- a/src/port/pg_cpucap_arm.c
+++ b/src/port/pg_cpucap_arm.c
@@ -111,3 +111,9 @@ pg_cpucap_crc32c(void)
if (pg_crc32c_armv8_available())
pg_cpucap |= PGCPUCAP_CRC32C;
}
+
+void
+pg_cpucap_clmul(void)
+{
+ // WIP: does this even make sense?
+}
diff --git a/src/port/pg_cpucap_x86.c b/src/port/pg_cpucap_x86.c
index 07462bd1d2..3a62a3a582 100644
--- a/src/port/pg_cpucap_x86.c
+++ b/src/port/pg_cpucap_x86.c
@@ -41,6 +41,22 @@ pg_sse42_available(void)
return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
+static bool
+pg_pclmul_available(void)
+{
+ unsigned int exx[4] = {0, 0, 0, 0};
+
+#if defined(HAVE__GET_CPUID)
+ __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+ __cpuid(exx, 1);
+#else
+#error cpuid instruction not available
+#endif
+
+ return (exx[2] & (1 << 1)) != 0; /* PCLMUL */
+}
+
/*
* Check if hardware instructions for CRC computation are available.
*/
@@ -50,3 +66,10 @@ pg_cpucap_crc32c(void)
if (pg_sse42_available())
pg_cpucap |= PGCPUCAP_CRC32C;
}
+
+void
+pg_cpucap_clmul(void)
+{
+ if (pg_pclmul_available())
+ pg_cpucap |= PGCPUCAP_CLMUL;
+}
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df3..fc3cf0d088 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,9 +15,118 @@
#include "c.h"
#include <nmmintrin.h>
+#include <wmmintrin.h>
#include "port/pg_crc32c.h"
+/* WIP: configure checks */
+#ifdef __x86_64__
+#define HAVE_PCLMUL_RUNTIME
+#endif
+
+ /*
+ * WIP: Testing has shown that on Kaby Lake (2016) this algorithm needs two
+ * iterations of the main loop to be faster than using regular CRC
+ * instructions, but Tiger Lake (2020) is fine with a single iteration. Could
+ * use more testing between those years (on AMD as well).
+ */
+#define PCLMUL_THRESHOLD 128
+
+#ifdef HAVE_PCLMUL_RUNTIME
+
+/* Generated by https://github.com/corsix/fast-crc32/ using: */
+/* ./generate -i sse -p crc32c -a v4e */
+/* MIT licensed */
+
+#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
+#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
+
+pg_attribute_target("sse4.2,pclmul")
+static pg_crc32c
+pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
+{
+ /* adjust names to match generated code */
+ pg_crc32c crc0 = crc;
+ size_t len = length;
+ const char *buf = data;
+
+ // This prolog is trying to avoid loads straddling
+ // cache lines, but it doesn't seem worth it if
+ // we're trying to be fast on small inputs as well
+#if 0
+ for (; len && ((uintptr_t) buf & 7); --len)
+ {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+ if (((uintptr_t) buf & 8) && len >= 8)
+ {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t *) buf);
+ buf += 8;
+ len -= 8;
+ }
+#endif
+ if (len >= 64)
+ {
+ const char *end = buf + len;
+ const char *limit = buf + len - 64;
+
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i *) buf),
+ y0;
+ __m128i x1 = _mm_loadu_si128((const __m128i *) (buf + 16)),
+ y1;
+ __m128i x2 = _mm_loadu_si128((const __m128i *) (buf + 32)),
+ y2;
+ __m128i x3 = _mm_loadu_si128((const __m128i *) (buf + 48)),
+ y3;
+ __m128i k;
+
+ k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 64;
+ /* Main loop. */
+ while (buf <= limit)
+ {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i *) buf)), x0 = _mm_xor_si128(x0, y0);
+ y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i *) (buf + 16))), x1 = _mm_xor_si128(x1, y1);
+ y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i *) (buf + 32))), x2 = _mm_xor_si128(x2, y2);
+ y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i *) (buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ buf += 64;
+ }
+
+ /* Reduce x0 ... x3 to just x0. */
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
+ y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
+ k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ len = end - buf;
+ }
+ for (; len >= 8; buf += 8, len -= 8)
+ {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t *) buf);
+ }
+ for (; len; --len)
+ {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+
+ return crc0;
+}
+
+#endif
+
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
pg_crc32c
@@ -26,6 +135,17 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
const unsigned char *p = data;
const unsigned char *pend = p + len;
+ /* XXX not for commit */
+ const pg_crc32c orig_crc PG_USED_FOR_ASSERTS_ONLY = crc;
+ const size_t orig_len PG_USED_FOR_ASSERTS_ONLY = len;
+
+#ifdef HAVE_PCLMUL_RUNTIME
+ if (len >= PCLMUL_THRESHOLD && (pg_cpucap & PGCPUCAP_CLMUL))
+ {
+ return pg_comp_crc32c_pclmul(crc, data, len);
+ }
+#endif
+
/*
* Process eight bytes of data at a time.
*
@@ -66,5 +186,8 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
p++;
}
+ /* XXX not for commit */
+ Assert(crc == pg_comp_crc32c_sb8(orig_crc, data, orig_len));
+
return crc;
}
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index b65bb2d536..662bd37ace 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -2282,6 +2282,30 @@ SELECT crc32c('The quick brown fox jumps over the lazy dog.');
419469235
(1 row)
+SELECT crc32c(repeat('A', 80)::bytea);
+ crc32c
+------------
+ 3799127650
+(1 row)
+
+SELECT crc32c(repeat('A', 127)::bytea);
+ crc32c
+-----------
+ 291820082
+(1 row)
+
+SELECT crc32c(repeat('A', 128)::bytea);
+ crc32c
+-----------
+ 816091258
+(1 row)
+
+SELECT crc32c(repeat('A', 129)::bytea);
+ crc32c
+------------
+ 4213642571
+(1 row)
+
--
-- encode/decode
--
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index 8e0f3a0e75..26f86dc92e 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -727,6 +727,10 @@ SELECT crc32('The quick brown fox jumps over the lazy dog.');
SELECT crc32c('');
SELECT crc32c('The quick brown fox jumps over the lazy dog.');
+SELECT crc32c(repeat('A', 80)::bytea);
+SELECT crc32c(repeat('A', 127)::bytea);
+SELECT crc32c(repeat('A', 128)::bytea);
+SELECT crc32c(repeat('A', 129)::bytea);
--
-- encode/decode
--
2.43.0
v10-0005-Move-all-cpuid-checks-to-one-location.patch
From e8bf08ed784551daa77fccd9485aef0e2844dd7e Mon Sep 17 00:00:00 2001
From: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Date: Tue, 25 Feb 2025 12:54:43 -0800
Subject: [PATCH v10 5/5] Move all cpuid checks to one location
---
configure | 4 +-
configure.ac | 4 +-
src/include/port/pg_cpucap.h | 32 +++++--
src/include/port/pg_crc32c.h | 6 +-
src/port/Makefile | 2 -
src/port/meson.build | 12 ---
src/port/pg_bitutils.c | 20 +---
src/port/pg_cpucap.c | 167 ++++++++++++++++++++++++++++++++--
src/port/pg_cpucap_arm.c | 119 ------------------------
src/port/pg_cpucap_x86.c | 75 ---------------
src/port/pg_crc32c_sse42.c | 3 +-
src/port/pg_popcount_avx512.c | 71 +--------------
12 files changed, 197 insertions(+), 318 deletions(-)
delete mode 100644 src/port/pg_cpucap_arm.c
delete mode 100644 src/port/pg_cpucap_x86.c
diff --git a/configure b/configure
index 0d31e6a236..172c93896c 100755
--- a/configure
+++ b/configure
@@ -17360,7 +17360,7 @@ else
$as_echo "#define USE_SSE42_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2 with runtime check" >&5
$as_echo "SSE 4.2 with runtime check" >&6; }
else
@@ -17376,7 +17376,7 @@ $as_echo "ARMv8 CRC instructions" >&6; }
$as_echo "#define USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o
+ PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: ARMv8 CRC instructions with runtime check" >&5
$as_echo "ARMv8 CRC instructions with runtime check" >&6; }
else
diff --git a/configure.ac b/configure.ac
index 60d30f855d..c7bfad98ac 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2115,7 +2115,7 @@ if test x"$USE_SSE42_CRC32C" = x"1"; then
else
if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
AC_DEFINE(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use Intel SSE 4.2 CRC instructions with a runtime check.])
- PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o"
AC_MSG_RESULT(SSE 4.2 with runtime check)
else
if test x"$USE_ARMV8_CRC32C" = x"1"; then
@@ -2125,7 +2125,7 @@ else
else
if test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
AC_DEFINE(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use ARMv8 CRC Extension with a runtime check.])
- PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o
+ PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o"
AC_MSG_RESULT(ARMv8 CRC instructions with runtime check)
else
if test x"$USE_LOONGARCH_CRC32C" = x"1"; then
diff --git a/src/include/port/pg_cpucap.h b/src/include/port/pg_cpucap.h
index af3fabfcff..d623db43e4 100644
--- a/src/include/port/pg_cpucap.h
+++ b/src/include/port/pg_cpucap.h
@@ -14,17 +14,29 @@
#ifndef PG_CPUCAP_H
#define PG_CPUCAP_H
-#define PGCPUCAP_INIT (1 << 0)
-#define PGCPUCAP_POPCNT (1 << 1)
-#define PGCPUCAP_VPOPCNT (1 << 2)
-#define PGCPUCAP_CRC32C (1 << 3)
-#define PGCPUCAP_CLMUL (1 << 4)
+enum pg_cpucap__
+{
+ PG_CPU_FEATURE_INIT = 0,
+ // X86
+ PG_CPU_FEATURE_SSE42 = 1,
+ PG_CPU_FEATURE_POPCNT = 2,
+ PG_CPU_FEATURE_PCLMUL = 3,
-extern PGDLLIMPORT uint32 pg_cpucap;
-extern void pg_cpucap_initialize(void);
+ /* SKX: */
+ PG_CPU_FEATURE_AVX512F = 30,
+ PG_CPU_FEATURE_AVX512BW = 31,
+
+ /* ICX */
+ PG_CPU_FEATURE_AVX512VPOPCNTDQ = 40,
+
+ // ARM
+ PG_CPU_FEATURE_ARMV8_CRC32C = 100,
-/* arch-specific functions private to src/port */
-extern void pg_cpucap_crc32c(void);
-extern void pg_cpucap_clmul(void);
+ PG_CPU_FEATURE_MAX
+};
+
+
+extern void pg_cpucap_initialize(void);
+bool pg_cpu_have(int feature_id);
#endif /* PG_CPUCAP_H */
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 4f0ebb9923..41b648bea6 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -48,6 +48,7 @@ typedef uint32 pg_crc32c;
((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_sse42((crc), (data), (len)))
+#define PGCPUCAP_CRC32C pg_cpu_have(PG_CPU_FEATURE_SSE42)
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
#if defined(USE_SSE42_CRC32C)
@@ -66,6 +67,7 @@ extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t le
((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
#define COMP_CRC32C_HW(crc, data, len) \
((crc) = pg_comp_crc32c_armv8((crc), (data), (len)))
+#define PGCPUCAP_CRC32C pg_cpu_have(PG_CPU_FEATURE_ARMV8_CRC32C)
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
#if defined(USE_ARMV8_CRC32C)
@@ -125,13 +127,13 @@ pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
*/
// WIP: how to best intialize in frontend?
#ifndef FRONTEND
- Assert(pg_cpucap & PGCPUCAP_INIT);
+ Assert(pg_cpu_have(PG_CPU_FEATURE_INIT) == 1);
#endif
#if defined(HAVE_CRC_COMPTIME)
return COMP_CRC32C_HW(crc, data, len);
#else
- if (pg_cpucap & PGCPUCAP_CRC32C)
+ if (PGCPUCAP_CRC32C)
return COMP_CRC32C_HW(crc, data, len);
else
return pg_comp_crc32c_sb8(crc, data, len);
diff --git a/src/port/Makefile b/src/port/Makefile
index 1fc03713b3..5a05179e92 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -45,8 +45,6 @@ OBJS = \
path.o \
pg_bitutils.o \
pg_cpucap.o \
- pg_cpucap_x86.o \
- pg_cpucap_arm.o \
pg_popcount_avx512.o \
pg_strong_random.o \
pgcheckdir.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index baa8e16200..922ab1ad73 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -78,18 +78,6 @@ if host_system != 'windows'
replace_funcs_neg += [['pthread_barrier_wait']]
endif
-# arch-specific runtime checks
-if host_cpu == 'x86' or host_cpu == 'x86_64'
- pgport_sources += files(
- 'pg_cpucap_x86.c'
- )
-
-elif host_cpu == 'arm' or host_cpu == 'aarch64'
- pgport_sources += files(
- 'pg_cpucap_arm.c'
- )
-endif
-
# Replacement functionality to be built if corresponding configure symbol
# is true
replace_funcs_pos = [
diff --git a/src/port/pg_bitutils.c b/src/port/pg_bitutils.c
index 5677525693..4cf05f55a6 100644
--- a/src/port/pg_bitutils.c
+++ b/src/port/pg_bitutils.c
@@ -12,14 +12,8 @@
*/
#include "c.h"
-#ifdef HAVE__GET_CPUID
-#include <cpuid.h>
-#endif
-#ifdef HAVE__CPUID
-#include <intrin.h>
-#endif
-
#include "port/pg_bitutils.h"
+#include "port/pg_cpucap.h"
/*
@@ -133,17 +127,7 @@ uint64 (*pg_popcount_masked_optimized) (const char *buf, int bytes, bits8 mask)
static bool
pg_popcount_available(void)
{
- unsigned int exx[4] = {0, 0, 0, 0};
-
-#if defined(HAVE__GET_CPUID)
- __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
-#elif defined(HAVE__CPUID)
- __cpuid(exx, 1);
-#else
-#error cpuid instruction not available
-#endif
-
- return (exx[2] & (1 << 23)) != 0; /* POPCNT */
+ return pg_cpu_have(PG_CPU_FEATURE_POPCNT);
}
/*
diff --git a/src/port/pg_cpucap.c b/src/port/pg_cpucap.c
index 301bd9fc2c..422c8f4faf 100644
--- a/src/port/pg_cpucap.c
+++ b/src/port/pg_cpucap.c
@@ -15,20 +15,173 @@
#include "port/pg_cpucap.h"
+#ifdef HAVE__GET_CPUID
+#include <cpuid.h>
+#endif
+
+#ifdef HAVE__CPUID
+#include <intrin.h>
+#endif
+
+static unsigned char pg_cpucap[PG_CPU_FEATURE_MAX];
+
+#ifdef __x86_64__
+// for _xgetbv
+#include <immintrin.h>
+/*
+ * Does XGETBV say the ZMM registers are enabled?
+ *
+ * NB: Caller is responsible for verifying that xsave_available() returns true
+ * before calling this.
+ */
+#ifdef HAVE_XSAVE_INTRINSICS
+pg_attribute_target("xsave")
+#endif
+static inline bool
+zmm_regs_available(void)
+{
+#ifdef HAVE_XSAVE_INTRINSICS
+ return (_xgetbv(0) & 0xe6) == 0xe6;
+#else
+ return false;
+#endif
+}
+
+static void
+pg_cpucap_x86(void)
+{
+ unsigned int exx[4] = {0, 0, 0, 0};
+#if defined(HAVE__GET_CPUID)
+ __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+ __cpuid(exx, 1);
+#endif
+
+ pg_cpucap[PG_CPU_FEATURE_SSE42] = (exx[2] & (1 << 20)) != 0;
+ pg_cpucap[PG_CPU_FEATURE_PCLMUL] = (exx[2] & (1 << 1)) != 0;
+ pg_cpucap[PG_CPU_FEATURE_POPCNT] = (exx[2] & (1 << 23)) != 0;
+ /* osxsave */
+ if ((exx[2] & (1 << 27)) == 0) {
+ return;
+ }
+ /* avx512 os support */
+ if (!zmm_regs_available()) {
+ return;
+ }
+ /* second cpuid call on leaf 7 to check extended avx512 support */
+ memset(exx, 0, 4 * sizeof(exx[0]));
+#if defined(HAVE__GET_CPUID_COUNT)
+ __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUIDEX)
+ __cpuidex(exx, 7, 0);
+#endif
+
+ pg_cpucap[PG_CPU_FEATURE_AVX512F] = (exx[1] & (1 << 16)) != 0;
+ pg_cpucap[PG_CPU_FEATURE_AVX512BW] = (exx[1] & (1 << 30)) != 0;
+ pg_cpucap[PG_CPU_FEATURE_AVX512VPOPCNTDQ] = (exx[2] & (1 << 14)) != 0;
+
+}
+#else // ARM
+static bool
+pg_crc32c_armv8_available(void)
+{
+#if defined(HAVE_ELF_AUX_INFO)
+ unsigned long value;
+
+#ifdef __aarch64__
+ return elf_aux_info(AT_HWCAP, &value, sizeof(value)) == 0 &&
+ (value & HWCAP_CRC32) != 0;
+#else
+ return elf_aux_info(AT_HWCAP2, &value, sizeof(value)) == 0 &&
+ (value & HWCAP2_CRC32) != 0;
+#endif
+#elif defined(HAVE_GETAUXVAL)
+#ifdef __aarch64__
+ return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
+#else
+ return (getauxval(AT_HWCAP2) & HWCAP2_CRC32) != 0;
+#endif
+#elif defined(__NetBSD__)
+ /*
+ * On NetBSD we can read the Instruction Set Attribute Registers via
+ * sysctl. For doubtless-historical reasons the sysctl interface is
+ * completely different on 64-bit than 32-bit, but the underlying
+ * registers contain the same fields.
+ */
+#define ISAR0_CRC32_BITPOS 16
+#define ISAR0_CRC32_BITWIDTH 4
+#define WIDTHMASK(w) ((1 << (w)) - 1)
+#define SYSCTL_CPU_ID_MAXSIZE 64
+
+ size_t len;
+ uint64 sysctlbuf[SYSCTL_CPU_ID_MAXSIZE];
+#if defined(__aarch64__)
+ /* We assume cpu0 is representative of all the machine's CPUs. */
+ const char *path = "machdep.cpu0.cpu_id";
+ size_t expected_len = sizeof(struct aarch64_sysctl_cpu_id);
+#define ISAR0 ((struct aarch64_sysctl_cpu_id *) sysctlbuf)->ac_aa64isar0
+#else
+ const char *path = "machdep.id_isar";
+ size_t expected_len = 6 * sizeof(int);
+#define ISAR0 ((int *) sysctlbuf)[5]
+#endif
+ uint64 fld;
+
+ /* Fetch the appropriate set of register values. */
+ len = sizeof(sysctlbuf);
+ memset(sysctlbuf, 0, len);
+ if (sysctlbyname(path, sysctlbuf, &len, NULL, 0) != 0)
+ return false; /* perhaps kernel is 64-bit and we aren't? */
+ if (len != expected_len)
+ return false; /* kernel API change? */
+
+ /* Fetch the CRC32 field from ISAR0. */
+ fld = (ISAR0 >> ISAR0_CRC32_BITPOS) & WIDTHMASK(ISAR0_CRC32_BITWIDTH);
+
+ /*
+ * Current documentation defines only the field values 0 (No CRC32) and 1
+ * (CRC32B/CRC32H/CRC32W/CRC32X/CRC32CB/CRC32CH/CRC32CW/CRC32CX). Assume
+ * that any future nonzero value will be a superset of 1.
+ */
+ return (fld != 0);
+#else
+ return false;
+#endif
+}
+
+static void
+pg_cpucap_arm(void)
+{
+ if (pg_crc32c_armv8_available()) {
+ pg_cpucap[PG_CPU_FEATURE_ARMV8_CRC32C] = 1;
+ }
+}
+#endif
-/* starts uninitialized so we can detect errors of omission */
-uint32 pg_cpucap = 0;
/*
* This needs to be called in main() for every
* program that calls a function that dispatches
* according to CPU features.
*/
-void
-pg_cpucap_initialize(void)
+void pg_cpucap_initialize(void)
{
- pg_cpucap = PGCPUCAP_INIT;
+ /* Initialize everything to zero */
+ memset(pg_cpucap, 0, sizeof(pg_cpucap[0]) * PG_CPU_FEATURE_MAX);
+ pg_cpucap[PG_CPU_FEATURE_INIT] = 1;
+
+#if defined( __i386__ ) || defined(i386) || defined(_M_IX86) || \
+ defined(__x86_64__) || defined(__x86_64) || defined(_M_AMD64)
+ pg_cpucap_x86();
+#elif defined(__arm__) || defined(__arm) || \
+ defined(__aarch64__) || defined(_M_ARM64)
+ pg_cpucap_arm();
+#endif
- pg_cpucap_crc32c();
- pg_cpucap_clmul();
+}
+
+/* Access to pg_cpucap for modules that need runtime CPUID information */
+bool pg_cpu_have(int feature_id)
+{
+ return pg_cpucap[feature_id];
}
diff --git a/src/port/pg_cpucap_arm.c b/src/port/pg_cpucap_arm.c
deleted file mode 100644
index e080a5a931..0000000000
--- a/src/port/pg_cpucap_arm.c
+++ /dev/null
@@ -1,119 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * pg_cpucap_arm.c
- * Check if the CPU we're running on supports the ARMv8 CRC Extension.
- *
- * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- *
- * IDENTIFICATION
- * src/port/pg_cpucap_arm.c
- *
- *-------------------------------------------------------------------------
- */
-
-#ifndef FRONTEND
-#include "postgres.h"
-#else
-#include "postgres_fe.h"
-#endif
-
-#if defined(HAVE_ELF_AUX_INFO) || defined(HAVE_GETAUXVAL)
-#include <sys/auxv.h>
-#if defined(__linux__) && !defined(__aarch64__) && !defined(HWCAP2_CRC32)
-#include <asm/hwcap.h>
-#endif
-#endif
-
-#if defined(__NetBSD__)
-#include <sys/sysctl.h>
-#if defined(__aarch64__)
-#include <aarch64/armreg.h>
-#endif
-#endif
-
-#include "port/pg_crc32c.h"
-
-static bool
-pg_crc32c_armv8_available(void)
-{
-#if defined(HAVE_ELF_AUX_INFO)
- unsigned long value;
-
-#ifdef __aarch64__
- return elf_aux_info(AT_HWCAP, &value, sizeof(value)) == 0 &&
- (value & HWCAP_CRC32) != 0;
-#else
- return elf_aux_info(AT_HWCAP2, &value, sizeof(value)) == 0 &&
- (value & HWCAP2_CRC32) != 0;
-#endif
-#elif defined(HAVE_GETAUXVAL)
-#ifdef __aarch64__
- return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
-#else
- return (getauxval(AT_HWCAP2) & HWCAP2_CRC32) != 0;
-#endif
-#elif defined(__NetBSD__)
- /*
- * On NetBSD we can read the Instruction Set Attribute Registers via
- * sysctl. For doubtless-historical reasons the sysctl interface is
- * completely different on 64-bit than 32-bit, but the underlying
- * registers contain the same fields.
- */
-#define ISAR0_CRC32_BITPOS 16
-#define ISAR0_CRC32_BITWIDTH 4
-#define WIDTHMASK(w) ((1 << (w)) - 1)
-#define SYSCTL_CPU_ID_MAXSIZE 64
-
- size_t len;
- uint64 sysctlbuf[SYSCTL_CPU_ID_MAXSIZE];
-#if defined(__aarch64__)
- /* We assume cpu0 is representative of all the machine's CPUs. */
- const char *path = "machdep.cpu0.cpu_id";
- size_t expected_len = sizeof(struct aarch64_sysctl_cpu_id);
-#define ISAR0 ((struct aarch64_sysctl_cpu_id *) sysctlbuf)->ac_aa64isar0
-#else
- const char *path = "machdep.id_isar";
- size_t expected_len = 6 * sizeof(int);
-#define ISAR0 ((int *) sysctlbuf)[5]
-#endif
- uint64 fld;
-
- /* Fetch the appropriate set of register values. */
- len = sizeof(sysctlbuf);
- memset(sysctlbuf, 0, len);
- if (sysctlbyname(path, sysctlbuf, &len, NULL, 0) != 0)
- return false; /* perhaps kernel is 64-bit and we aren't? */
- if (len != expected_len)
- return false; /* kernel API change? */
-
- /* Fetch the CRC32 field from ISAR0. */
- fld = (ISAR0 >> ISAR0_CRC32_BITPOS) & WIDTHMASK(ISAR0_CRC32_BITWIDTH);
-
- /*
- * Current documentation defines only the field values 0 (No CRC32) and 1
- * (CRC32B/CRC32H/CRC32W/CRC32X/CRC32CB/CRC32CH/CRC32CW/CRC32CX). Assume
- * that any future nonzero value will be a superset of 1.
- */
- return (fld != 0);
-#else
- return false;
-#endif
-}
-
-/*
- * Check if hardware instructions for CRC computation are available.
- */
-void
-pg_cpucap_crc32c(void)
-{
- if (pg_crc32c_armv8_available())
- pg_cpucap |= PGCPUCAP_CRC32C;
-}
-
-void
-pg_cpucap_clmul(void)
-{
- // WIP: does this even make sense?
-}
diff --git a/src/port/pg_cpucap_x86.c b/src/port/pg_cpucap_x86.c
deleted file mode 100644
index 3a62a3a582..0000000000
--- a/src/port/pg_cpucap_x86.c
+++ /dev/null
@@ -1,75 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * pg_cpucap_x86.c
- * Check if the CPU we're running on supports SSE4.2.
- *
- * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- *
- * IDENTIFICATION
- * src/port/pg_cpucap_x86.c
- *
- *-------------------------------------------------------------------------
- */
-
-#include "c.h"
-
-#ifdef HAVE__GET_CPUID
-#include <cpuid.h>
-#endif
-
-#ifdef HAVE__CPUID
-#include <intrin.h>
-#endif
-
-#include "port/pg_cpucap.h"
-
-static bool
-pg_sse42_available(void)
-{
- unsigned int exx[4] = {0, 0, 0, 0};
-
-#if defined(HAVE__GET_CPUID)
- __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
-#elif defined(HAVE__CPUID)
- __cpuid(exx, 1);
-#else
-#error cpuid instruction not available
-#endif
-
- return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
-}
-
-static bool
-pg_pclmul_available(void)
-{
- unsigned int exx[4] = {0, 0, 0, 0};
-
-#if defined(HAVE__GET_CPUID)
- __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
-#elif defined(HAVE__CPUID)
- __cpuid(exx, 1);
-#else
-#error cpuid instruction not available
-#endif
-
- return (exx[2] & (1 << 1)) != 0; /* PCLMUL */
-}
-
-/*
- * Check if hardware instructions for CRC computation are available.
- */
-void
-pg_cpucap_crc32c(void)
-{
- if (pg_sse42_available())
- pg_cpucap |= PGCPUCAP_CRC32C;
-}
-
-void
-pg_cpucap_clmul(void)
-{
- if (pg_pclmul_available())
- pg_cpucap |= PGCPUCAP_CLMUL;
-}
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index fc3cf0d088..7131b4a326 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -18,6 +18,7 @@
#include <wmmintrin.h>
#include "port/pg_crc32c.h"
+#include "port/pg_cpucap.h"
/* WIP: configure checks */
#ifdef __x86_64__
@@ -140,7 +141,7 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
const size_t orig_len PG_USED_FOR_ASSERTS_ONLY = len;
#ifdef HAVE_PCLMUL_RUNTIME
- if (len >= PCLMUL_THRESHOLD && (pg_cpucap & PGCPUCAP_CLMUL))
+ if (len >= PCLMUL_THRESHOLD && (pg_cpu_have(PG_CPU_FEATURE_PCLMUL)))
{
return pg_comp_crc32c_pclmul(crc, data, len);
}
diff --git a/src/port/pg_popcount_avx512.c b/src/port/pg_popcount_avx512.c
index dac895a0fc..7f5846b1bd 100644
--- a/src/port/pg_popcount_avx512.c
+++ b/src/port/pg_popcount_avx512.c
@@ -14,17 +14,10 @@
#ifdef USE_AVX512_POPCNT_WITH_RUNTIME_CHECK
-#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT)
-#include <cpuid.h>
-#endif
-
#include <immintrin.h>
-#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX)
-#include <intrin.h>
-#endif
-
#include "port/pg_bitutils.h"
+#include "port/pg_cpucap.h"
/*
* It's probably unlikely that TRY_POPCNT_FAST won't be set if we are able to
@@ -33,63 +26,6 @@
*/
#ifdef TRY_POPCNT_FAST
-/*
- * Does CPUID say there's support for XSAVE instructions?
- */
-static inline bool
-xsave_available(void)
-{
- unsigned int exx[4] = {0, 0, 0, 0};
-
-#if defined(HAVE__GET_CPUID)
- __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
-#elif defined(HAVE__CPUID)
- __cpuid(exx, 1);
-#else
-#error cpuid instruction not available
-#endif
- return (exx[2] & (1 << 27)) != 0; /* osxsave */
-}
-
-/*
- * Does XGETBV say the ZMM registers are enabled?
- *
- * NB: Caller is responsible for verifying that xsave_available() returns true
- * before calling this.
- */
-#ifdef HAVE_XSAVE_INTRINSICS
-pg_attribute_target("xsave")
-#endif
-static inline bool
-zmm_regs_available(void)
-{
-#ifdef HAVE_XSAVE_INTRINSICS
- return (_xgetbv(0) & 0xe6) == 0xe6;
-#else
- return false;
-#endif
-}
-
-/*
- * Does CPUID say there's support for AVX-512 popcount and byte-and-word
- * instructions?
- */
-static inline bool
-avx512_popcnt_available(void)
-{
- unsigned int exx[4] = {0, 0, 0, 0};
-
-#if defined(HAVE__GET_CPUID_COUNT)
- __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
-#elif defined(HAVE__CPUIDEX)
- __cpuidex(exx, 7, 0);
-#else
-#error cpuid instruction not available
-#endif
- return (exx[2] & (1 << 14)) != 0 && /* avx512-vpopcntdq */
- (exx[1] & (1 << 30)) != 0; /* avx512-bw */
-}
-
/*
* Returns true if the CPU supports the instructions required for the AVX-512
* pg_popcount() implementation.
@@ -97,9 +33,8 @@ avx512_popcnt_available(void)
bool
pg_popcount_avx512_available(void)
{
- return xsave_available() &&
- zmm_regs_available() &&
- avx512_popcnt_available();
+ return pg_cpu_have(PG_CPU_FEATURE_AVX512VPOPCNTDQ) &&
+ pg_cpu_have(PG_CPU_FEATURE_AVX512BW);
}
/*
--
2.43.0
On Thu, Feb 27, 2025 at 3:53 AM Devulapalli, Raghuveer
<raghuveer.devulapalli@intel.com> wrote:
I'm not a fan of exposing these architecture-specific details to places that consult
the capabilities. That requires things like
+#define PGCPUCAP_CRC32C pg_cpu_have(PG_CPU_FEATURE_SSE42)
[...]
+#define PGCPUCAP_CRC32C pg_cpu_have(PG_CPU_FEATURE_ARMV8_CRC32C)
I'd prefer to have 1 capability <-> one place to check at runtime for architectures
that need that, and to keep architecture details private to the initialization step.
Even for things that test for which function pointer to use, I think it's a cleaner
interface to look at one thing.
Isn't that one thing currently pg_cpu_have(FEATURE)?
No, I mean the one thing would be "do we have hardware CRC?", and each
architecture would set (and check if needed) the same slot.
I'm not 100% sure this is the better way, and there may be a
disadvantage I haven't thought of, but this makes the most sense to
me. I'm willing to hear reasons to do it differently, but I'm thinking
those reasons need to be weighed against the loss of abstraction.
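To make the "one slot per capability" idea concrete, here is a minimal sketch. Every name in it (the sketch_ prefix, the bit values, the probe argument) is hypothetical and is not taken from any posted patch; the architecture-private probe is passed in as a plain bool so that the example stays portable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bits: one slot per capability, not per ISA extension. */
#define SKETCH_CAP_INIT   (1u << 0)
#define SKETCH_CAP_CRC32C (1u << 1)

/* Starts at zero so that a missing initialization can be detected. */
static uint32_t sketch_cpucap = 0;

/*
 * Stand-in for the per-architecture initialization step. In a real build the
 * argument would come from an architecture-private probe (CPUID on x86,
 * getauxval() or similar on ARM); callers never see which one was used.
 */
static void
sketch_cap_initialize(bool arch_probe_found_hw_crc)
{
	sketch_cpucap = SKETCH_CAP_INIT;
	if (arch_probe_found_hw_crc)
		sketch_cpucap |= SKETCH_CAP_CRC32C;
}

/* Callers ask a single architecture-neutral question. */
static inline bool
sketch_have_hw_crc(void)
{
	return (sketch_cpucap & SKETCH_CAP_CRC32C) != 0;
}
```

With this shape the dispatch code only tests sketch_have_hw_crc(), and the check compiles down to a load and a mask rather than a call into another translation unit.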
The reason we have the additional PGCPUCAP_CRC32C is because the dispatch for CRC32C is currently defined in the header file common to both ARM and SSE42. We should ideally have separate dispatch for ARM and x86 in their own files (the way it is in the main branch).
Just so we're on the same page, the current separate files do
initialization, not dispatch. I'm seeing conflicting design goals
here:
1. Have multiple similar dispatch functions with the only difference
(for now) being certain names (not in the patch, just discussed).
2. Put initialization routines in the same file, even though they have
literally nothing in common.
This also makes it easier to add more implementations in the future without having to make the dispatch function work for both ARM and x86.
I don't quite understand, since with 0001 it already works (at least
in theory) for 3 architectures with hardware CRC, plus those without,
and I don't see it changing with more architectures.
+/* Access to pg_cpucap for modules that need runtime CPUID information */
+bool pg_cpu_have(int feature_id)
+{
+ return pg_cpucap[feature_id];
}
Putting this in a separate translation unit may have to do with the decision to
turn v8's #defines into an enum? In any case, this means doing a
function call to decide which function to call. That makes no sense.
The goal of v9/10 was to centralize the initialization from cpuid etc,
but it also did a major re-architecture of the runtime-checks at the
same time. That made review more difficult.
+#if defined(__x86_64__) || defined ...
+ pg_cpucap_x86();
+#elif defined(__aarch64__) || defined ...
+ pg_cpucap_arm();
+#endif
I view this as another conflict in design goals: After putting everything
in the same file, the routines still have different names and have to
be guarded accordingly. I don't really see the advantage, especially
since these guard symbols don't even match the ones that guard the
actual initialization steps. I tried to imply that in my last review,
but maybe I should have been more explicit.
I think the least painful step is to take the x86 initialization from
v10, which is looking great, but
- keep separate initialization files
- don't whack around the runtime representation, at least not in the same patch
--
John Naylor
Amazon Web Services
Hi John,
Going back to your earlier comment:
I'm not a fan of exposing these architecture-specific details to
places that consult the capabilities. That requires things like
+#define PGCPUCAP_CRC32C pg_cpu_have(PG_CPU_FEATURE_SSE42)
Isn't that one thing currently pg_cpu_have(FEATURE)?
No, I mean the one thing would be "do we have hardware CRC?", and each
architecture would set (and check if needed) the same slot.
I'm not 100% sure this is the better way, and there may be a disadvantage I
haven't thought of, but this makes the most sense to me. I'm willing to hear
reasons to do it differently, but I'm thinking those reasons need to be weighed
against the loss of abstraction.
IMO, CPU SIMD features (SSE4.2, AVX, etc.) are a module of their own, separate from the capabilities/features supported in Postgres (CRC32C, bitcount, etc.). Exposing architecture-specific details to the features that need to check against them makes the code more modular and reusable. It is reasonable to expect developers writing SIMD-specific functions to understand the runtime architecture requirements of those functions, which they can check with pg_cpu_have(PG_CPU_FEATURE_..). If another feature in Postgres requires SSE4.2, the CPU initialization module doesn't need to be modified at all; the developers only have to worry about their feature and its CPU requirements.
Just so we're on the same page, the current separate files do initialization, not
dispatch. I'm seeing conflicting design goals
here:
1. Have multiple similar dispatch functions with the only difference (for now)
being certain names (not in the patch, just discussed).
2. Put initialization routines in the same file, even though they have literally
nothing in common.
Sorry for not being clear. I will move the initialization routines to separate files. Just haven’t gotten to it yet with v10.
This also makes it easier to add more implementations in the future without
having to make the dispatch function work for both ARM and x86.
I don't quite understand, since with 0001 it already works (at least in theory) for 3
architectures with hardware CRC, plus those without, and I don't see it changing
with more architectures.
The reason it's working across all architectures as of now is that there is only one runtime check for the CRC32C feature across x86, ARM and LoongArch (PCLMUL dispatch happens inside the SSE4.2 function). If and when we add the AVX-512 implementation, won't the dispatch logic look different for x86 and ARM? Along with that, the header file pg_crc32c.h will also look a lot messier with a whole bunch of nested #ifdefs. IMO, the header file should be devoid of any architecture dispatch logic and simply contain declarations of the various implementations (see https://github.com/r-devulap/postgressql/blob/cf80f7375f24d2fb5cd29e95dc73d4988fc6d066/src/include/port/pg_crc32c.h for example). The dispatch logic should be handled separately in a C file.
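As a rough illustration of what dispatch kept in a C file could look like on x86, here is a table-driven sketch. All names, feature bits and the candidate list are hypothetical; a real table would also carry a function pointer per entry, and the requirements shown for an AVX-512 variant are an assumption, not something taken from a posted patch.

```c
#include <stddef.h>

/* Hypothetical feature bits, as a CPU probe might report them. */
enum
{
	SKETCH_FEAT_SSE42 = 1u << 0,
	SKETCH_FEAT_PCLMUL = 1u << 1,
	SKETCH_FEAT_AVX512 = 1u << 2
};

struct sketch_crc_impl
{
	const char *name;			/* label only; a real entry would hold a
								 * function pointer as well */
	unsigned	required;		/* all of these bits must be present */
};

/* Ordered from most to least preferred; the last entry needs nothing. */
static const struct sketch_crc_impl sketch_x86_impls[] = {
	{"avx512", SKETCH_FEAT_SSE42 | SKETCH_FEAT_PCLMUL | SKETCH_FEAT_AVX512},
	{"pclmul", SKETCH_FEAT_SSE42 | SKETCH_FEAT_PCLMUL},
	{"sse42", SKETCH_FEAT_SSE42},
	{"sb8", 0}
};

/* Return the first entry whose requirements are all met. */
static const struct sketch_crc_impl *
sketch_choose_crc_impl(unsigned features)
{
	size_t		n = sizeof(sketch_x86_impls) / sizeof(sketch_x86_impls[0]);

	for (size_t i = 0; i < n; i++)
	{
		unsigned	req = sketch_x86_impls[i].required;

		if ((features & req) == req)
			return &sketch_x86_impls[i];
	}
	return &sketch_x86_impls[n - 1];	/* not reached: "sb8" always matches */
}
```

The header would then only declare the implementations, and each architecture's C file would own a table like this one. Whether that is worth the extra indirection is exactly what is under discussion here.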
+/* Access to pg_cpucap for modules that need runtime CPUID information */
+bool pg_cpu_have(int feature_id)
+{
+ return pg_cpucap[feature_id];
}
Putting this in a separate translation unit may have to do with the decision to turn v8's
#defines into an enum? In any case, this means doing a function call to decide
which function to call. That makes no sense.
The reason it is a separate translation unit is that I intended pg_cpucap to be a read-only variable outside of pg_cpucap.c. If the function call overhead is not preferred, then I can get rid of it.
The goal of v9/10 was to centralize the initialization from cpuid etc, but it also did
a major re-architecture of the runtime-checks at the same time. That made review
more difficult.
I assumed we wanted to move the runtime checks to a central place and that needed this re-architecture.
+#if defined(__x86_64__) || defined ... + pg_cpucap_x86(); +#elif defined(__aarch64__) || defined ... + pg_cpucap_arm(); +#endifI view this another conflict in design goals: After putting everything in the same
file, the routines still have different names and have to be guarded accordingly. I
don't really see the advantage, especially since these guard symbols don't even
match the ones that guard the actual initialization steps. I tried to imply that in
my last review, but maybe I should have been more explicit.
I think the least painful step is to take the x86 initialization from v10, which is
looking great, but
- keep separate initialization files
- don't whack around the runtime representation, at least not in the same patch
Sure. I can fix that in v11.
Raghuveer
Hi Raghuveer,
You raised some interesting points, which deserve a thoughtful
response. After sleeping on it, however, I came to the conclusion that
a sweeping change in runtime checks, with either of our approaches,
has downsides and unresolved questions. Perhaps we can come back to it
at a later time. For this release cycle, I took a step back and tried
to think of the least invasive way to solve the immediate problem,
which is: How to allow existing builds with "-msse4.2" to take
advantage of CLMUL while not adding overhead. Here's what I came up
with in v11:
0001: same benchmark module as before
0002: For SSE4.2 builds, arrange so that constant input uses an
inlined path so that the compiler can emit unrolled loops anywhere.
This is particularly important for the WAL insertion lock, so this is
possibly committable on its own just for that.
0003: the PCLMUL path, only for runtime-check builds
0004: the PCLMUL path for SSE4.2 builds. This uses a function pointer
for long-ish input and the same above inlined path for short input
(whether constant or not). So it gets the best of both worlds.
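A minimal sketch of the dispatch shape described for 0004, assuming a portable bitwise CRC-32C stands in for both the inlined SSE 4.2 path and the out-of-line PCLMUL path. The names crc32c_small, crc32c_large, crc32c_choose, crc32c_ptr and crc32c_dispatch are hypothetical and are not taken from the patch; only the 64-byte threshold and the "chooser replaces the function pointer on first call" idea come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the inlined short-input path (bitwise CRC-32C). */
static inline uint32_t
crc32c_small(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    assert(p != NULL || len == 0);
    while (len--)
    {
        crc ^= *p++;
        for (int i = 0; i < 8; i++)
            crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
    }
    return crc;
}

/* Hypothetical stand-in for the out-of-line long-input (PCLMUL) path. */
static uint32_t
crc32c_large(uint32_t crc, const void *data, size_t len)
{
    return crc32c_small(crc, data, len);
}

static uint32_t crc32c_choose(uint32_t crc, const void *data, size_t len);

/* Starts out pointing at the chooser, which replaces it on first use. */
static uint32_t (*crc32c_ptr) (uint32_t, const void *, size_t) = crc32c_choose;

static uint32_t
crc32c_choose(uint32_t crc, const void *data, size_t len)
{
    /* A real chooser would consult CPUID here; this sketch always picks one. */
    crc32c_ptr = crc32c_large;
    return crc32c_ptr(crc, data, len);
}

/* Short inputs stay inline; longer ones pay for one indirect call. */
static inline uint32_t
crc32c_dispatch(uint32_t crc, const void *data, size_t len)
{
    if (len < 64)
        return crc32c_small(crc, data, len);
    return crc32c_ptr(crc, data, len);
}

/* Convenience wrapper: initialize, compute, finalize. */
static uint32_t
crc32c(const void *data, size_t len)
{
    return crc32c_dispatch(0xFFFFFFFFu, data, len) ^ 0xFFFFFFFFu;
}
```

With this shape a 20-byte constant-length call never touches the function pointer, while a 100-byte call goes through it and triggers the chooser exactly once.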
There is also a separate issue:
On Tue, Feb 25, 2025 at 6:05 PM John Naylor <johncnaylorls@gmail.com> wrote:
Another thing I found in Agner's manuals: AMD Zen, even as recently as
Zen 4, don't have as good a microarchitecture for PCLMUL, so if anyone
with such a machine would like to help test the cutoff
David Rowley shared some results off-list, which are: Zen 4 is very
good with this algorithm even at 64 bytes input length, but Zen 2
regresses up to maybe 256 bytes. Having a large cutoff to cover all
bases makes this less useful, and that was one of my reservations
about AVX-512. However, with the corsix generator I found it's
possible to specify AVX-512 with a single accumulator (rather than 4),
which still gives a minimum input of 64 bytes, so I'll plan to put
something together to demonstrate.
(Older machines could use the 3-way stream algorithm as a fallback on
long inputs, as I've mentioned elsewhere, assuming that's legally
unencumbered.)
--
John Naylor
Amazon Web Services
Attachments:
v11-0004-Use-runtime-check-even-when-we-have-SSE-4.2-at-c.patch (text/x-patch; charset=US-ASCII)
From f96ae8bf6913796c5d724ff9da25bbc79927ff1c Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Fri, 28 Feb 2025 18:27:40 +0700
Subject: [PATCH v11 4/4] Use runtime check even when we have SSE 4.2 at
compile time
This allows us to use PCLMUL for longer inputs. Short inputs are
inlined to avoid the indirection through a function pointer.
---
configure | 2 +-
configure.ac | 2 +-
src/include/port/pg_crc32c.h | 15 +++++++++++----
src/port/meson.build | 1 +
src/port/pg_crc32c_sse42_choose.c | 2 ++
5 files changed, 16 insertions(+), 6 deletions(-)
diff --git a/configure b/configure
index 93fddd69981..91c0ffc8272 100755
--- a/configure
+++ b/configure
@@ -17684,7 +17684,7 @@ if test x"$USE_SSE42_CRC32C" = x"1"; then
$as_echo "#define USE_SSE42_CRC32C 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2" >&5
$as_echo "SSE 4.2" >&6; }
else
diff --git a/configure.ac b/configure.ac
index b6d02f5ecc7..a85bdbd4ff6 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2151,7 +2151,7 @@ fi
AC_MSG_CHECKING([which CRC-32C implementation to use])
if test x"$USE_SSE42_CRC32C" = x"1"; then
AC_DEFINE(USE_SSE42_CRC32C, 1, [Define to 1 use Intel SSE 4.2 CRC instructions.])
- PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
AC_MSG_RESULT(SSE 4.2)
else
if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index fe0e1b6b275..26b676dddc9 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -55,22 +55,29 @@ typedef uint32 pg_crc32c;
((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+extern pg_crc32c pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t len);
+#endif
static inline
pg_crc32c
pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
{
- if (__builtin_constant_p(len) && len < 64)
+ if (len < 64)
{
/*
- * For small constant inputs, inline the computation. This allows the
- * compiler to unroll loops.
+ * For small inputs, inline the computation to avoid the runtime
+ * check. This also allows the compiler to unroll loops for constant
+ * input.
*/
return pg_comp_crc32c_sse42_inline(crc, data, len);
}
else
- return pg_comp_crc32c_sse42(crc, data, len);
+ /* For larger inputs, use a runtime check for PCLMUL instructions. */
+ return pg_comp_crc32c(crc, data, len);
}
#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
diff --git a/src/port/meson.build b/src/port/meson.build
index 7fcfa728d43..8d70a4d510e 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -83,6 +83,7 @@ replace_funcs_pos = [
# x86/x64
['pg_crc32c_sse42', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
+ ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index abea0f90eb3..89a48c76894 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -55,8 +55,10 @@ pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
pg_comp_crc32c = pg_comp_crc32c_pclmul;
#endif
}
+#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
else
pg_comp_crc32c = pg_comp_crc32c_sb8;
+#endif
return pg_comp_crc32c(crc, data, len);
}
--
2.48.1
v11-0003-Improve-CRC32C-performance-on-x86_64.patch (text/x-patch; charset=US-ASCII)
From 9bf92b1ee09810ae734e6463fe01ad9858fa7168 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 15:27:16 +0700
Subject: [PATCH v11 3/4] Improve CRC32C performance on x86_64
The current SSE4.2 implementation of CRC32C relies on the native
CRC32 instruction, which operates on 8 bytes at a time. We can get a
substantial speedup on longer inputs by using carryless multiplication
on SIMD registers, processing 64 bytes per loop iteration.
The PCLMULQDQ instruction has been widely available since 2011 (almost
as old as SSE 4.2), so this commit now requires that, as well as SSE
4.2, to build pg_crc32c_sse42.c.
The MIT-licensed implementation was generated with the "generate"
program from
https://github.com/corsix/fast-crc32/
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009
Author: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Author: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
---
src/include/port/pg_crc32c.h | 30 ++++++++---
src/port/pg_crc32c_sse42.c | 88 +++++++++++++++++++++++++++++++
src/port/pg_crc32c_sse42_choose.c | 26 ++++-----
3 files changed, 124 insertions(+), 20 deletions(-)
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 5ccc79295c0..fe0e1b6b275 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -37,6 +37,11 @@
typedef uint32 pg_crc32c;
+/* WIP: configure checks */
+#ifdef __x86_64__
+#define USE_PCLMUL_WITH_RUNTIME_CHECK
+#endif
+
/* The INIT and EQ macros are the same for all implementations. */
#define INIT_CRC32C(crc) ((crc) = 0xFFFFFFFF)
#define EQ_CRC32C(c1, c2) ((c1) == (c2))
@@ -68,6 +73,23 @@ pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
return pg_comp_crc32c_sse42(crc, data, len);
}
+#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
+
+/*
+ * Use Intel SSE 4.2 or PCLMUL instructions, but perform a runtime check first
+ * to check that they are available.
+ */
+#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c((crc), (data), (len)))
+#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
+extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+extern pg_crc32c pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t len);
+#endif
+
#elif defined(USE_ARMV8_CRC32C)
/* Use ARMv8 CRC Extension instructions. */
@@ -86,7 +108,7 @@ extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t le
extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
+#elif defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
/*
* Use Intel SSE 4.2 or ARMv8 instructions, but perform a runtime check first
@@ -98,13 +120,7 @@ extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
-
-#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#endif
-#ifdef USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK
extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
-#endif
#else
/*
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 6a35f7fdc67..b56da2f6934 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,6 +15,7 @@
#include "c.h"
#include <nmmintrin.h>
+#include <wmmintrin.h>
#include "port/pg_crc32c.h"
#include "port/pg_crc32c_sse42_impl.h"
@@ -26,3 +27,90 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
{
return pg_comp_crc32c_sse42_inline(crc, data, len);
}
+
+#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+
+/* Generated by https://github.com/corsix/fast-crc32/ using: */
+/* ./generate -i sse -p crc32c -a v4e */
+/* MIT licensed */
+
+#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
+#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
+
+pg_attribute_target("sse4.2,pclmul")
+pg_crc32c
+pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
+{
+ /* adjust names to match generated code */
+ pg_crc32c crc0 = crc;
+ size_t len = length;
+ const char *buf = data;
+
+ // This prolog is trying to avoid loads straddling
+ // cache lines, but it doesn't seem worth it if
+ // we're trying to be fast on small inputs as well
+#if 0
+ for (; len && ((uintptr_t) buf & 7); --len)
+ {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+ if (((uintptr_t) buf & 8) && len >= 8)
+ {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t *) buf);
+ buf += 8;
+ len -= 8;
+ }
+#endif
+ if (len >= 64)
+ {
+ const char *end = buf + len;
+ const char *limit = buf + len - 64;
+
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i *) buf),
+ y0;
+ __m128i x1 = _mm_loadu_si128((const __m128i *) (buf + 16)),
+ y1;
+ __m128i x2 = _mm_loadu_si128((const __m128i *) (buf + 32)),
+ y2;
+ __m128i x3 = _mm_loadu_si128((const __m128i *) (buf + 48)),
+ y3;
+ __m128i k;
+
+ k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 64;
+ /* Main loop. */
+ while (buf <= limit)
+ {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i *) buf)), x0 = _mm_xor_si128(x0, y0);
+ y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i *) (buf + 16))), x1 = _mm_xor_si128(x1, y1);
+ y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i *) (buf + 32))), x2 = _mm_xor_si128(x2, y2);
+ y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i *) (buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ buf += 64;
+ }
+
+ /* Reduce x0 ... x3 to just x0. */
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
+ y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
+ k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ len = end - buf;
+ }
+
+ return pg_comp_crc32c_sse42_inline(crc0, buf, len);
+}
+
+#endif
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 65dbc4d4249..abea0f90eb3 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -30,8 +30,12 @@
#include "port/pg_crc32c.h"
-static bool
-pg_crc32c_sse42_available(void)
+/*
+ * This gets called on the first call. It replaces the function pointer
+ * so that subsequent calls are routed directly to the chosen implementation.
+ */
+static pg_crc32c
+pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
{
unsigned int exx[4] = {0, 0, 0, 0};
@@ -43,18 +47,14 @@ pg_crc32c_sse42_available(void)
#error cpuid instruction not available
#endif
- return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
-}
-
-/*
- * This gets called on the first call. It replaces the function pointer
- * so that subsequent calls are routed directly to the chosen implementation.
- */
-static pg_crc32c
-pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
-{
- if (pg_crc32c_sse42_available())
+ if ((exx[2] & (1 << 20)) != 0) /* SSE 4.2 */
+ {
pg_comp_crc32c = pg_comp_crc32c_sse42;
+#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+ if ((exx[2] & (1 << 1)) != 0) /* PCLMUL */
+ pg_comp_crc32c = pg_comp_crc32c_pclmul;
+#endif
+ }
else
pg_comp_crc32c = pg_comp_crc32c_sb8;
--
2.48.1
v11-0002-Inline-CRC-computation-for-small-fixed-length-in.patch (text/x-patch; charset=US-ASCII)
From 7cd98c95678752dac778c3e4c36cc3ae8d93639d Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Fri, 28 Feb 2025 16:27:30 +0700
Subject: [PATCH v11 2/4] Inline CRC computation for small fixed-length input
---
src/include/port/pg_crc32c.h | 21 ++++++-
src/include/port/pg_crc32c_sse42_impl.h | 74 +++++++++++++++++++++++++
src/port/pg_crc32c_sse42.c | 46 +--------------
3 files changed, 96 insertions(+), 45 deletions(-)
create mode 100644 src/include/port/pg_crc32c_sse42_impl.h
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 65ebeacf4b1..5ccc79295c0 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -43,12 +43,31 @@ typedef uint32 pg_crc32c;
#if defined(USE_SSE42_CRC32C)
/* Use Intel SSE4.2 instructions. */
+
+#include "pg_crc32c_sse42_impl.h"
+
#define COMP_CRC32C(crc, data, len) \
- ((crc) = pg_comp_crc32c_sse42((crc), (data), (len)))
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+static inline
+pg_crc32c
+pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
+{
+ if (__builtin_constant_p(len) && len < 64)
+ {
+ /*
+ * For small constant inputs, inline the computation. This allows the
+ * compiler to unroll loops.
+ */
+ return pg_comp_crc32c_sse42_inline(crc, data, len);
+ }
+ else
+ return pg_comp_crc32c_sse42(crc, data, len);
+}
+
#elif defined(USE_ARMV8_CRC32C)
/* Use ARMv8 CRC Extension instructions. */
diff --git a/src/include/port/pg_crc32c_sse42_impl.h b/src/include/port/pg_crc32c_sse42_impl.h
new file mode 100644
index 00000000000..e10ad777618
--- /dev/null
+++ b/src/include/port/pg_crc32c_sse42_impl.h
@@ -0,0 +1,74 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_crc32c_sse42_impl.h
+ * Inline implementation of CRC computation using SSE 4.2
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/port/pg_crc32c_sse42_impl.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_CRC32C_SSE42_IMPL_H
+#define PG_CRC32C_SSE42_IMPL_H
+
+#include "c.h"
+
+#include <nmmintrin.h>
+
+pg_attribute_no_sanitize_alignment()
+pg_attribute_target("sse4.2")
+static inline
+pg_crc32c
+pg_comp_crc32c_sse42_inline(pg_crc32c crc, const void *data, size_t len)
+{
+ const unsigned char *p = data;
+ const unsigned char *pend = p + len;
+
+ /*
+ * Process eight bytes of data at a time.
+ *
+ * NB: We do unaligned accesses here. The Intel architecture allows that,
+ * and performance testing didn't show any performance gain from aligning
+ * the begin address.
+ */
+#ifdef __x86_64__
+ while (p + 8 <= pend)
+ {
+ crc = (uint32) _mm_crc32_u64(crc, *((const uint64 *) p));
+ p += 8;
+ }
+
+ /* Process remaining full four bytes if any */
+ if (p + 4 <= pend)
+ {
+ crc = _mm_crc32_u32(crc, *((const unsigned int *) p));
+ p += 4;
+ }
+#else
+
+ /*
+ * Process four bytes at a time. (The eight byte instruction is not
+ * available on the 32-bit x86 architecture).
+ */
+ while (p + 4 <= pend)
+ {
+ crc = _mm_crc32_u32(crc, *((const unsigned int *) p));
+ p += 4;
+ }
+#endif /* __x86_64__ */
+
+ /* Process any remaining bytes one at a time. */
+ while (p < pend)
+ {
+ crc = _mm_crc32_u8(crc, *p);
+ p++;
+ }
+
+ return crc;
+}
+
+#endif /* PG_CRC32C_SSE42_IMPL_H */
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df31..6a35f7fdc67 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -17,54 +17,12 @@
#include <nmmintrin.h>
#include "port/pg_crc32c.h"
+#include "port/pg_crc32c_sse42_impl.h"
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
pg_crc32c
pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
{
- const unsigned char *p = data;
- const unsigned char *pend = p + len;
-
- /*
- * Process eight bytes of data at a time.
- *
- * NB: We do unaligned accesses here. The Intel architecture allows that,
- * and performance testing didn't show any performance gain from aligning
- * the begin address.
- */
-#ifdef __x86_64__
- while (p + 8 <= pend)
- {
- crc = (uint32) _mm_crc32_u64(crc, *((const uint64 *) p));
- p += 8;
- }
-
- /* Process remaining full four bytes if any */
- if (p + 4 <= pend)
- {
- crc = _mm_crc32_u32(crc, *((const unsigned int *) p));
- p += 4;
- }
-#else
-
- /*
- * Process four bytes at a time. (The eight byte instruction is not
- * available on the 32-bit x86 architecture).
- */
- while (p + 4 <= pend)
- {
- crc = _mm_crc32_u32(crc, *((const unsigned int *) p));
- p += 4;
- }
-#endif /* __x86_64__ */
-
- /* Process any remaining bytes one at a time. */
- while (p < pend)
- {
- crc = _mm_crc32_u8(crc, *p);
- p++;
- }
-
- return crc;
+ return pg_comp_crc32c_sse42_inline(crc, data, len);
}
--
2.48.1
v11-0001-Add-a-Postgres-SQL-function-for-crc32c-benchmark.patch (text/x-patch; charset=US-ASCII)
From f176a0cfadcb0a9971746f23677de7d4278a5391 Mon Sep 17 00:00:00 2001
From: Paul Amonson <paul.d.amonson@intel.com>
Date: Mon, 6 May 2024 08:34:17 -0700
Subject: [PATCH v11 1/4] Add a Postgres SQL function for crc32c benchmarking
Add a drive_crc32c() function to use for benchmarking crc32c
computation. The function takes 2 arguments:
(1) count: num of times CRC32C is computed in a loop.
(2) num: #bytes in the buffer to calculate crc over.
XXX not for commit
Extracted from a patch by Raghuveer Devulapalli
---
contrib/meson.build | 1 +
contrib/test_crc32c/Makefile | 20 +++++++
contrib/test_crc32c/expected/test_crc32c.out | 57 ++++++++++++++++++++
contrib/test_crc32c/meson.build | 34 ++++++++++++
contrib/test_crc32c/sql/test_crc32c.sql | 3 ++
contrib/test_crc32c/test_crc32c--1.0.sql | 1 +
contrib/test_crc32c/test_crc32c.c | 47 ++++++++++++++++
contrib/test_crc32c/test_crc32c.control | 4 ++
8 files changed, 167 insertions(+)
create mode 100644 contrib/test_crc32c/Makefile
create mode 100644 contrib/test_crc32c/expected/test_crc32c.out
create mode 100644 contrib/test_crc32c/meson.build
create mode 100644 contrib/test_crc32c/sql/test_crc32c.sql
create mode 100644 contrib/test_crc32c/test_crc32c--1.0.sql
create mode 100644 contrib/test_crc32c/test_crc32c.c
create mode 100644 contrib/test_crc32c/test_crc32c.control
diff --git a/contrib/meson.build b/contrib/meson.build
index 1ba73ebd67a..06673db0625 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('test_crc32c')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/test_crc32c/Makefile b/contrib/test_crc32c/Makefile
new file mode 100644
index 00000000000..5b747c6184a
--- /dev/null
+++ b/contrib/test_crc32c/Makefile
@@ -0,0 +1,20 @@
+MODULE_big = test_crc32c
+OBJS = test_crc32c.o
+PGFILEDESC = "test"
+EXTENSION = test_crc32c
+DATA = test_crc32c--1.0.sql
+
+first: all
+
+# test_crc32c.o: CFLAGS+=-g
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_crc32c
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/test_crc32c/expected/test_crc32c.out b/contrib/test_crc32c/expected/test_crc32c.out
new file mode 100644
index 00000000000..dff6bb3133b
--- /dev/null
+++ b/contrib/test_crc32c/expected/test_crc32c.out
@@ -0,0 +1,57 @@
+CREATE EXTENSION test_crc32c;
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
+ drive_crc32c
+--------------
+ 532139994
+ 2103623867
+ 785984197
+ 2686825890
+ 3213049059
+ 3819630168
+ 1389234603
+ 534072900
+ 2930108140
+ 2496889855
+ 1475239611
+ 136366931
+ 3067402116
+ 2012717871
+ 3682416023
+ 2054270645
+ 1817339875
+ 4100939569
+ 1192727539
+ 3636976218
+ 369764421
+ 3161609879
+ 1067984880
+ 1235066769
+ 3138425899
+ 648132037
+ 4203750233
+ 1330187888
+ 2683521348
+ 1951644495
+ 2574090107
+ 3904902018
+ 3772697795
+ 1644686344
+ 2868962106
+ 3369218491
+ 3902689890
+ 3456411865
+ 141004025
+ 1504497996
+ 3782655204
+ 3544797610
+ 3429174879
+ 2524728016
+ 3935861181
+ 25498897
+ 692684159
+ 345705535
+ 2761600287
+ 2654632420
+ 3945991399
+(51 rows)
+
diff --git a/contrib/test_crc32c/meson.build b/contrib/test_crc32c/meson.build
new file mode 100644
index 00000000000..d7bec4ba1cb
--- /dev/null
+++ b/contrib/test_crc32c/meson.build
@@ -0,0 +1,34 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+test_crc32c_sources = files(
+ 'test_crc32c.c',
+)
+
+if host_system == 'windows'
+ test_crc32c_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_crc32c',
+ '--FILEDESC', 'test_crc32c - test code for crc32c library',])
+endif
+
+test_crc32c = shared_module('test_crc32c',
+ test_crc32c_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += test_crc32c
+
+install_data(
+ 'test_crc32c.control',
+ 'test_crc32c--1.0.sql',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'test_crc32c',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'test_crc32c',
+ ],
+ },
+}
diff --git a/contrib/test_crc32c/sql/test_crc32c.sql b/contrib/test_crc32c/sql/test_crc32c.sql
new file mode 100644
index 00000000000..95c6dfe4488
--- /dev/null
+++ b/contrib/test_crc32c/sql/test_crc32c.sql
@@ -0,0 +1,3 @@
+CREATE EXTENSION test_crc32c;
+
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
diff --git a/contrib/test_crc32c/test_crc32c--1.0.sql b/contrib/test_crc32c/test_crc32c--1.0.sql
new file mode 100644
index 00000000000..52b9772f908
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c--1.0.sql
@@ -0,0 +1 @@
+CREATE FUNCTION drive_crc32c (count int, num int) RETURNS bigint AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
new file mode 100644
index 00000000000..28bc42de314
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -0,0 +1,47 @@
+/* select drive_crc32c(1000000, 1024); */
+
+#include "postgres.h"
+#include "fmgr.h"
+#include "port/pg_crc32c.h"
+#include "common/pg_prng.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * drive_crc32c(count: int, num: int) returns bigint
+ *
+ * count is the number of loops to perform
+ *
+ * num is the number of bytes in the buffer to calculate
+ * crc32c over.
+ */
+PG_FUNCTION_INFO_V1(drive_crc32c);
+Datum
+drive_crc32c(PG_FUNCTION_ARGS)
+{
+ int64 count = PG_GETARG_INT32(0);
+ int64 num = PG_GETARG_INT32(1);
+ char* data = malloc((size_t)num);
+ pg_crc32c crc;
+ pg_prng_state state;
+ uint64 seed = 42;
+ pg_prng_seed(&state, seed);
+ /* set random data */
+ for (uint64 i = 0; i < num; i++)
+ {
+ data[i] = pg_prng_uint32(&state) % 255;
+ }
+
+ INIT_CRC32C(crc);
+
+ while(count--)
+ {
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, data, num);
+ FIN_CRC32C(crc);
+ }
+
+ free((void *)data);
+
+ PG_RETURN_INT64((int64_t)crc);
+}
diff --git a/contrib/test_crc32c/test_crc32c.control b/contrib/test_crc32c/test_crc32c.control
new file mode 100644
index 00000000000..878a077ee18
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.control
@@ -0,0 +1,4 @@
+comment = 'test'
+default_version = '1.0'
+module_pathname = '$libdir/test_crc32c'
+relocatable = true
--
2.48.1
On Fri, Feb 28, 2025 at 07:11:29PM +0700, John Naylor wrote:
0002: For SSE4.2 builds, arrange so that constant input uses an
inlined path so that the compiler can emit unrolled loops anywhere.
This is particularly important for the WAL insertion lock, so this is
possibly committable on its own just for that.
Nice.
0004: the PCLMUL path for SSE4.2 builds. This uses a function pointer
for long-ish input and the same above inlined path for short input
(whether constant or not). So it gets the best of both worlds.
I spent some time staring at pg_crc32c.h with all these patches applied, and
IIUC it leads to the following behavior:
* For compiled-in SSE 4.2 builds, we branch based on the length. For
smaller inputs, we are using an inlined version of the SSE 4.2 code.
For larger inputs, we call a function pointer so that we can potentially
use the PCLMUL version. This could potentially lead to a small
regression for machines with SSE 4.2 but not PCLMUL, but that may be
uncommon enough at this point to not worry about.
* For runtime-check SSE 4.2 builds, we choose slicing-by-8, SSE 4.2, or
SSE 4.2 with PCLMUL, and we always use a function pointer.
The main question I have is whether we can simplify this by always using a
runtime check and by inlining slicing-by-8 for small inputs. That would be
dependent on the performance of slicing-by-8 and SSE 4.2 being comparable
for small inputs.
Overall, I wish we could avoid splitting things into separate files and
adding more header file gymnastics, but maybe there isn't much better we
can do without overhauling the CPU feature detection code.
--
nathan
Hi John,
You raised some interesting points, which deserve a thoughtful response. After
sleeping on it, however, I came to the conclusion that a sweeping change in
runtime checks, with either of our approaches, has downsides and unresolved
questions. Perhaps we can come back to it at a later time.
Sounds good to me. I did get derailed into something beyond the scope of this patch. I will make a separate proposal once this is merged.
Here's what I came up with in v11:
Some feedback on v11:
if ((exx[2] & (1 << 20)) != 0) /* SSE 4.2 */
{
pg_comp_crc32c = pg_comp_crc32c_sse42;
#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
if ((exx[2] & (1 << 1)) != 0) /* PCLMUL */
pg_comp_crc32c = pg_comp_crc32c_pclmul;
#endif
}
#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
else
pg_comp_crc32c = pg_comp_crc32c_sb8;
#endif
Is the #ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK at the right place? Shouldn’t it guard SSE4.2 function pointer assignment?
/* WIP: configure checks */
#ifdef __x86_64__
#define USE_PCLMUL_WITH_RUNTIME_CHECK
#endif
Minor consideration: gcc 4.3 (released in 2011) is the only compiler that supports -msse4.2 and not -mpclmul. gcc >= 4.4 supports both. If you are okay causing a regression on gcc4.3, we could combine USE_PCLMUL_WITH_RUNTIME_CHECK with USE_SSE42_CRC32C_WITH_RUNTIME_CHECK into a single macro to reduce the number of #ifdef's in the codebase and simplify configure/meson compiler checks.
0001: same benchmark module as before
0002: For SSE4.2 builds, arrange so that constant input uses an inlined path so
that the compiler can emit unrolled loops anywhere.
When building with meson, it looks like we build with -O2 and that is not enough for the compiler to unroll the SSE42 CRC32C loop. It requires -O3 or -O2 with -funroll-loops (see https://gcc.godbolt.org/z/4Eaq981aT). Perhaps we should check disassembly to see if the unroll is really happening on constant input?
Also, the reason you have pg_comp_crc32c_sse42_inline defined separately in a header file is because you want to (a) inline the function and (b) unroll for constant inputs. Couldn't both of these be achieved by adding __attribute__((always_inline)) to the pg_comp_crc32c_sse42 function, together with the -funroll-loops compiler flag?
Raghuveer
On Tue, Mar 4, 2025 at 2:11 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
I spent some time staring at pg_crc32c.h with all these patches applied, and
IIUC it leads to the following behavior:
* For compiled-in SSE 4.2 builds, we branch based on the length. For
smaller inputs, we are using an inlined version of the SSE 4.2 code.
For larger inputs, we call a function pointer so that we can potentially
use the PCLMUL version.
Right. For WAL, my hope is that the inlined path would balance out the
path with the function pointer, particularly since the computation for
the 20-byte header would be both inlined and unrolled, as I see here
in XLogInsertRecord():
crc32 rax,QWORD PTR [rsi]
crc32 rax,rbx ; <- newly calculated xl_prev
crc32 eax,DWORD PTR [rsi+0x10]
This could potentially lead to a small
regression for machines with SSE 4.2 but not PCLMUL, but that may be
uncommon enough at this point to not worry about.
Note also upthread I mentioned we may have to go to 512-bit pclmul,
since Zen 2 regresses on 128-bit. :-(
I actually haven't seen any measurable difference with direct calls
versus indirect, but it could very well be that the microbenchmark is
hiding that since it's doing something unnatural by calling things a
bunch of times in a loop. I want to try changing the benchmark to base
the address it's computing on some bits from the crc from the last
loop iteration. I think that would make it more latency-sensitive. We
could also make it do an additional constant 20-byte input every time
to make it resemble WAL more closely.
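A rough sketch of the benchmark change floated above, assuming a bitwise CRC-32C as the function under test. The names crc32c_ref and bench_dependent are hypothetical. Each iteration picks its start offset from the previous iteration's CRC, so successive computations form a dependency chain rather than being independent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical function under test: plain bitwise CRC-32C. */
static uint32_t
crc32c_ref(uint32_t crc, const unsigned char *p, size_t len)
{
    while (len--)
    {
        crc ^= *p++;
        for (int i = 0; i < 8; i++)
            crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
    }
    return crc;
}

/*
 * Compute `count` CRCs of `len` bytes each over a buffer of `bufsize` bytes.
 * The start offset of every iteration is derived from the previous CRC, so
 * the CPU cannot overlap one computation with the next.
 */
static uint32_t
bench_dependent(const unsigned char *buf, size_t bufsize, size_t len, int count)
{
    uint32_t    crc = 0;
    size_t      window;

    assert(len <= bufsize);
    window = bufsize - len;     /* valid start offsets are 0 .. window */

    while (count-- > 0)
    {
        size_t      off = crc % (window + 1);

        crc = crc32c_ref(0xFFFFFFFFu, buf + off, len) ^ 0xFFFFFFFFu;
    }
    return crc;
}
```

Adding a fixed 20-byte computation inside the loop, as suggested, would be a one-line extension of the same idea.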
* For runtime-check SSE 4.2 builds, we choose slicing-by-8, SSE 4.2, or
SSE 4.2 with PCLMUL, and we always use a function pointer.
The main question I have is whether we can simplify this by always using a
runtime check and by inlining slicing-by-8 for small inputs. That would be
dependent on the performance of slicing-by-8 and SSE 4.2 being comparable
for small inputs.
Slicing-by-8 needs one lookup and one XOR per byte of input, and other
overheads, so I think it would still be very slow.
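To make the per-byte cost concrete, here is a minimal sketch of a table-driven CRC-32C. It is the simple one-table variant, not the slicing-by-8 code in pg_crc32c_sb8.c, and the names are hypothetical; slicing-by-8 uses eight such tables to handle eight bytes per step, but the work is still roughly one lookup and one XOR per input byte.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc32c_table[256];
static int	crc32c_table_ready = 0;

/* Build the 256-entry table for the reflected CRC-32C polynomial. */
static void
crc32c_build_table(void)
{
	for (uint32_t n = 0; n < 256; n++)
	{
		uint32_t	c = n;

		for (int k = 0; k < 8; k++)
			c = (c & 1u) ? (c >> 1) ^ 0x82F63B78u : (c >> 1);
		crc32c_table[n] = c;
	}
	crc32c_table_ready = 1;
}

/* One table lookup and one XOR for every byte of input. */
static uint32_t
crc32c_table_driven(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	if (!crc32c_table_ready)
		crc32c_build_table();
	while (len--)
		crc = crc32c_table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
	return crc;
}
```

By comparison, the hardware crc32 instruction consumes eight bytes in a single instruction, which is why the table-based path is expected to lose even on short inputs.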
Overall, I wish we could avoid splitting things into separate files and
adding more header file gymnastics, but maybe there isn't much better we
can do without overhauling the CPU feature detection code.
Yeah, it seems all ideas so far have something unattractive about them. :-(
--
John Naylor
Amazon Web Services
On Tue, Mar 4, 2025 at 5:41 AM Devulapalli, Raghuveer
<raghuveer.devulapalli@intel.com> wrote:
Some feedback on v11:
if ((exx[2] & (1 << 20)) != 0) /* SSE 4.2 */
{
pg_comp_crc32c = pg_comp_crc32c_sse42;
#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
if ((exx[2] & (1 << 1)) != 0) /* PCLMUL */
pg_comp_crc32c = pg_comp_crc32c_pclmul;
#endif
}
#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
else
pg_comp_crc32c = pg_comp_crc32c_sb8;
#endif
Is the #ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK at the right place? Shouldn’t it guard SSE4.2 function pointer assignment?
Without this guard, SSE builds will complain during link time of an
undefined reference to pg_comp_crc32c_sb8. We could instead just build
that file everywhere for simplicity, but it'll just take up space and
never get called. (Maybe that's okay because with the
runtime-branching approach we experimented with earlier, it would
always have to be built.)
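A minimal sketch of the choose-on-first-call pattern under discussion, assuming stub implementations and a fake ECX value in place of a real cpuid call (all names here are hypothetical). The bit positions are the CPUID leaf 1 ECX feature bits used in the snippet above: bit 20 for SSE 4.2 and bit 1 for PCLMULQDQ.

```c
#include <stddef.h>
#include <stdint.h>

#define CPUID1_ECX_PCLMUL (1u << 1)
#define CPUID1_ECX_SSE42  (1u << 20)

typedef enum { IMPL_SB8, IMPL_SSE42, IMPL_PCLMUL } crc_impl;
typedef uint32_t (*crc_fn) (uint32_t crc, const void *data, size_t len);

/* Pure decision logic: PCLMUL is only considered when SSE 4.2 is present. */
static crc_impl
choose_crc_impl(uint32_t ecx)
{
	if (ecx & CPUID1_ECX_SSE42)
	{
		if (ecx & CPUID1_ECX_PCLMUL)
			return IMPL_PCLMUL;
		return IMPL_SSE42;
	}
	return IMPL_SB8;
}

/* Assumption: stands in for the ECX output of cpuid leaf 1. */
static uint32_t fake_ecx = CPUID1_ECX_SSE42 | CPUID1_ECX_PCLMUL;
static int	calls_to_choose = 0;
static crc_impl last_impl = IMPL_SB8;

/* Stubs that only record which implementation ran. */
static uint32_t
run_sb8(uint32_t crc, const void *d, size_t n)
{
	(void) d; (void) n; last_impl = IMPL_SB8; return crc;
}

static uint32_t
run_sse42(uint32_t crc, const void *d, size_t n)
{
	(void) d; (void) n; last_impl = IMPL_SSE42; return crc;
}

static uint32_t
run_pclmul(uint32_t crc, const void *d, size_t n)
{
	(void) d; (void) n; last_impl = IMPL_PCLMUL; return crc;
}

static uint32_t crc_choose(uint32_t crc, const void *data, size_t len);

/* The pointer starts at the chooser and is replaced on the first call. */
static crc_fn crc_ptr = crc_choose;

static uint32_t
crc_choose(uint32_t crc, const void *data, size_t len)
{
	calls_to_choose++;
	switch (choose_crc_impl(fake_ecx))
	{
		case IMPL_PCLMUL:
			crc_ptr = run_pclmul;
			break;
		case IMPL_SSE42:
			crc_ptr = run_sse42;
			break;
		default:
			crc_ptr = run_sb8;
			break;
	}
	return crc_ptr(crc, data, len);
}
```

In a build where SSE 4.2 is guaranteed at compile time, the fallback branch is unreachable, which is why the patch wraps it in an #ifdef instead of linking the slicing-by-8 file.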
When building with meson, it looks like we build with -O2 and that is not enough for the compiler to unroll the SSE42 CRC32C loop. It requires -O3 or -O2 with -funroll-loops (see https://gcc.godbolt.org/z/4Eaq981aT). Perhaps we should check disassembly to see if the unroll is really happening on constant input?
That example uses 32 bytes -- fiddling with it a bit, for 31 or less
it gets unrolled. The case we care about most is currently 20 bytes.
We don't want to force unrolling loops in the general case -- that's
normally used for longer input and this is for short input.
Also, the reason you have pg_comp_crc32c_sse42_inline defined separately in a header file is because you want to (a) inline the function and (b) unroll for constant inputs. Couldn't both of these be achieved by adding __attribute__((always_inline)) to the pg_comp_crc32c_sse42 function, together with the -funroll-loops compiler flag?
And (c) prevent it from being inlined in builds that need a runtime check.
I briefly tried the attribute approach and it doesn't work for me. If
you can get it to work, go ahead and share how that's done, but keep
in mind that we're not gcc/clang only -- it also has to work for
MSVC's "__forceinline"...
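For reference, a hedged sketch of what a compiler-portable force-inline macro can look like, paired with the header-body-plus-wrapper split the patch uses. SKETCH_ALWAYS_INLINE and the sum_bytes functions are hypothetical names for illustration; this shows the shape of the approach only and does not address point (c) above.

```c
#include <stddef.h>
#include <stdint.h>

/* Pick the strongest inline hint the compiler offers. */
#if defined(_MSC_VER)
#define SKETCH_ALWAYS_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#define SKETCH_ALWAYS_INLINE __attribute__((always_inline)) inline
#else
#define SKETCH_ALWAYS_INLINE inline
#endif

/* Body lives in a header so callers with a constant len can be unrolled. */
static SKETCH_ALWAYS_INLINE uint32_t
sum_bytes_inline(uint32_t acc, const void *data, size_t len)
{
	const unsigned char *p = data;

	for (size_t i = 0; i < len; i++)
		acc += p[i];
	return acc;
}

/*
 * Out-of-line wrapper: the symbol a function pointer would target.  It is
 * static here only to keep the sketch in one file; in a real layout it
 * would be an extern function in its own .c file.
 */
static uint32_t
sum_bytes(uint32_t acc, const void *data, size_t len)
{
	return sum_bytes_inline(acc, data, len);
}
```

Keeping the wrapper separate means runtime-check builds can call through the pointer without ever seeing the inline body.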
--
John Naylor
Amazon Web Services
On Tue, Mar 04, 2025 at 12:09:09PM +0700, John Naylor wrote:
On Tue, Mar 4, 2025 at 2:11 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
This could potentially lead to a small regression for machines with SSE
4.2 but not PCLMUL, but that may be uncommon enough at this point to not
worry about.
Note also upthread I mentioned we may have to go to 512-bit pclmul,
since Zen 2 regresses on 128-bit. :-(
Ah, okay. You mean the AVX-512 version [0]? And are you thinking we'd use
the same strategy for the compiled-in-SSE4.2 builds, i.e., inline the
SSE4.2 version for small inputs and use a function pointer for larger ones?
I actually haven't seen any measurable difference with direct calls
versus indirect, but it could very well be that the microbenchmark is
hiding that since it's doing something unnatural by calling things a
bunch of times in a loop. I want to try changing the benchmark to base
the address it's computing on some bits from the crc from the last
loop iteration. I think that would make it more latency-sensitive. We
could also make it do an additional constant 20-byte input every time
to make it resemble WAL more closely.
Looking back on some old benchmarks for small-ish inputs [0], the
difference does seem within the noise range. I suppose these functions
might be expensive enough to make the function pointer overhead negligible.
IME there's a big difference when a function pointer is used for an
instruction or two [2], but even relatively small inputs to the CRC-32C
functions might require several instructions.
The main question I have is whether we can simplify this by always using a
runtime check and by inlining slicing-by-8 for small inputs. That would be
dependent on the performance of slicing-by-8 and SSE 4.2 being comparable
for small inputs.
Slicing-by-8 needs one lookup and one XOR per byte of input, and other
overheads, so I think it would still be very slow.
That's my suspicion, too.
[0]: /messages/by-id/BL1PR11MB530401FA7E9B1CA432CF9DC3DC192@BL1PR11MB5304.namprd11.prod.outlook.com
[1]: /messages/by-id/20231031033601.GA68409@nathanxps13
[2]: /messages/by-id/CAApHDvqyMNGVgwpaOPtENdq5uEMR=vSkRJEgG1S+X7Vtk1-EnA@mail.gmail.com
--
nathan
On Wed, Mar 5, 2025 at 12:36 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Tue, Mar 04, 2025 at 12:09:09PM +0700, John Naylor wrote:
On Tue, Mar 4, 2025 at 2:11 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
This could potentially lead to a small regression for machines with SSE
4.2 but not PCLMUL, but that may be uncommon enough at this point to not
worry about.
Note also upthread I mentioned we may have to go to 512-bit pclmul,
since Zen 2 regresses on 128-bit. :-(
Ah, okay. You mean the AVX-512 version [0]?
Right, except not that version, rather a more efficient way and with
only one accumulator, so still a minimum length of 64 bytes. I'll
share that once we have agreement on detection/dispatch.
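For context, the PCLMUL paths in this thread are built on carryless multiplication: an ordinary shift-and-add multiply in which the additions are XORs, so no carries propagate. The sketch below is a portable 32x32 to 64-bit version with a hypothetical name; the PCLMULQDQ instruction performs the same operation on 64-bit operands with a 128-bit result.

```c
#include <stdint.h>

/* Multiply two polynomials over GF(2): shift and XOR instead of shift and add. */
static uint64_t
clmul32(uint32_t a, uint32_t b)
{
	uint64_t	result = 0;

	for (int i = 0; i < 32; i++)
	{
		if ((b >> i) & 1u)
			result ^= (uint64_t) a << i;
	}
	return result;
}
```

For example, clmul32(3, 3) is 5 rather than 9, because (x + 1)^2 = x^2 + 1 when the coefficients are taken modulo 2.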
And are you thinking we'd use
the same strategy for the compiled-in-SSE4.2 builds, i.e., inline the
SSE4.2 version for small inputs and use a function pointer for larger ones?
Yes. Although, we may not even have to inline for non-constant input,
see below. Inlining loops does take binary space.
I actually haven't seen any measurable difference with direct calls
versus indirect, but it could very well be that the microbenchmark is
hiding that since it's doing something unnatural by calling things a
bunch of times in a loop. I want to try changing the benchmark to base
the address it's computing on some bits from the crc from the last
loop iteration. I think that would make it more latency-sensitive. We
could also make it do an additional constant 20-byte input every time
to make it resemble WAL more closely.
Looking back on some old benchmarks for small-ish inputs [0], the
difference does seem within the noise range. I suppose these functions
might be expensive enough to make the function pointer overhead negligible.
IME there's a big difference when a function pointer is used for an
instruction or two [2], but even relatively small inputs to the CRC-32C
functions might require several instructions.
That was my hunch too, but I wanted to be more sure, so I modified the
benchmark so it doesn't know the address of the next calculation until
it finishes the last calculation so we can hopefully see the latency
caused by indirection. It also does an additional calculation on
constant 20 bytes, like the WAL header. I also tweaked the length each
iteration so the branch predictor maybe has a harder time predicting
the constant 20 input. And to make it more challenging, I removed the
part that inlined all small inputs, so it inlines only constant
inputs:
0001+0002 (test only)
func pointer:
32
latency average = 24.021 ms
latency average = 24.020 ms
latency average = 23.733 ms
40
latency average = 25.018 ms
latency average = 24.253 ms
latency average = 24.278 ms
48
latency average = 25.437 ms
latency average = 24.817 ms
latency average = 24.793 ms
SSE4.2 build (direct func):
32
latency average = 22.422 ms
latency average = 22.387 ms
latency average = 22.391 ms
40
latency average = 23.444 ms
latency average = 22.887 ms
latency average = 22.988 ms
48
latency average = 23.432 ms
latency average = 23.380 ms
latency average = 23.384 ms
0001-0006
SSE 4.2 build (inlined constant / otherwise func pointer)
32
latency average = 22.135 ms
latency average = 21.874 ms
latency average = 21.910 ms
40
latency average = 22.916 ms
latency average = 23.086 ms
latency average = 22.422 ms
48
latency average = 23.255 ms
latency average = 22.780 ms
latency average = 22.804 ms
These are still a bit noisy, and close, but, it seems there is no
penalty in using the function pointer as long as the header
calculation is inlined.
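The benchmark loop described above can be sketched standalone as follows. This is a simplified illustration that assumes a bitwise software CRC in place of the COMP_CRC32C macro and uses hypothetical function names; the structure mirrors the 0002 patch, where the low bits of the previous result choose the next pointer offset and length.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C update; stands in for COMP_CRC32C in this sketch. */
static uint32_t
crc32c_bitwise(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
	{
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/*
 * Each iteration depends on the previous crc, so the next address and
 * length are unknown until the last computation finishes.  The buffer
 * must hold at least num + 14 bytes (and at least 27 bytes), since both
 * the offset and the extra length can be as large as 7.
 */
static uint32_t
drive_dependent(const unsigned char *data, size_t num, int count)
{
	uint32_t	crc = 0;

	while (count--)
	{
		size_t		delta = crc & 7;

		/* variable-length input at an unpredictable offset */
		crc = crc32c_bitwise(crc, data + delta, num + delta);
		/* constant 20-byte input, simulating the WAL record header */
		crc = crc32c_bitwise(crc, data + delta, 20);
	}
	return crc ^ 0xFFFFFFFFu;
}
```

Carrying the crc across iterations, instead of reinitializing it each time, is what creates the data dependency that exposes call latency.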
--
John Naylor
Amazon Web Services
On Wed, Mar 05, 2025 at 08:51:21AM +0700, John Naylor wrote:
That was my hunch too, but I wanted to be more sure, so I modified the
benchmark so it doesn't know the address of the next calculation until
it finishes the last calculation so we can hopefully see the latency
caused by indirection. It also does an additional calculation on
constant 20 bytes, like the WAL header. I also tweaked the length each
iteration so the branch predictor maybe has a harder time predicting
the constant 20 input. And to make it more challenging, I removed the
part that inlined all small inputs, so it inlines only constant
inputs:
Would you mind sharing this test? It sounds like you are running a
workload with a mix of constant/inlined calls and function pointer calls to
simulate typical usage for WAL, but I'm not 100% sure I'm understanding you
correctly.
These are still a bit noisy, and close, but, it seems there is no
penalty in using the function pointer as long as the header
calculation is inlined.
These results look promising.
--
nathan
On Wed, Mar 5, 2025 at 10:52 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Wed, Mar 05, 2025 at 08:51:21AM +0700, John Naylor wrote:
That was my hunch too, but I wanted to be more sure, so I modified the
benchmark so it doesn't know the address of the next calculation until
it finishes the last calculation so we can hopefully see the latency
caused by indirection. It also does an additional calculation on
constant 20 bytes, like the WAL header. I also tweaked the length each
iteration so the branch predictor maybe has a harder time predicting
the constant 20 input. And to make it more challenging, I removed the
part that inlined all small inputs, so it inlines only constant
inputs:
Would you mind sharing this test?
The test script is the same as here, except I only ran small lengths:
/messages/by-id/CANWCAZahvhE-+htZiUyzPiS5e45ukx5877mD-dHr-KSX6LcdjQ@mail.gmail.com
...but I must have forgotten to attach the slightly tweaked patch set,
which I've done now. 0002 modifies the 0001 test module and 0006
reverts inlining non-constant input from 0005, just to see if I could
find a regression from indirection, which I didn't. If we don't need
it, it'd be better to avoid inlining loops to keep from bloating the
binary.
It sounds like you are running a
workload with a mix of constant/inlined calls and function pointer calls to
simulate typical usage for WAL, but I'm not 100% sure I'm understanding you
correctly.
Exactly.
--
John Naylor
Amazon Web Services
Attachments:
v12-0006-Only-inline-for-constant-input-partial-revert.patch (text/x-patch; charset=US-ASCII)
From c5cd6e44028eaf11efc2cf4fc49c87101b49c97f Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 5 Mar 2025 08:21:54 +0700
Subject: [PATCH v12 6/6] Only inline for constant input (partial revert)
---
src/include/port/pg_crc32c.h | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 26b676dddc9..01192831ca3 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -66,12 +66,11 @@ static inline
pg_crc32c
pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
{
- if (len < 64)
+ if (__builtin_constant_p(len) && len < 64)
{
/*
- * For small inputs, inline the computation to avoid the runtime
- * check. This also allows the compiler to unroll loops for constant
- * input.
+ * For small constant inputs, inline the computation. This allows the
+ * compiler to unroll loops.
*/
return pg_comp_crc32c_sse42_inline(crc, data, len);
}
--
2.48.1
v12-0004-Improve-CRC32C-performance-on-x86_64.patch (text/x-patch; charset=US-ASCII)
From eafea75fc761fd51fa67311af794cf0f7dec40aa Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 15:27:16 +0700
Subject: [PATCH v12 4/6] Improve CRC32C performance on x86_64
The current SSE4.2 implementation of CRC32C relies on the native
CRC32 instruction, which operates on 8 bytes at a time. We can get a
substantial speedup on longer inputs by using carryless multiplication
on SIMD registers, processing 64 bytes per loop iteration.
The PCLMULQDQ instruction has been widely available since 2011 (almost
as old as SSE 4.2), so this commit now requires that, as well as SSE
4.2, to build pg_crc32c_sse42.c.
The MIT-licensed implementation was generated with the "generate"
program from
https://github.com/corsix/fast-crc32/
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009
Author: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Author: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
---
src/include/port/pg_crc32c.h | 30 ++++++++---
src/port/pg_crc32c_sse42.c | 88 +++++++++++++++++++++++++++++++
src/port/pg_crc32c_sse42_choose.c | 26 ++++-----
3 files changed, 124 insertions(+), 20 deletions(-)
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 5ccc79295c0..fe0e1b6b275 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -37,6 +37,11 @@
typedef uint32 pg_crc32c;
+/* WIP: configure checks */
+#ifdef __x86_64__
+#define USE_PCLMUL_WITH_RUNTIME_CHECK
+#endif
+
/* The INIT and EQ macros are the same for all implementations. */
#define INIT_CRC32C(crc) ((crc) = 0xFFFFFFFF)
#define EQ_CRC32C(c1, c2) ((c1) == (c2))
@@ -68,6 +73,23 @@ pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
return pg_comp_crc32c_sse42(crc, data, len);
}
+#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
+
+/*
+ * Use Intel SSE 4.2 or PCLMUL instructions, but perform a runtime check first
+ * to check that they are available.
+ */
+#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c((crc), (data), (len)))
+#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
+extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+extern pg_crc32c pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t len);
+#endif
+
#elif defined(USE_ARMV8_CRC32C)
/* Use ARMv8 CRC Extension instructions. */
@@ -86,7 +108,7 @@ extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t le
extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
+#elif defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
/*
* Use Intel SSE 4.2 or ARMv8 instructions, but perform a runtime check first
@@ -98,13 +120,7 @@ extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
-
-#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#endif
-#ifdef USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK
extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
-#endif
#else
/*
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 6a35f7fdc67..b56da2f6934 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,6 +15,7 @@
#include "c.h"
#include <nmmintrin.h>
+#include <wmmintrin.h>
#include "port/pg_crc32c.h"
#include "port/pg_crc32c_sse42_impl.h"
@@ -26,3 +27,90 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
{
return pg_comp_crc32c_sse42_inline(crc, data, len);
}
+
+#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+
+/* Generated by https://github.com/corsix/fast-crc32/ using: */
+/* ./generate -i sse -p crc32c -a v4e */
+/* MIT licensed */
+
+#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
+#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
+
+pg_attribute_target("sse4.2,pclmul")
+pg_crc32c
+pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
+{
+ /* adjust names to match generated code */
+ pg_crc32c crc0 = crc;
+ size_t len = length;
+ const char *buf = data;
+
+ // This prolog is trying to avoid loads straddling
+ // cache lines, but it doesn't seem worth it if
+ // we're trying to be fast on small inputs as well
+#if 0
+ for (; len && ((uintptr_t) buf & 7); --len)
+ {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+ if (((uintptr_t) buf & 8) && len >= 8)
+ {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t *) buf);
+ buf += 8;
+ len -= 8;
+ }
+#endif
+ if (len >= 64)
+ {
+ const char *end = buf + len;
+ const char *limit = buf + len - 64;
+
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i *) buf),
+ y0;
+ __m128i x1 = _mm_loadu_si128((const __m128i *) (buf + 16)),
+ y1;
+ __m128i x2 = _mm_loadu_si128((const __m128i *) (buf + 32)),
+ y2;
+ __m128i x3 = _mm_loadu_si128((const __m128i *) (buf + 48)),
+ y3;
+ __m128i k;
+
+ k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 64;
+ /* Main loop. */
+ while (buf <= limit)
+ {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i *) buf)), x0 = _mm_xor_si128(x0, y0);
+ y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i *) (buf + 16))), x1 = _mm_xor_si128(x1, y1);
+ y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i *) (buf + 32))), x2 = _mm_xor_si128(x2, y2);
+ y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i *) (buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ buf += 64;
+ }
+
+ /* Reduce x0 ... x3 to just x0. */
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
+ y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
+ k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ len = end - buf;
+ }
+
+ return pg_comp_crc32c_sse42_inline(crc0, buf, len);
+}
+
+#endif
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 65dbc4d4249..abea0f90eb3 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -30,8 +30,12 @@
#include "port/pg_crc32c.h"
-static bool
-pg_crc32c_sse42_available(void)
+/*
+ * This gets called on the first call. It replaces the function pointer
+ * so that subsequent calls are routed directly to the chosen implementation.
+ */
+static pg_crc32c
+pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
{
unsigned int exx[4] = {0, 0, 0, 0};
@@ -43,18 +47,14 @@ pg_crc32c_sse42_available(void)
#error cpuid instruction not available
#endif
- return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
-}
-
-/*
- * This gets called on the first call. It replaces the function pointer
- * so that subsequent calls are routed directly to the chosen implementation.
- */
-static pg_crc32c
-pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
-{
- if (pg_crc32c_sse42_available())
+ if ((exx[2] & (1 << 20)) != 0) /* SSE 4.2 */
+ {
pg_comp_crc32c = pg_comp_crc32c_sse42;
+#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+ if ((exx[2] & (1 << 1)) != 0) /* PCLMUL */
+ pg_comp_crc32c = pg_comp_crc32c_pclmul;
+#endif
+ }
else
pg_comp_crc32c = pg_comp_crc32c_sb8;
--
2.48.1
v12-0003-Inline-CRC-computation-for-small-fixed-length-in.patch (text/x-patch; charset=US-ASCII)
From ebbd072d558574f78bd4489c3431a13fd831f254 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Fri, 28 Feb 2025 16:27:30 +0700
Subject: [PATCH v12 3/6] Inline CRC computation for small fixed-length input
---
src/include/port/pg_crc32c.h | 21 ++++++-
src/include/port/pg_crc32c_sse42_impl.h | 74 +++++++++++++++++++++++++
src/port/pg_crc32c_sse42.c | 46 +--------------
3 files changed, 96 insertions(+), 45 deletions(-)
create mode 100644 src/include/port/pg_crc32c_sse42_impl.h
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 65ebeacf4b1..5ccc79295c0 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -43,12 +43,31 @@ typedef uint32 pg_crc32c;
#if defined(USE_SSE42_CRC32C)
/* Use Intel SSE4.2 instructions. */
+
+#include "pg_crc32c_sse42_impl.h"
+
#define COMP_CRC32C(crc, data, len) \
- ((crc) = pg_comp_crc32c_sse42((crc), (data), (len)))
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+static inline
+pg_crc32c
+pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
+{
+ if (__builtin_constant_p(len) && len < 64)
+ {
+ /*
+ * For small constant inputs, inline the computation. This allows the
+ * compiler to unroll loops.
+ */
+ return pg_comp_crc32c_sse42_inline(crc, data, len);
+ }
+ else
+ return pg_comp_crc32c_sse42(crc, data, len);
+}
+
#elif defined(USE_ARMV8_CRC32C)
/* Use ARMv8 CRC Extension instructions. */
diff --git a/src/include/port/pg_crc32c_sse42_impl.h b/src/include/port/pg_crc32c_sse42_impl.h
new file mode 100644
index 00000000000..e10ad777618
--- /dev/null
+++ b/src/include/port/pg_crc32c_sse42_impl.h
@@ -0,0 +1,74 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_crc32c_sse42_impl.h
+ * Inline implementation of CRC computation using SSE 4.2
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/include/port/pg_crc32c_sse42_impl.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_CRC32C_SSE42_IMPL_H
+#define PG_CRC32C_SSE42_IMPL_H
+
+#include "c.h"
+
+#include <nmmintrin.h>
+
+pg_attribute_no_sanitize_alignment()
+pg_attribute_target("sse4.2")
+static inline
+pg_crc32c
+pg_comp_crc32c_sse42_inline(pg_crc32c crc, const void *data, size_t len)
+{
+ const unsigned char *p = data;
+ const unsigned char *pend = p + len;
+
+ /*
+ * Process eight bytes of data at a time.
+ *
+ * NB: We do unaligned accesses here. The Intel architecture allows that,
+ * and performance testing didn't show any performance gain from aligning
+ * the begin address.
+ */
+#ifdef __x86_64__
+ while (p + 8 <= pend)
+ {
+ crc = (uint32) _mm_crc32_u64(crc, *((const uint64 *) p));
+ p += 8;
+ }
+
+ /* Process remaining full four bytes if any */
+ if (p + 4 <= pend)
+ {
+ crc = _mm_crc32_u32(crc, *((const unsigned int *) p));
+ p += 4;
+ }
+#else
+
+ /*
+ * Process four bytes at a time. (The eight byte instruction is not
+ * available on the 32-bit x86 architecture).
+ */
+ while (p + 4 <= pend)
+ {
+ crc = _mm_crc32_u32(crc, *((const unsigned int *) p));
+ p += 4;
+ }
+#endif /* __x86_64__ */
+
+ /* Process any remaining bytes one at a time. */
+ while (p < pend)
+ {
+ crc = _mm_crc32_u8(crc, *p);
+ p++;
+ }
+
+ return crc;
+}
+
+#endif /* PG_CRC32C_SSE42_IMPL_H */
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df31..6a35f7fdc67 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -17,54 +17,12 @@
#include <nmmintrin.h>
#include "port/pg_crc32c.h"
+#include "port/pg_crc32c_sse42_impl.h"
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
pg_crc32c
pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
{
- const unsigned char *p = data;
- const unsigned char *pend = p + len;
-
- /*
- * Process eight bytes of data at a time.
- *
- * NB: We do unaligned accesses here. The Intel architecture allows that,
- * and performance testing didn't show any performance gain from aligning
- * the begin address.
- */
-#ifdef __x86_64__
- while (p + 8 <= pend)
- {
- crc = (uint32) _mm_crc32_u64(crc, *((const uint64 *) p));
- p += 8;
- }
-
- /* Process remaining full four bytes if any */
- if (p + 4 <= pend)
- {
- crc = _mm_crc32_u32(crc, *((const unsigned int *) p));
- p += 4;
- }
-#else
-
- /*
- * Process four bytes at a time. (The eight byte instruction is not
- * available on the 32-bit x86 architecture).
- */
- while (p + 4 <= pend)
- {
- crc = _mm_crc32_u32(crc, *((const unsigned int *) p));
- p += 4;
- }
-#endif /* __x86_64__ */
-
- /* Process any remaining bytes one at a time. */
- while (p < pend)
- {
- crc = _mm_crc32_u8(crc, *p);
- p++;
- }
-
- return crc;
+ return pg_comp_crc32c_sse42_inline(crc, data, len);
}
--
2.48.1
v12-0005-Use-runtime-check-even-when-we-have-SSE-4.2-at-c.patch (text/x-patch; charset=US-ASCII)
From e38654507d2efad4b5ad75548e0c388c3db9cfe5 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Fri, 28 Feb 2025 18:27:40 +0700
Subject: [PATCH v12 5/6] Use runtime check even when we have SSE 4.2 at
compile time
This allows us to use PCLMUL for longer inputs. Short inputs are
inlined to avoid the indirection through a function pointer.
---
configure | 2 +-
configure.ac | 2 +-
src/include/port/pg_crc32c.h | 15 +++++++++++----
src/port/meson.build | 1 +
src/port/pg_crc32c_sse42_choose.c | 2 ++
5 files changed, 16 insertions(+), 6 deletions(-)
diff --git a/configure b/configure
index 93fddd69981..91c0ffc8272 100755
--- a/configure
+++ b/configure
@@ -17684,7 +17684,7 @@ if test x"$USE_SSE42_CRC32C" = x"1"; then
$as_echo "#define USE_SSE42_CRC32C 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2" >&5
$as_echo "SSE 4.2" >&6; }
else
diff --git a/configure.ac b/configure.ac
index b6d02f5ecc7..a85bdbd4ff6 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2151,7 +2151,7 @@ fi
AC_MSG_CHECKING([which CRC-32C implementation to use])
if test x"$USE_SSE42_CRC32C" = x"1"; then
AC_DEFINE(USE_SSE42_CRC32C, 1, [Define to 1 use Intel SSE 4.2 CRC instructions.])
- PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
AC_MSG_RESULT(SSE 4.2)
else
if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index fe0e1b6b275..26b676dddc9 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -55,22 +55,29 @@ typedef uint32 pg_crc32c;
((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+extern pg_crc32c pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t len);
+#endif
static inline
pg_crc32c
pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
{
- if (__builtin_constant_p(len) && len < 64)
+ if (len < 64)
{
/*
- * For small constant inputs, inline the computation. This allows the
- * compiler to unroll loops.
+ * For small inputs, inline the computation to avoid the runtime
+ * check. This also allows the compiler to unroll loops for constant
+ * input.
*/
return pg_comp_crc32c_sse42_inline(crc, data, len);
}
else
- return pg_comp_crc32c_sse42(crc, data, len);
+ /* For larger inputs, use a runtime check for PCLMUL instructions. */
+ return pg_comp_crc32c(crc, data, len);
}
#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
diff --git a/src/port/meson.build b/src/port/meson.build
index 7fcfa728d43..8d70a4d510e 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -83,6 +83,7 @@ replace_funcs_pos = [
# x86/x64
['pg_crc32c_sse42', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
+ ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index abea0f90eb3..89a48c76894 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -55,8 +55,10 @@ pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
pg_comp_crc32c = pg_comp_crc32c_pclmul;
#endif
}
+#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
else
pg_comp_crc32c = pg_comp_crc32c_sb8;
+#endif
return pg_comp_crc32c(crc, data, len);
}
--
2.48.1
v12-0002-Attempt-to-make-benchmark-more-sensitive-to-late.patch (text/x-patch; charset=US-ASCII)
From 7a9b94677da30db0f8c296fe71037f65b157bc1c Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 5 Mar 2025 07:52:52 +0700
Subject: [PATCH v12 2/6] Attempt to make benchmark more sensitive to latency
---
contrib/test_crc32c/test_crc32c.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
index 28bc42de314..3e5ebad4e39 100644
--- a/contrib/test_crc32c/test_crc32c.c
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -21,13 +21,13 @@ drive_crc32c(PG_FUNCTION_ARGS)
{
int64 count = PG_GETARG_INT32(0);
int64 num = PG_GETARG_INT32(1);
- char* data = malloc((size_t)num);
- pg_crc32c crc;
+ char* data = malloc((size_t)num + 256);
+ pg_crc32c crc = 0;
pg_prng_state state;
uint64 seed = 42;
pg_prng_seed(&state, seed);
/* set random data */
- for (uint64 i = 0; i < num; i++)
+ for (uint64 i = 0; i < num + 256; i++)
{
data[i] = pg_prng_uint32(&state) % 255;
}
@@ -36,11 +36,15 @@ drive_crc32c(PG_FUNCTION_ARGS)
while(count--)
{
- INIT_CRC32C(crc);
- COMP_CRC32C(crc, data, num);
- FIN_CRC32C(crc);
+ size_t delta = crc & 7;
+
+ /* make both pointer and length unpredictable */
+ COMP_CRC32C(crc, data + delta, num + delta);
+ /* simulate WAL header */
+ COMP_CRC32C(crc, data + delta, 20);
}
+ FIN_CRC32C(crc);
free((void *)data);
PG_RETURN_INT64((int64_t)crc);
--
2.48.1
v12-0001-Add-a-Postgres-SQL-function-for-crc32c-benchmark.patch (text/x-patch; charset=US-ASCII)
From 2d8b2ad3e967231d1498953f6563b36b94977445 Mon Sep 17 00:00:00 2001
From: Paul Amonson <paul.d.amonson@intel.com>
Date: Mon, 6 May 2024 08:34:17 -0700
Subject: [PATCH v12 1/6] Add a Postgres SQL function for crc32c benchmarking
Add a drive_crc32c() function to use for benchmarking crc32c
computation. The function takes 2 arguments:
(1) count: num of times CRC32C is computed in a loop.
(2) num: #bytes in the buffer to calculate crc over.
XXX not for commit
Extracted from a patch by Raghuveer Devulapalli
---
contrib/meson.build | 1 +
contrib/test_crc32c/Makefile | 20 +++++++
contrib/test_crc32c/expected/test_crc32c.out | 57 ++++++++++++++++++++
contrib/test_crc32c/meson.build | 34 ++++++++++++
contrib/test_crc32c/sql/test_crc32c.sql | 3 ++
contrib/test_crc32c/test_crc32c--1.0.sql | 1 +
contrib/test_crc32c/test_crc32c.c | 47 ++++++++++++++++
contrib/test_crc32c/test_crc32c.control | 4 ++
8 files changed, 167 insertions(+)
create mode 100644 contrib/test_crc32c/Makefile
create mode 100644 contrib/test_crc32c/expected/test_crc32c.out
create mode 100644 contrib/test_crc32c/meson.build
create mode 100644 contrib/test_crc32c/sql/test_crc32c.sql
create mode 100644 contrib/test_crc32c/test_crc32c--1.0.sql
create mode 100644 contrib/test_crc32c/test_crc32c.c
create mode 100644 contrib/test_crc32c/test_crc32c.control
diff --git a/contrib/meson.build b/contrib/meson.build
index 1ba73ebd67a..06673db0625 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('test_crc32c')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/test_crc32c/Makefile b/contrib/test_crc32c/Makefile
new file mode 100644
index 00000000000..5b747c6184a
--- /dev/null
+++ b/contrib/test_crc32c/Makefile
@@ -0,0 +1,20 @@
+MODULE_big = test_crc32c
+OBJS = test_crc32c.o
+PGFILEDESC = "test"
+EXTENSION = test_crc32c
+DATA = test_crc32c--1.0.sql
+
+first: all
+
+# test_crc32c.o: CFLAGS+=-g
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/test_crc32c
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/test_crc32c/expected/test_crc32c.out b/contrib/test_crc32c/expected/test_crc32c.out
new file mode 100644
index 00000000000..dff6bb3133b
--- /dev/null
+++ b/contrib/test_crc32c/expected/test_crc32c.out
@@ -0,0 +1,57 @@
+CREATE EXTENSION test_crc32c;
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
+ drive_crc32c
+--------------
+ 532139994
+ 2103623867
+ 785984197
+ 2686825890
+ 3213049059
+ 3819630168
+ 1389234603
+ 534072900
+ 2930108140
+ 2496889855
+ 1475239611
+ 136366931
+ 3067402116
+ 2012717871
+ 3682416023
+ 2054270645
+ 1817339875
+ 4100939569
+ 1192727539
+ 3636976218
+ 369764421
+ 3161609879
+ 1067984880
+ 1235066769
+ 3138425899
+ 648132037
+ 4203750233
+ 1330187888
+ 2683521348
+ 1951644495
+ 2574090107
+ 3904902018
+ 3772697795
+ 1644686344
+ 2868962106
+ 3369218491
+ 3902689890
+ 3456411865
+ 141004025
+ 1504497996
+ 3782655204
+ 3544797610
+ 3429174879
+ 2524728016
+ 3935861181
+ 25498897
+ 692684159
+ 345705535
+ 2761600287
+ 2654632420
+ 3945991399
+(51 rows)
+
diff --git a/contrib/test_crc32c/meson.build b/contrib/test_crc32c/meson.build
new file mode 100644
index 00000000000..d7bec4ba1cb
--- /dev/null
+++ b/contrib/test_crc32c/meson.build
@@ -0,0 +1,34 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+test_crc32c_sources = files(
+ 'test_crc32c.c',
+)
+
+if host_system == 'windows'
+ test_crc32c_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_crc32c',
+ '--FILEDESC', 'test_crc32c - test code for crc32c library',])
+endif
+
+test_crc32c = shared_module('test_crc32c',
+ test_crc32c_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += test_crc32c
+
+install_data(
+ 'test_crc32c.control',
+ 'test_crc32c--1.0.sql',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'test_crc32c',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'test_crc32c',
+ ],
+ },
+}
diff --git a/contrib/test_crc32c/sql/test_crc32c.sql b/contrib/test_crc32c/sql/test_crc32c.sql
new file mode 100644
index 00000000000..95c6dfe4488
--- /dev/null
+++ b/contrib/test_crc32c/sql/test_crc32c.sql
@@ -0,0 +1,3 @@
+CREATE EXTENSION test_crc32c;
+
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
diff --git a/contrib/test_crc32c/test_crc32c--1.0.sql b/contrib/test_crc32c/test_crc32c--1.0.sql
new file mode 100644
index 00000000000..52b9772f908
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c--1.0.sql
@@ -0,0 +1 @@
+CREATE FUNCTION drive_crc32c (count int, num int) RETURNS bigint AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
new file mode 100644
index 00000000000..28bc42de314
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -0,0 +1,47 @@
+/* select drive_crc32c(1000000, 1024); */
+
+#include "postgres.h"
+#include "fmgr.h"
+#include "port/pg_crc32c.h"
+#include "common/pg_prng.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * drive_crc32c(count: int, num: int) returns bigint
+ *
+ * count is the number of loops to perform
+ *
+ * num is the number of bytes in the buffer to calculate
+ * crc32c over.
+ */
+PG_FUNCTION_INFO_V1(drive_crc32c);
+Datum
+drive_crc32c(PG_FUNCTION_ARGS)
+{
+ int64 count = PG_GETARG_INT32(0);
+ int64 num = PG_GETARG_INT32(1);
+ char* data = malloc((size_t)num);
+ pg_crc32c crc;
+ pg_prng_state state;
+ uint64 seed = 42;
+ pg_prng_seed(&state, seed);
+ /* set random data */
+ for (uint64 i = 0; i < num; i++)
+ {
+ data[i] = pg_prng_uint32(&state) % 255;
+ }
+
+ INIT_CRC32C(crc);
+
+ while(count--)
+ {
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, data, num);
+ FIN_CRC32C(crc);
+ }
+
+ free((void *)data);
+
+ PG_RETURN_INT64((int64_t)crc);
+}
diff --git a/contrib/test_crc32c/test_crc32c.control b/contrib/test_crc32c/test_crc32c.control
new file mode 100644
index 00000000000..878a077ee18
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.control
@@ -0,0 +1,4 @@
+comment = 'test'
+default_version = '1.0'
+module_pathname = '$libdir/test_crc32c'
+relocatable = true
--
2.48.1
On Tue, Mar 4, 2025 at 2:11 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
Overall, I wish we could avoid splitting things into separate files and
adding more header file gymnastics, but maybe there isn't much better we
can do without overhauling the CPU feature detection code.
I wanted to make an attempt to make this aspect nicer. v13-0002
incorporates deliberately compact and simple loops for inlined
constant input into the dispatch function, and leaves the existing
code alone. This avoids code churn and saves vertical space in the
copied code. It needs a bit more commentary, but I hope this is a more
digestible prerequisite to the CLMUL algorithm -- as a reminder, it'll
be simpler if we can always assume non-constant input can go through a
function pointer.
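As a rough, self-contained illustration of the dispatch idea (not the patch itself): the sketch below uses a bitwise CRC-32C as a stand-in for the SSE 4.2 instructions so that it compiles anywhere. The names crc32c_step, crc32c_out_of_line and crc32c_dispatch are made up for this example, and it assumes a GCC/Clang-style __builtin_constant_p; with other compilers it always takes the out-of-line path.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Advance the CRC by one byte, bit by bit (reflected polynomial 0x82F63B78).
 * Hypothetical stand-in for _mm_crc32_u8().
 */
static inline uint32_t
crc32c_step(uint32_t crc, unsigned char b)
{
    crc ^= b;
    for (int k = 0; k < 8; k++)
        crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    return crc;
}

/* Out-of-line path; in the real code this is where a function pointer could lead. */
static uint32_t
crc32c_out_of_line(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len-- > 0)
        crc = crc32c_step(crc, *p++);
    return crc;
}

static inline uint32_t
crc32c_dispatch(uint32_t crc, const void *data, size_t len)
{
#if defined(__GNUC__) || defined(__clang__)
    if (__builtin_constant_p(len))
    {
        /* Length known at compile time: keep the loop inline so it can be unrolled. */
        const unsigned char *p = data;

        for (; len > 0; len--)
            crc = crc32c_step(crc, *p++);
        return crc;
    }
#endif
    /* Length only known at run time: take the function call. */
    return crc32c_out_of_line(crc, data, len);
}
```

Both paths compute the same value. Starting from 0xFFFFFFFF and inverting at the end, the nine bytes "123456789" give 0xE3069283, the usual CRC-32C check value. Without optimization the builtin typically evaluates to 0, so everything goes through the out-of-line call.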
I've re-attached the modified perf test from v12 just in case anyone
wants to play with it (v13-0003), but named so that the CF bot can't
find it, since it breaks the tests in the original perf test (it's not
for commit anyway).
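For anyone who wants to see the idea of that perf test outside of Postgres, here is a minimal sketch under some assumptions: crc32c_update is a bitwise stand-in for COMP_CRC32C, and chained_crc32c is a name invented here. The point is only that the pointer offset and the length of each iteration depend on the previous CRC, so iterations cannot overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C update; hypothetical stand-in for COMP_CRC32C(). */
static uint32_t
crc32c_update(uint32_t crc, const unsigned char *p, size_t len)
{
    while (len-- > 0)
    {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/*
 * Chain 'count' iterations so that each one depends on the previous CRC.
 * The caller must provide at least num + 14 bytes (and at least 27),
 * since delta can be as large as 7.
 */
static uint32_t
chained_crc32c(const unsigned char *data, size_t num, int count)
{
    uint32_t crc = 0xFFFFFFFFu;     /* as INIT_CRC32C() would set it */

    while (count-- > 0)
    {
        /* Not known until the previous CRC has been computed. */
        size_t delta = crc & 7;

        /* Variable pointer and length. */
        crc = crc32c_update(crc, data + delta, num + delta);
        /* Short fixed-size input, in the spirit of a WAL header. */
        crc = crc32c_update(crc, data + delta, 20);
    }
    return crc ^ 0xFFFFFFFFu;       /* as FIN_CRC32C() would do */
}
```

The benchmark patch over-allocates by 256 bytes, which comfortably covers the extra 14 bytes this loop can touch.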
Adding back AVX-512 should be fairly mechanical, since Raghuveer and
Nathan have already done the work needed for that.
--
John Naylor
Amazon Web Services
Attachments:
v13-0003-Attempt-to-make-benchmark-more-sensitive-to-late.patch.nocfbotapplication/octet-stream; name=v13-0003-Attempt-to-make-benchmark-more-sensitive-to-late.patch.nocfbotDownload
From 65814a642b82a1f313eae2e3aada785af1f65be4 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 5 Mar 2025 07:52:52 +0700
Subject: [PATCH v13 3/3] Attempt to make benchmark more sensitive to latency
---
contrib/test_crc32c/test_crc32c.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
index 28bc42de314..3e5ebad4e39 100644
--- a/contrib/test_crc32c/test_crc32c.c
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -21,13 +21,13 @@ drive_crc32c(PG_FUNCTION_ARGS)
{
int64 count = PG_GETARG_INT32(0);
int64 num = PG_GETARG_INT32(1);
- char* data = malloc((size_t)num);
- pg_crc32c crc;
+ char* data = malloc((size_t)num + 256);
+ pg_crc32c crc = 0;
pg_prng_state state;
uint64 seed = 42;
pg_prng_seed(&state, seed);
/* set random data */
- for (uint64 i = 0; i < num; i++)
+ for (uint64 i = 0; i < num + 256; i++)
{
data[i] = pg_prng_uint32(&state) % 255;
}
@@ -36,11 +36,15 @@ drive_crc32c(PG_FUNCTION_ARGS)
while(count--)
{
- INIT_CRC32C(crc);
- COMP_CRC32C(crc, data, num);
- FIN_CRC32C(crc);
+ size_t delta = crc & 7;
+
+ /* make both pointer and length unpredictable */
+ COMP_CRC32C(crc, data + delta, num + delta);
+ /* simulate WAL header */
+ COMP_CRC32C(crc, data + delta, 20);
}
+ FIN_CRC32C(crc);
free((void *)data);
PG_RETURN_INT64((int64_t)crc);
--
2.48.1
v13-0001-Add-a-Postgres-SQL-function-for-crc32c-benchmark.patchtext/x-patch; charset=US-ASCII; name=v13-0001-Add-a-Postgres-SQL-function-for-crc32c-benchmark.patchDownload
From 298cbb26558807fe9ce87f3709d6bfb785668d5d Mon Sep 17 00:00:00 2001
From: Paul Amonson <paul.d.amonson@intel.com>
Date: Mon, 6 May 2024 08:34:17 -0700
Subject: [PATCH v13 1/3] Add a Postgres SQL function for crc32c benchmarking
Add a drive_crc32c() function to use for benchmarking crc32c
computation. The function takes 2 arguments:
(1) count: num of times CRC32C is computed in a loop.
(2) num: #bytes in the buffer to calculate crc over.
XXX not for commit
Extracted from a patch by Raghuveer Devulapalli
---
contrib/meson.build | 1 +
contrib/test_crc32c/Makefile | 20 +++++++
contrib/test_crc32c/expected/test_crc32c.out | 57 ++++++++++++++++++++
contrib/test_crc32c/meson.build | 34 ++++++++++++
contrib/test_crc32c/sql/test_crc32c.sql | 3 ++
contrib/test_crc32c/test_crc32c--1.0.sql | 1 +
contrib/test_crc32c/test_crc32c.c | 47 ++++++++++++++++
contrib/test_crc32c/test_crc32c.control | 4 ++
8 files changed, 167 insertions(+)
create mode 100644 contrib/test_crc32c/Makefile
create mode 100644 contrib/test_crc32c/expected/test_crc32c.out
create mode 100644 contrib/test_crc32c/meson.build
create mode 100644 contrib/test_crc32c/sql/test_crc32c.sql
create mode 100644 contrib/test_crc32c/test_crc32c--1.0.sql
create mode 100644 contrib/test_crc32c/test_crc32c.c
create mode 100644 contrib/test_crc32c/test_crc32c.control
diff --git a/contrib/meson.build b/contrib/meson.build
index 1ba73ebd67a..06673db0625 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('test_crc32c')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/test_crc32c/Makefile b/contrib/test_crc32c/Makefile
new file mode 100644
index 00000000000..5b747c6184a
--- /dev/null
+++ b/contrib/test_crc32c/Makefile
@@ -0,0 +1,20 @@
+MODULE_big = test_crc32c
+OBJS = test_crc32c.o
+PGFILEDESC = "test"
+EXTENSION = test_crc32c
+DATA = test_crc32c--1.0.sql
+
+first: all
+
+# test_crc32c.o: CFLAGS+=-g
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/test_crc32c
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/test_crc32c/expected/test_crc32c.out b/contrib/test_crc32c/expected/test_crc32c.out
new file mode 100644
index 00000000000..dff6bb3133b
--- /dev/null
+++ b/contrib/test_crc32c/expected/test_crc32c.out
@@ -0,0 +1,57 @@
+CREATE EXTENSION test_crc32c;
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
+ drive_crc32c
+--------------
+ 532139994
+ 2103623867
+ 785984197
+ 2686825890
+ 3213049059
+ 3819630168
+ 1389234603
+ 534072900
+ 2930108140
+ 2496889855
+ 1475239611
+ 136366931
+ 3067402116
+ 2012717871
+ 3682416023
+ 2054270645
+ 1817339875
+ 4100939569
+ 1192727539
+ 3636976218
+ 369764421
+ 3161609879
+ 1067984880
+ 1235066769
+ 3138425899
+ 648132037
+ 4203750233
+ 1330187888
+ 2683521348
+ 1951644495
+ 2574090107
+ 3904902018
+ 3772697795
+ 1644686344
+ 2868962106
+ 3369218491
+ 3902689890
+ 3456411865
+ 141004025
+ 1504497996
+ 3782655204
+ 3544797610
+ 3429174879
+ 2524728016
+ 3935861181
+ 25498897
+ 692684159
+ 345705535
+ 2761600287
+ 2654632420
+ 3945991399
+(51 rows)
+
diff --git a/contrib/test_crc32c/meson.build b/contrib/test_crc32c/meson.build
new file mode 100644
index 00000000000..d7bec4ba1cb
--- /dev/null
+++ b/contrib/test_crc32c/meson.build
@@ -0,0 +1,34 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+test_crc32c_sources = files(
+ 'test_crc32c.c',
+)
+
+if host_system == 'windows'
+ test_crc32c_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_crc32c',
+ '--FILEDESC', 'test_crc32c - test code for crc32c library',])
+endif
+
+test_crc32c = shared_module('test_crc32c',
+ test_crc32c_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += test_crc32c
+
+install_data(
+ 'test_crc32c.control',
+ 'test_crc32c--1.0.sql',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'test_crc32c',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'test_crc32c',
+ ],
+ },
+}
diff --git a/contrib/test_crc32c/sql/test_crc32c.sql b/contrib/test_crc32c/sql/test_crc32c.sql
new file mode 100644
index 00000000000..95c6dfe4488
--- /dev/null
+++ b/contrib/test_crc32c/sql/test_crc32c.sql
@@ -0,0 +1,3 @@
+CREATE EXTENSION test_crc32c;
+
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
diff --git a/contrib/test_crc32c/test_crc32c--1.0.sql b/contrib/test_crc32c/test_crc32c--1.0.sql
new file mode 100644
index 00000000000..52b9772f908
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c--1.0.sql
@@ -0,0 +1 @@
+CREATE FUNCTION drive_crc32c (count int, num int) RETURNS bigint AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
new file mode 100644
index 00000000000..28bc42de314
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -0,0 +1,47 @@
+/* select drive_crc32c(1000000, 1024); */
+
+#include "postgres.h"
+#include "fmgr.h"
+#include "port/pg_crc32c.h"
+#include "common/pg_prng.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * drive_crc32c(count: int, num: int) returns bigint
+ *
+ * count is the number of loops to perform
+ *
+ * num is the number of bytes in the buffer to calculate
+ * crc32c over.
+ */
+PG_FUNCTION_INFO_V1(drive_crc32c);
+Datum
+drive_crc32c(PG_FUNCTION_ARGS)
+{
+ int64 count = PG_GETARG_INT32(0);
+ int64 num = PG_GETARG_INT32(1);
+ char* data = malloc((size_t)num);
+ pg_crc32c crc;
+ pg_prng_state state;
+ uint64 seed = 42;
+ pg_prng_seed(&state, seed);
+ /* set random data */
+ for (uint64 i = 0; i < num; i++)
+ {
+ data[i] = pg_prng_uint32(&state) % 255;
+ }
+
+ INIT_CRC32C(crc);
+
+ while(count--)
+ {
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, data, num);
+ FIN_CRC32C(crc);
+ }
+
+ free((void *)data);
+
+ PG_RETURN_INT64((int64_t)crc);
+}
diff --git a/contrib/test_crc32c/test_crc32c.control b/contrib/test_crc32c/test_crc32c.control
new file mode 100644
index 00000000000..878a077ee18
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.control
@@ -0,0 +1,4 @@
+comment = 'test'
+default_version = '1.0'
+module_pathname = '$libdir/test_crc32c'
+relocatable = true
--
2.48.1
v13-0002-Inline-CRC-computation-for-fixed-length-input.patchtext/x-patch; charset=US-ASCII; name=v13-0002-Inline-CRC-computation-for-fixed-length-input.patchDownload
From e5f721d017e4431a32e2605639afea64bf11f52e Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Fri, 28 Feb 2025 16:27:30 +0700
Subject: [PATCH v13 2/3] Inline CRC computation for fixed-length input
Use a simplified copy of the loop in pg_crc32c_sse42.c to avoid
moving code to a separate header.
---
src/include/port/pg_crc32c.h | 33 ++++++++++++++++++++++++++++++++-
1 file changed, 32 insertions(+), 1 deletion(-)
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 65ebeacf4b1..b9f0a8c7cca 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -43,12 +43,43 @@ typedef uint32 pg_crc32c;
#if defined(USE_SSE42_CRC32C)
/* Use Intel SSE4.2 instructions. */
+
+#include <nmmintrin.h>
+
#define COMP_CRC32C(crc, data, len) \
- ((crc) = pg_comp_crc32c_sse42((crc), (data), (len)))
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+pg_attribute_no_sanitize_alignment()
+static inline
+pg_crc32c
+pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
+{
+ if (__builtin_constant_p(len))
+ {
+ const unsigned char *p = data;
+
+ /*
+ * For constant inputs, inline the computation to avoid the
+ * indirect function call. This also allows the compiler to unroll
+ * loops for small inputs.
+ */
+#if SIZEOF_VOID_P >= 8
+ for (; len >= 8; p += 8, len -= 8)
+ crc = _mm_crc32_u64(crc, *(const uint64 *) p);
+#endif
+ for (; len >= 4; p += 4, len -= 4)
+ crc = _mm_crc32_u32(crc, *(const uint32 *) p);
+ for (; len > 0; --len)
+ crc = _mm_crc32_u8(crc, *p++);
+ return crc;
+ }
+ else
+ return pg_comp_crc32c_sse42(crc, data, len);
+}
+
#elif defined(USE_ARMV8_CRC32C)
/* Use ARMv8 CRC Extension instructions. */
--
2.48.1
On Mon, Mar 10, 2025 at 03:48:31PM +0700, John Naylor wrote:
On Tue, Mar 4, 2025 at 2:11 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
Overall, I wish we could avoid splitting things into separate files and
adding more header file gymnastics, but maybe there isn't much better we
can do without overhauling the CPU feature detection code.
I wanted to make an attempt to make this aspect nicer. v13-0002
incorporates deliberately compact and simple loops for inlined
constant input into the dispatch function, and leaves the existing
code alone. This avoids code churn and saves vertical space in the
copied code. It needs a bit more commentary, but I hope this is a more
digestible prerequisite to the CLMUL algorithm -- as a reminder, it'll
be simpler if we can always assume non-constant input can go through a
function pointer.
That is certainly more readable. FWIW I think it would be entirely
reasonable to replace the pg_crc32c_sse42.c implementation with a call to
this new pg_comp_crc32c_dispatch() function. Of course, you'd have to
split things up like:
pg_attribute_no_sanitize_alignment()
static inline
pg_crc32c
pg_comp_crc32c_sse42_inline(pg_crc32c crc, const void *data, size_t len)
{
const unsigned char *p = data;
#ifdef __x86_64__
for (; len >= 8; p += 8, len -= 8)
crc = _mm_crc32_u64(crc, *(const uint64 *) p);
#endif
for (; len >= 4; p += 4, len -= 4)
crc = _mm_crc32_u32(crc, *(const uint32 *) p);
for (; len > 0; len--)
crc = _mm_crc32_u8(crc, *p++);
return crc;
}
pg_attribute_no_sanitize_alignment()
static inline
pg_crc32c
pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
{
if (__builtin_constant_p(len))
return pg_comp_crc32c_sse42_inline(crc, data, len);
return pg_comp_crc32c_sse42(crc, data, len);
}
And then in pg_crc32c_sse42.c:
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
pg_crc32c
pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
{
return pg_comp_crc32c_sse42_inline(crc, data, len);
}
Granted, that adds back some of the complexity you were trying to avoid
(and is very similar to your v12 patches), but it's pretty compact and
avoids some code duplication. FTR I don't feel too strongly about this,
but on balance I guess I'd be okay with a tad more complexity than your
v13 patches if it served a useful purpose.
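To make the shape of that split concrete, here is a compilable sketch with the SSE 4.2 intrinsics swapped for a portable byte-wise helper. sw_crc32c_u8 and sw_crc32c_bytes are made-up stand-ins, and the crc32c_shared* names are invented for this example, so this shows only the structure (one inline body, two entry points), not the speed.

```c
#include <stddef.h>
#include <stdint.h>

/* One byte of CRC-32C, bit by bit (reflected polynomial 0x82F63B78). */
static inline uint32_t
sw_crc32c_u8(uint32_t crc, unsigned char b)
{
    crc ^= b;
    for (int k = 0; k < 8; k++)
        crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    return crc;
}

/* Stand-in for _mm_crc32_u32()/_mm_crc32_u64(): consume n bytes in memory order. */
static inline uint32_t
sw_crc32c_bytes(uint32_t crc, const unsigned char *p, size_t n)
{
    while (n-- > 0)
        crc = sw_crc32c_u8(crc, *p++);
    return crc;
}

/* The single shared body, with the same 8/4/1 chunking as the proposal. */
static inline uint32_t
crc32c_shared_inline(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    for (; len >= 8; p += 8, len -= 8)
        crc = sw_crc32c_bytes(crc, p, 8);
    for (; len >= 4; p += 4, len -= 4)
        crc = sw_crc32c_bytes(crc, p, 4);
    for (; len > 0; len--)
        crc = sw_crc32c_u8(crc, *p++);
    return crc;
}

/* Out-of-line entry point, as it would be declared in a header. */
uint32_t crc32c_shared(uint32_t crc, const void *data, size_t len);

uint32_t
crc32c_shared(uint32_t crc, const void *data, size_t len)
{
    return crc32c_shared_inline(crc, data, len);
}

/* Constant lengths stay inline; everything else takes the function call. */
static inline uint32_t
crc32c_shared_dispatch(uint32_t crc, const void *data, size_t len)
{
#if defined(__GNUC__) || defined(__clang__)
    if (__builtin_constant_p(len))
        return crc32c_shared_inline(crc, data, len);
#endif
    return crc32c_shared(crc, data, len);
}
```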
--
nathan
On Tue, Mar 11, 2025 at 4:47 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Mar 10, 2025 at 03:48:31PM +0700, John Naylor wrote:
On Tue, Mar 4, 2025 at 2:11 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
Overall, I wish we could avoid splitting things into separate files and
adding more header file gymnastics, but maybe there isn't much better we
can do without overhauling the CPU feature detection code.
I wanted to make an attempt to make this aspect nicer. v13-0002
incorporates deliberately compact and simple loops for inlined
constant input into the dispatch function, and leaves the existing
code alone. This avoids code churn and saves vertical space in the
copied code. It needs a bit more commentary, but I hope this is a more
digestible prerequisite to the CLMUL algorithm -- as a reminder, it'll
be simpler if we can always assume non-constant input can go through a
function pointer.
That is certainly more readable. FWIW I think it would be entirely
reasonable to replace the pg_crc32c_sse42.c implementation with a call to
this new pg_comp_crc32c_dispatch() function. Of course, you'd have to
split things up like:
[snip]
That could work as well. I'm thinking if we do PMULL on Arm, it might
be advantageous to keep the inline path and function paths with
distinct coding -- because of the pickier alignment on that platform,
it might not be worth pre-aligning the pointer to 8 bytes for a
20-byte constant input.
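A quick bit of arithmetic behind that hunch, as a sketch (crc_split and split_prealigned are names invented here, and the address is passed as a plain integer so the numbers are reproducible). If the pointer is first advanced to an 8-byte boundary, a 20-byte input starting at an address ending in 3 becomes 5 head bytes, a single 8-byte word and 7 tail bytes. Without pre-alignment the same input is two 8-byte words plus one 4-byte word.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    size_t head;    /* bytes consumed before reaching 8-byte alignment */
    size_t words;   /* aligned 8-byte words */
    size_t tail;    /* leftover bytes after the last full word */
} crc_split;

/* Describe how an input of 'len' bytes at 'addr' splits up when pre-aligned. */
static crc_split
split_prealigned(uintptr_t addr, size_t len)
{
    crc_split s;

    /* Distance to the next multiple of 8; zero if already aligned. */
    s.head = (size_t) ((8 - (addr & 7)) & 7);
    if (s.head > len)
        s.head = len;
    s.words = (len - s.head) / 8;
    s.tail = (len - s.head) % 8;
    return s;
}
```

For example, split_prealigned(3, 20) yields head 5, words 1, tail 7, while an already aligned start such as split_prealigned(8, 20) yields head 0, words 2, tail 4.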
I've gone ahead and added the generated AVX-512 algorithm in v14-0005,
and added the build support and some of the runtime support from Paul
and Raghuveer's earlier patches in 0006-7. It passes CI, but I'll have
to arrange access to other hardware to verify the runtime behavior. I
think the Meson support is most of the way there, but it looks like
configure.ac got whacked around cosmetically quite a bit. If we feel
it's time to refactor things there, we'll want to split that out. In
any case, for autoconf I've pretty much kept the earlier work for now.
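For readers unfamiliar with the runtime-check pattern that 0004 leans on, here is a minimal standalone sketch. All names are hypothetical: have_fast_crc() is a hard-coded placeholder for a CPUID-based probe, and crc32c_fast merely defers to the portable code instead of using PCLMUL or AVX-512. The function pointer starts out aimed at a chooser, which picks an implementation on the first call and overwrites the pointer.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*crc32c_fn) (uint32_t crc, const void *data, size_t len);

static uint32_t crc32c_choose(uint32_t crc, const void *data, size_t len);

/* Starts out pointing at the chooser; replaced on first use. */
static crc32c_fn crc32c_impl = crc32c_choose;

/* Portable bitwise CRC-32C (reflected polynomial 0x82F63B78). */
static uint32_t
crc32c_portable(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len-- > 0)
    {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Placeholder for an accelerated routine; here it just reuses the portable code. */
static uint32_t
crc32c_fast(uint32_t crc, const void *data, size_t len)
{
    return crc32c_portable(crc, data, len);
}

/* Placeholder for a CPUID-based probe; hard-coded for this sketch. */
static int
have_fast_crc(void)
{
    return 0;
}

/* First call lands here, installs the real implementation, then forwards. */
static uint32_t
crc32c_choose(uint32_t crc, const void *data, size_t len)
{
    crc32c_impl = have_fast_crc() ? crc32c_fast : crc32c_portable;
    return crc32c_impl(crc, data, len);
}
```

Callers always go through crc32c_impl(crc, data, len); only the very first call pays for the probe.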
--
John Naylor
Amazon Web Services
Attachments:
v14-0004-Always-do-runtime-check-for-x86-to-simplify-PCLM.patchtext/x-patch; charset=US-ASCII; name=v14-0004-Always-do-runtime-check-for-x86-to-simplify-PCLM.patchDownload
From adfe02a6d169be865937b567bc1b2b2ffde60631 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Tue, 11 Mar 2025 11:20:20 +0700
Subject: [PATCH v14 4/8] Always do runtime check for x86 to simplify PCLMUL
---
configure | 2 +-
configure.ac | 2 +-
src/include/port/pg_crc32c.h | 20 ++++++++++++++------
src/port/meson.build | 1 +
src/port/pg_crc32c_sse42.c | 2 +-
src/port/pg_crc32c_sse42_choose.c | 2 ++
6 files changed, 20 insertions(+), 9 deletions(-)
diff --git a/configure b/configure
index 93fddd69981..91c0ffc8272 100755
--- a/configure
+++ b/configure
@@ -17684,7 +17684,7 @@ if test x"$USE_SSE42_CRC32C" = x"1"; then
$as_echo "#define USE_SSE42_CRC32C 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2" >&5
$as_echo "SSE 4.2" >&6; }
else
diff --git a/configure.ac b/configure.ac
index b6d02f5ecc7..a85bdbd4ff6 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2151,7 +2151,7 @@ fi
AC_MSG_CHECKING([which CRC-32C implementation to use])
if test x"$USE_SSE42_CRC32C" = x"1"; then
AC_DEFINE(USE_SSE42_CRC32C, 1, [Define to 1 use Intel SSE 4.2 CRC instructions.])
- PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
AC_MSG_RESULT(SSE 4.2)
else
if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 229f4f6a65a..28253b48018 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -47,7 +47,10 @@ typedef uint32 pg_crc32c;
#define EQ_CRC32C(c1, c2) ((c1) == (c2))
#if defined(USE_SSE42_CRC32C)
-/* Use Intel SSE4.2 instructions. */
+/*
+ * Use either Intel SSE 4.2 or PCLMUL instructions. We don't need a runtime check
+ * for SSE 4.2, so we can inline those in some cases.
+ */
#include <nmmintrin.h>
@@ -55,7 +58,11 @@ typedef uint32 pg_crc32c;
((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+extern pg_crc32c pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t len);
+#endif
pg_attribute_no_sanitize_alignment()
static inline
@@ -67,9 +74,9 @@ pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
const unsigned char *p = data;
/*
- * For constant inputs, inline the computation to avoid the
- * indirect function call. This also allows the compiler to unroll
- * loops for small inputs.
+ * For constant inputs, inline the computation to avoid the indirect
+ * function call. This also allows the compiler to unroll loops for
+ * small inputs.
*/
#if SIZEOF_VOID_P >= 8
for (; len >= 8; p += 8, len -= 8)
@@ -82,7 +89,8 @@ pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
return crc;
}
else
- return pg_comp_crc32c_sse42(crc, data, len);
+ /* Otherwise, use a runtime check for PCLMUL instructions. */
+ return pg_comp_crc32c(crc, data, len);
}
#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
@@ -123,7 +131,7 @@ extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_
#elif defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
/*
- * Use Intel SSE 4.2 or ARMv8 instructions, but perform a runtime check first
+ * Use ARMv8 instructions, but perform a runtime check first
* to check that they are available.
*/
#define COMP_CRC32C(crc, data, len) \
diff --git a/src/port/meson.build b/src/port/meson.build
index 7fcfa728d43..8d70a4d510e 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -83,6 +83,7 @@ replace_funcs_pos = [
# x86/x64
['pg_crc32c_sse42', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
+ ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 2001e69850b..c57d6c6293b 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -152,7 +152,7 @@ pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
len = end - buf;
}
- return pg_comp_crc32c_sse42_inline(crc0, buf, len);
+ return pg_comp_crc32c_sse42(crc0, buf, len);
}
#endif
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index abea0f90eb3..89a48c76894 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -55,8 +55,10 @@ pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
pg_comp_crc32c = pg_comp_crc32c_pclmul;
#endif
}
+#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
else
pg_comp_crc32c = pg_comp_crc32c_sb8;
+#endif
return pg_comp_crc32c(crc, data, len);
}
--
2.48.1
v14-0008-Temp-fixup-build-of-benchmark-on-Windows.patchtext/x-patch; charset=US-ASCII; name=v14-0008-Temp-fixup-build-of-benchmark-on-Windows.patchDownload
From 61fe19c116f2593757eedd391ca5e9e80c543aee Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Tue, 11 Mar 2025 16:58:10 +0700
Subject: [PATCH v14 8/8] Temp fixup: build of benchmark on Windows
---
src/include/port/pg_crc32c.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index a45f56a9405..ee3245c2042 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -99,7 +99,7 @@ pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
-extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
+extern PGDLLIMPORT pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
extern pg_crc32c pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t len);
--
2.48.1
v14-0007-AVX-512-CRC-autoconf.patchtext/x-patch; charset=US-ASCII; name=v14-0007-AVX-512-CRC-autoconf.patchDownload
From 5c07fe6c3ecacf2cbcc2c3e081a1bfb2a2fc259b Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Tue, 11 Mar 2025 14:57:01 +0700
Subject: [PATCH v14 7/8] AVX-512 CRC / autoconf
Author: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Author: Paul Amonson <paul.d.amonson@intel.com>
---
config/c-compiler.m4 | 30 +++++++++
configure | 151 ++++++++++++++++++++++++++-----------------
configure.ac | 104 +++++++++++++----------------
3 files changed, 164 insertions(+), 121 deletions(-)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 8534cc54c13..f172f260e4e 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -577,6 +577,36 @@ fi
undefine([Ac_cachevar])dnl
])# PGAC_SSE42_CRC32_INTRINSICS
+# PGAC_AVX512_CRC32_INTRINSICS
+# ---------------------------
+# Check if the compiler supports the x86 CRC instructions added in AVX-512,
+# using intrinsics with function __attribute__((target("..."))):
+
+AC_DEFUN([PGAC_AVX512_CRC32_INTRINSICS],
+[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx512_crc32_intrinsics])])dnl
+AC_CACHE_CHECK([for _mm512_clmulepi64_epi128 with function attribute], [Ac_cachevar],
+[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>
+ #include <stdint.h>
+ #if defined(__has_attribute) && __has_attribute (target)
+ __attribute__((target("avx512vl,vpclmulqdq")))
+ #endif
+ static int crc32_avx512_test(void)
+ {
+ __m512i x0 = _mm512_set1_epi32(0x1);
+ __m512i x1 = _mm512_set1_epi32(0x2);
+ __m512i x2 = _mm512_clmulepi64_epi128(x1, x0, 0x00); // vpclmulqdq
+ __m128i a1 = _mm_xor_epi64(_mm512_castsi512_si128(x1), _mm512_castsi512_si128(x0)); //avx512vl
+ int64_t val = _mm_crc32_u64(0, _mm_extract_epi64(a1, 0)); // 64-bit instruction
+ return (uint32_t)_mm_crc32_u64(val, _mm_extract_epi64(a1, 1));
+ }],
+ [return crc32_avx512_test();])],
+ [Ac_cachevar=yes],
+ [Ac_cachevar=no])])
+if test x"$Ac_cachevar" = x"yes"; then
+ pgac_avx512_crc32_intrinsics=yes
+fi
+undefine([Ac_cachevar])dnl
+])# PGAC_AVX512_CRC32_INTRINSICS
# PGAC_ARMV8_CRC32C_INTRINSICS
# ----------------------------
diff --git a/configure b/configure
index 91c0ffc8272..a7c3d56f9f2 100755
--- a/configure
+++ b/configure
@@ -17381,7 +17381,7 @@ $as_echo "#define USE_AVX512_POPCNT_WITH_RUNTIME_CHECK 1" >>confdefs.h
fi
fi
-# Check for Intel SSE 4.2 intrinsics to do CRC calculations.
+# Check for Intel SSE 4.2 and AVX-512 intrinsics to do CRC calculations.
#
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for _mm_crc32_u8 and _mm_crc32_u32" >&5
$as_echo_n "checking for _mm_crc32_u8 and _mm_crc32_u32... " >&6; }
@@ -17425,6 +17425,52 @@ if test x"$pgac_cv_sse42_crc32_intrinsics" = x"yes"; then
fi
+# Check if _mm512_clmulepi64_epi128 and _mm_xor_epi64 can be used with
+# __attribute__((target("avx512vl,vpclmulqdq"))).
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for _mm512_clmulepi64_epi128 with function attribute" >&5
+$as_echo_n "checking for _mm512_clmulepi64_epi128 with function attribute... " >&6; }
+if ${pgac_cv_avx512_crc32_intrinsics+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+#include <immintrin.h>
+ #include <stdint.h>
+ #if defined(__has_attribute) && __has_attribute (target)
+ __attribute__((target("avx512vl,vpclmulqdq")))
+ #endif
+ static int crc32_avx512_test(void)
+ {
+ __m512i x0 = _mm512_set1_epi32(0x1);
+ __m512i x1 = _mm512_set1_epi32(0x2);
+ __m512i x2 = _mm512_clmulepi64_epi128(x1, x0, 0x00); // vpclmulqdq
+ __m128i a1 = _mm_xor_epi64(_mm512_castsi512_si128(x1), _mm512_castsi512_si128(x0)); //avx512vl
+ int64_t val = _mm_crc32_u64(0, _mm_extract_epi64(a1, 0)); // 64-bit instruction
+ return (uint32_t)_mm_crc32_u64(val, _mm_extract_epi64(a1, 1));
+ }
+int
+main ()
+{
+return crc32_avx512_test();
+ ;
+ return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+ pgac_cv_avx512_crc32_intrinsics=yes
+else
+ pgac_cv_avx512_crc32_intrinsics=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+ conftest$ac_exeext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_avx512_crc32_intrinsics" >&5
+$as_echo "$pgac_cv_avx512_crc32_intrinsics" >&6; }
+if test x"$pgac_cv_avx512_crc32_intrinsics" = x"yes"; then
+ pgac_avx512_crc32_intrinsics=yes
+fi
+
+
# Are we targeting a processor that supports SSE 4.2? gcc, clang and icc all
# define __SSE4_2__ in that case.
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
@@ -17626,9 +17672,8 @@ fi
# If we are targeting a processor that has Intel SSE 4.2 instructions, we can
# use the special CRC instructions for calculating CRC-32C. If we're not
# targeting such a processor, but we can nevertheless produce code that uses
-# the SSE intrinsics, compile both implementations and select which one to use
-# at runtime, depending on whether SSE 4.2 is supported by the processor we're
-# running on.
+# the SSE/AVX-512 intrinsics, compile both implementations and select which one
+# to use at runtime, depending on runtime cpuid information.
#
# Similarly, if we are targeting an ARM processor that has the CRC
# instructions that are part of the ARMv8 CRC Extension, use them. And if
@@ -17645,88 +17690,72 @@ fi
#
# If we are targeting a LoongArch processor, CRC instructions are
# always available (at least on 64 bit), so no runtime check is needed.
-if test x"$USE_SLICING_BY_8_CRC32C" = x"" && test x"$USE_SSE42_CRC32C" = x"" && test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"" && test x"$USE_ARMV8_CRC32C" = x"" && test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x"" && test x"$USE_LOONGARCH_CRC32C" = x""; then
- # Use Intel SSE 4.2 if available.
- if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && test x"$SSE4_2_TARGETED" = x"1" ; then
- USE_SSE42_CRC32C=1
- else
- # Intel SSE 4.2, with runtime check? The CPUID instruction is needed for
- # the runtime check.
- if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && (test x"$pgac_cv__get_cpuid" = x"yes" || test x"$pgac_cv__cpuid" = x"yes"); then
- USE_SSE42_CRC32C_WITH_RUNTIME_CHECK=1
- else
- # Use ARM CRC Extension if available.
- if test x"$pgac_armv8_crc32c_intrinsics" = x"yes" && test x"$CFLAGS_CRC" = x""; then
- USE_ARMV8_CRC32C=1
- else
- # ARM CRC Extension, with runtime check?
- if test x"$pgac_armv8_crc32c_intrinsics" = x"yes"; then
- USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK=1
- else
- # LoongArch CRCC instructions.
- if test x"$pgac_loongarch_crc32c_intrinsics" = x"yes"; then
- USE_LOONGARCH_CRC32C=1
- else
- # fall back to slicing-by-8 algorithm, which doesn't require any
- # special CPU support.
- USE_SLICING_BY_8_CRC32C=1
- fi
- fi
- fi
- fi
- fi
-fi
-# Set PG_CRC32C_OBJS appropriately depending on the selected implementation.
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking which CRC-32C implementation to use" >&5
$as_echo_n "checking which CRC-32C implementation to use... " >&6; }
-if test x"$USE_SSE42_CRC32C" = x"1"; then
+if test x"$host_cpu" = x"x86_64"; then
+ #x86 only:
+ PG_CRC32C_OBJS="pg_crc32c_sb8.o pg_crc32c_sse42_choose.o"
+ if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && test x"$SSE4_2_TARGETED" = x"1" ; then
$as_echo "#define USE_SSE42_CRC32C 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
- { $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2" >&5
-$as_echo "SSE 4.2" >&6; }
-else
- if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
+ PG_CRC32C_OBJS+=" pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: CRC32C baseline feature SSE 4.2" >&5
+$as_echo "CRC32C baseline feature SSE 4.2" >&6; }
+ else
+ if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && (test x"$pgac_cv__get_cpuid" = x"yes" || test x"$pgac_cv__cpuid" = x"yes"); then
$as_echo "#define USE_SSE42_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_sse42_choose.o"
- { $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2 with runtime check" >&5
-$as_echo "SSE 4.2 with runtime check" >&6; }
- else
- if test x"$USE_ARMV8_CRC32C" = x"1"; then
+ PG_CRC32C_OBJS+=" pg_crc32c_sse42.o"
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: CRC32C SSE42 with runtime check" >&5
+$as_echo "CRC32C SSE42 with runtime check" >&6; }
+ fi
+ fi
+ if test x"$pgac_avx512_crc32_intrinsics" = x"yes" && (test x"$pgac_cv__get_cpuid" = x"yes" || test x"$pgac_cv__cpuid" = x"yes"); then
+
+$as_echo "#define USE_AVX512_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
+
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: CRC32C AVX-512 with runtime check" >&5
+$as_echo "CRC32C AVX-512 with runtime check" >&6; }
+ fi
+else
+ # non x86 code:
+ # Use ARM CRC Extension if available.
+ if test x"$pgac_armv8_crc32c_intrinsics" = x"yes" && test x"$CFLAGS_CRC" = x""; then
$as_echo "#define USE_ARMV8_CRC32C 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_armv8.o"
- { $as_echo "$as_me:${as_lineno-$LINENO}: result: ARMv8 CRC instructions" >&5
+ PG_CRC32C_OBJS="pg_crc32c_armv8.o"
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: ARMv8 CRC instructions" >&5
$as_echo "ARMv8 CRC instructions" >&6; }
- else
- if test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
+ else
+ # ARM CRC Extension, with runtime check?
+ if test x"$pgac_armv8_crc32c_intrinsics" = x"yes"; then
$as_echo "#define USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o pg_crc32c_armv8_choose.o"
- { $as_echo "$as_me:${as_lineno-$LINENO}: result: ARMv8 CRC instructions with runtime check" >&5
+ PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o pg_crc32c_armv8_choose.o"
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: ARMv8 CRC instructions with runtime check" >&5
$as_echo "ARMv8 CRC instructions with runtime check" >&6; }
- else
- if test x"$USE_LOONGARCH_CRC32C" = x"1"; then
+ else
+ if test x"$pgac_loongarch_crc32c_intrinsics" = x"yes"; then
$as_echo "#define USE_LOONGARCH_CRC32C 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_loongarch.o"
- { $as_echo "$as_me:${as_lineno-$LINENO}: result: LoongArch CRCC instructions" >&5
+ PG_CRC32C_OBJS="pg_crc32c_loongarch.o"
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: LoongArch CRCC instructions" >&5
$as_echo "LoongArch CRCC instructions" >&6; }
- else
+ else
+ # fall back to slicing-by-8 algorithm, which doesn't require any
+ # special CPU support.
$as_echo "#define USE_SLICING_BY_8_CRC32C 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sb8.o"
- { $as_echo "$as_me:${as_lineno-$LINENO}: result: slicing-by-8" >&5
+ PG_CRC32C_OBJS="pg_crc32c_sb8.o"
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: slicing-by-8" >&5
$as_echo "slicing-by-8" >&6; }
- fi
fi
fi
fi
diff --git a/configure.ac b/configure.ac
index a85bdbd4ff6..ee8b225ed87 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2057,10 +2057,14 @@ if test x"$host_cpu" = x"x86_64"; then
fi
fi
-# Check for Intel SSE 4.2 intrinsics to do CRC calculations.
+# Check for Intel SSE 4.2 and AVX-512 intrinsics to do CRC calculations.
#
PGAC_SSE42_CRC32_INTRINSICS()
+# Check if _mm512_clmulepi64_epi128 and _mm_xor_epi64 can be used with
+# the __attribute__((target("avx512vl,vpclmulqdq"))).
+PGAC_AVX512_CRC32_INTRINSICS([])
+
# Are we targeting a processor that supports SSE 4.2? gcc, clang and icc all
# define __SSE4_2__ in that case.
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [
@@ -2096,9 +2100,8 @@ AC_SUBST(CFLAGS_CRC)
# If we are targeting a processor that has Intel SSE 4.2 instructions, we can
# use the special CRC instructions for calculating CRC-32C. If we're not
# targeting such a processor, but we can nevertheless produce code that uses
-# the SSE intrinsics, compile both implementations and select which one to use
-# at runtime, depending on whether SSE 4.2 is supported by the processor we're
-# running on.
+# the SSE/AVX-512 intrinsics, compile both implementations and select which one
+# to use at runtime, depending on runtime cpuid information.
#
# Similarly, if we are targeting an ARM processor that has the CRC
# instructions that are part of the ARMv8 CRC Extension, use them. And if
@@ -2115,69 +2118,50 @@ AC_SUBST(CFLAGS_CRC)
#
# If we are targeting a LoongArch processor, CRC instructions are
# always available (at least on 64 bit), so no runtime check is needed.
-if test x"$USE_SLICING_BY_8_CRC32C" = x"" && test x"$USE_SSE42_CRC32C" = x"" && test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"" && test x"$USE_ARMV8_CRC32C" = x"" && test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x"" && test x"$USE_LOONGARCH_CRC32C" = x""; then
- # Use Intel SSE 4.2 if available.
- if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && test x"$SSE4_2_TARGETED" = x"1" ; then
- USE_SSE42_CRC32C=1
- else
- # Intel SSE 4.2, with runtime check? The CPUID instruction is needed for
- # the runtime check.
- if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && (test x"$pgac_cv__get_cpuid" = x"yes" || test x"$pgac_cv__cpuid" = x"yes"); then
- USE_SSE42_CRC32C_WITH_RUNTIME_CHECK=1
+
+AC_MSG_CHECKING([which CRC-32C implementation to use])
+if test x"$host_cpu" = x"x86_64"; then
+ #x86 only:
+ PG_CRC32C_OBJS="pg_crc32c_sb8.o pg_crc32c_sse42_choose.o"
+ if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && test x"$SSE4_2_TARGETED" = x"1" ; then
+ AC_DEFINE(USE_SSE42_CRC32C, 1, [Define to 1 to use Intel SSE 4.2 CRC instructions.])
+ PG_CRC32C_OBJS+=" pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
+ AC_MSG_RESULT(CRC32C baseline feature SSE 4.2)
else
- # Use ARM CRC Extension if available.
- if test x"$pgac_armv8_crc32c_intrinsics" = x"yes" && test x"$CFLAGS_CRC" = x""; then
- USE_ARMV8_CRC32C=1
- else
- # ARM CRC Extension, with runtime check?
- if test x"$pgac_armv8_crc32c_intrinsics" = x"yes"; then
- USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK=1
- else
- # LoongArch CRCC instructions.
- if test x"$pgac_loongarch_crc32c_intrinsics" = x"yes"; then
- USE_LOONGARCH_CRC32C=1
- else
- # fall back to slicing-by-8 algorithm, which doesn't require any
- # special CPU support.
- USE_SLICING_BY_8_CRC32C=1
- fi
+ if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && (test x"$pgac_cv__get_cpuid" = x"yes" || test x"$pgac_cv__cpuid" = x"yes"); then
+ AC_DEFINE(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use Intel SSE 4.2 CRC instructions with a runtime check.])
+ PG_CRC32C_OBJS+=" pg_crc32c_sse42.o"
+ AC_MSG_RESULT(CRC32C SSE42 with runtime check)
fi
- fi
fi
- fi
-fi
-
-# Set PG_CRC32C_OBJS appropriately depending on the selected implementation.
-AC_MSG_CHECKING([which CRC-32C implementation to use])
-if test x"$USE_SSE42_CRC32C" = x"1"; then
- AC_DEFINE(USE_SSE42_CRC32C, 1, [Define to 1 use Intel SSE 4.2 CRC instructions.])
- PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
- AC_MSG_RESULT(SSE 4.2)
+ if test x"$pgac_avx512_crc32_intrinsics" = x"yes" && (test x"$pgac_cv__get_cpuid" = x"yes" || test x"$pgac_cv__cpuid" = x"yes"); then
+ AC_DEFINE(USE_AVX512_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use Intel AVX 512 CRC instructions with a runtime check.])
+ AC_MSG_RESULT(CRC32C AVX-512 with runtime check)
+ fi
else
- if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
- AC_DEFINE(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use Intel SSE 4.2 CRC instructions with a runtime check.])
- PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_sse42_choose.o"
- AC_MSG_RESULT(SSE 4.2 with runtime check)
+ # non x86 code:
+ # Use ARM CRC Extension if available.
+ if test x"$pgac_armv8_crc32c_intrinsics" = x"yes" && test x"$CFLAGS_CRC" = x""; then
+ AC_DEFINE(USE_ARMV8_CRC32C, 1, [Define to 1 to use ARMv8 CRC Extension.])
+ PG_CRC32C_OBJS="pg_crc32c_armv8.o"
+ AC_MSG_RESULT(ARMv8 CRC instructions)
else
- if test x"$USE_ARMV8_CRC32C" = x"1"; then
- AC_DEFINE(USE_ARMV8_CRC32C, 1, [Define to 1 to use ARMv8 CRC Extension.])
- PG_CRC32C_OBJS="pg_crc32c_armv8.o"
- AC_MSG_RESULT(ARMv8 CRC instructions)
+ # ARM CRC Extension, with runtime check?
+ if test x"$pgac_armv8_crc32c_intrinsics" = x"yes"; then
+ AC_DEFINE(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use ARMv8 CRC Extension with a runtime check.])
+ PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o pg_crc32c_armv8_choose.o"
+ AC_MSG_RESULT(ARMv8 CRC instructions with runtime check)
else
- if test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
- AC_DEFINE(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use ARMv8 CRC Extension with a runtime check.])
- PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_sb8.o pg_crc32c_armv8_choose.o"
- AC_MSG_RESULT(ARMv8 CRC instructions with runtime check)
+ if test x"$pgac_loongarch_crc32c_intrinsics" = x"yes"; then
+ AC_DEFINE(USE_LOONGARCH_CRC32C, 1, [Define to 1 to use LoongArch CRCC instructions.])
+ PG_CRC32C_OBJS="pg_crc32c_loongarch.o"
+ AC_MSG_RESULT(LoongArch CRCC instructions)
else
- if test x"$USE_LOONGARCH_CRC32C" = x"1"; then
- AC_DEFINE(USE_LOONGARCH_CRC32C, 1, [Define to 1 to use LoongArch CRCC instructions.])
- PG_CRC32C_OBJS="pg_crc32c_loongarch.o"
- AC_MSG_RESULT(LoongArch CRCC instructions)
- else
- AC_DEFINE(USE_SLICING_BY_8_CRC32C, 1, [Define to 1 to use software CRC-32C implementation (slicing-by-8).])
- PG_CRC32C_OBJS="pg_crc32c_sb8.o"
- AC_MSG_RESULT(slicing-by-8)
- fi
+ # fall back to slicing-by-8 algorithm, which doesn't require any
+ # special CPU support.
+ AC_DEFINE(USE_SLICING_BY_8_CRC32C, 1, [Define to 1 to use software CRC-32C implementation (slicing-by-8).])
+ PG_CRC32C_OBJS="pg_crc32c_sb8.o"
+ AC_MSG_RESULT(slicing-by-8)
fi
fi
fi
--
2.48.1
v14-0005-Add-runtime-support-for-AVX-512-CRC.patch (text/x-patch; charset=US-ASCII)
From 85970737d58d3fb46e954f4b056a8411c0870882 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Tue, 11 Mar 2025 13:07:01 +0700
Subject: [PATCH v14 5/8] Add runtime support for AVX-512 CRC
---
src/port/pg_crc32c_sse42_choose.c | 59 ++++++++++++++++++++++++++-----
1 file changed, 50 insertions(+), 9 deletions(-)
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 89a48c76894..c2d25253c2c 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -20,16 +20,37 @@
#include "c.h"
-#ifdef HAVE__GET_CPUID
+#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT)
#include <cpuid.h>
#endif
-#ifdef HAVE__CPUID
+#include <immintrin.h>
+
+#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX)
#include <intrin.h>
#endif
#include "port/pg_crc32c.h"
+/*
+ * Does XGETBV say the ZMM registers are enabled?
+ *
+ * NB: Caller is responsible for verifying that osxsave is available
+ * before calling this.
+ */
+#ifdef HAVE_XSAVE_INTRINSICS
+pg_attribute_target("xsave")
+#endif
+static inline bool
+zmm_regs_available(void)
+{
+#ifdef HAVE_XSAVE_INTRINSICS
+ return (_xgetbv(0) & 0xe6) == 0xe6;
+#else
+ return false;
+#endif
+}
+
/*
* This gets called on the first call. It replaces the function pointer
* so that subsequent calls are routed directly to the chosen implementation.
@@ -39,6 +60,15 @@ pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
{
unsigned int exx[4] = {0, 0, 0, 0};
+#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
+
+ /*
+ * Set fallback. We must guard since slicing-by-8 is not visible
+ * everywhere.
+ */
+ pg_comp_crc32c = pg_comp_crc32c_sb8;
+#endif
+
#if defined(HAVE__GET_CPUID)
__get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
#elif defined(HAVE__CPUID)
@@ -50,15 +80,26 @@ pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
if ((exx[2] & (1 << 20)) != 0) /* SSE 4.2 */
{
pg_comp_crc32c = pg_comp_crc32c_sse42;
-#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
- if ((exx[2] & (1 << 1)) != 0) /* PCLMUL */
- pg_comp_crc32c = pg_comp_crc32c_pclmul;
+
+ if (exx[2] & (1 << 27) && /* OSXSAVE */
+ zmm_regs_available())
+ {
+#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+ /* second cpuid call on leaf 7 to check extended avx512 support */
+
+ memset(exx, 0, 4 * sizeof(exx[0]));
+
+#if defined(HAVE__GET_CPUID_COUNT)
+ __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUIDEX)
+ __cpuidex(exx, 7, 0);
#endif
- }
-#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
- else
- pg_comp_crc32c = pg_comp_crc32c_sb8;
+ if (exx[2] & (1 << 10) && /* VPCLMULQDQ */
+ exx[1] & (1 << 31)) /* AVX512-VL */
+ pg_comp_crc32c = pg_comp_crc32c_pclmul;
#endif
+ }
+ }
return pg_comp_crc32c(crc, data, len);
}
--
2.48.1
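The selection order in pg_comp_crc32c_choose() above can be modelled as a pure function over the register values, which makes the bit tests easier to read in isolation. This is only a sketch: the enum, zmm_enabled() and choose_impl() are hypothetical names that do not appear in the patch, and the register values are passed in as arguments instead of being read with cpuid/xgetbv. The 0xe6 mask covers XCR0 bits 1 (SSE), 2 (AVX), 5 (opmask), 6 (ZMM_Hi256) and 7 (Hi16_ZMM).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the three code paths discussed in the patch. */
typedef enum
{
	IMPL_SB8,					/* slicing-by-8 fallback */
	IMPL_SSE42,					/* crc32 instruction */
	IMPL_AVX512					/* vpclmulqdq on ZMM registers */
} crc_impl;

/* XCR0 bits 1, 2, 5, 6 and 7 must all be set for ZMM state to be usable. */
#define XCR0_ZMM_MASK UINT64_C(0xe6)

static bool
zmm_enabled(uint64_t xcr0)
{
	return (xcr0 & XCR0_ZMM_MASK) == XCR0_ZMM_MASK;
}

/*
 * leaf1_ecx: ECX from cpuid leaf 1.
 * xcr0: result of xgetbv(0); callers pass 0 when OSXSAVE is not set.
 * leaf7_ebx, leaf7_ecx: EBX and ECX from cpuid leaf 7, subleaf 0.
 */
static crc_impl
choose_impl(uint32_t leaf1_ecx, uint64_t xcr0,
			uint32_t leaf7_ebx, uint32_t leaf7_ecx)
{
	if ((leaf1_ecx & (UINT32_C(1) << 20)) == 0)		/* no SSE 4.2 */
		return IMPL_SB8;

	if ((leaf1_ecx & (UINT32_C(1) << 27)) != 0 &&	/* OSXSAVE */
		zmm_enabled(xcr0) &&
		(leaf7_ecx & (UINT32_C(1) << 10)) != 0 &&	/* VPCLMULQDQ */
		(leaf7_ebx & (UINT32_C(1) << 31)) != 0)		/* AVX512-VL */
		return IMPL_AVX512;

	return IMPL_SSE42;
}
```

For example, choose_impl(1u << 20, 0, 0, 0) yields IMPL_SSE42, which corresponds to a CPU with SSE 4.2 where the OS has not enabled extended state saving.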
v14-0006-AVX-512-CRC-Meson.patch (text/x-patch; charset=US-ASCII)
From 6ae6af741b866ed95d367d810cdd4eef64a6ac91 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Tue, 11 Mar 2025 14:16:13 +0700
Subject: [PATCH v14 6/8] AVX-512 CRC / Meson
Author: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Author: Paul Amonson <paul.d.amonson@intel.com>
---
meson.build | 23 ++++++++++
src/include/port/pg_crc32c.h | 9 +---
src/port/pg_crc32c_sse42.c | 83 +++++++++++++++---------------------
3 files changed, 60 insertions(+), 55 deletions(-)
diff --git a/meson.build b/meson.build
index 13c13748e5d..f2f1164a25e 100644
--- a/meson.build
+++ b/meson.build
@@ -2352,6 +2352,29 @@ int main(void)
have_optimized_crc = true
endif
+ avx512_crc_prog = '''
+#include <immintrin.h>
+#include <stdint.h>
+#if defined(__has_attribute) && __has_attribute (target)
+__attribute__((target("avx512vl,vpclmulqdq")))
+#endif
+int main(void)
+{
+ __m512i x0 = _mm512_set1_epi32(0x1);
+ __m512i x1 = _mm512_set1_epi32(0x2);
+ __m512i x2 = _mm512_clmulepi64_epi128(x1, x0, 0x00); // vpclmulqdq
+ __m128i a1 = _mm_xor_epi64(_mm512_castsi512_si128(x1), _mm512_castsi512_si128(x0)); //avx512vl
+ int64_t val = _mm_crc32_u64(0, _mm_extract_epi64(a1, 0)); // 64-bit instruction
+ return (uint32_t)_mm_crc32_u64(val, _mm_extract_epi64(a1, 1));
+}
+'''
+
+ if cc.links(avx512_crc_prog,
+ name: 'AVX-512 CRC32C',
+ args: test_c_args)
+ cdata.set('USE_AVX512_CRC32C_WITH_RUNTIME_CHECK', 1)
+ endif
+
endif
elif host_cpu == 'arm' or host_cpu == 'aarch64'
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 28253b48018..a45f56a9405 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -37,11 +37,6 @@
typedef uint32 pg_crc32c;
-/* WIP: configure checks */
-#ifdef __x86_64__
-#define USE_PCLMUL_WITH_RUNTIME_CHECK
-#endif
-
/* The INIT and EQ macros are the same for all implementations. */
#define INIT_CRC32C(crc) ((crc) = 0xFFFFFFFF)
#define EQ_CRC32C(c1, c2) ((c1) == (c2))
@@ -60,7 +55,7 @@ typedef uint32 pg_crc32c;
extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
extern pg_crc32c pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t len);
#endif
@@ -106,7 +101,7 @@ pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
extern pg_crc32c pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t len);
#endif
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index c57d6c6293b..f392eb5b236 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,7 +15,7 @@
#include "c.h"
#include <nmmintrin.h>
-#include <wmmintrin.h>
+#include <immintrin.h>
#include "port/pg_crc32c.h"
@@ -70,16 +70,16 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
return crc;
}
-#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
/* Generated by https://github.com/corsix/fast-crc32/ using: */
-/* ./generate -i sse -p crc32c -a v4e */
+/* ./generate -i avx512_vpclmulqdq -p crc32c -a v1e */
/* MIT licensed */
-#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
-#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
+#define clmul_lo(a, b) (_mm512_clmulepi64_epi128((a), (b), 0))
+#define clmul_hi(a, b) (_mm512_clmulepi64_epi128((a), (b), 17))
-pg_attribute_target("sse4.2,pclmul")
+pg_attribute_target("avx512vl,vpclmulqdq")
pg_crc32c
pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
{
@@ -88,67 +88,54 @@ pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
size_t len = length;
const char *buf = data;
- // This prolog is trying to avoid loads straddling
- // cache lines, but it doesn't seem worth it if
- // we're trying to be fast on small inputs as well
-#if 0
- for (; len && ((uintptr_t) buf & 7); --len)
+ /* Align on cacheline boundary. WIP: The threshold needs testing. */
+ if (unlikely(len > 256))
{
- crc0 = _mm_crc32_u8(crc0, *buf++);
- }
- if (((uintptr_t) buf & 8) && len >= 8)
- {
- crc0 = _mm_crc32_u64(crc0, *(const uint64_t *) buf);
- buf += 8;
- len -= 8;
+ for (; len && ((uintptr_t) buf & 7); --len)
+ {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+ while (((uintptr_t) buf & 56) && len >= 8)
+ {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t *) buf);
+ buf += 8;
+ len -= 8;
+ }
}
-#endif
+
if (len >= 64)
{
const char *end = buf + len;
const char *limit = buf + len - 64;
+ __m128i z0;
/* First vector chunk. */
- __m128i x0 = _mm_loadu_si128((const __m128i *) buf),
+ __m512i x0 = _mm512_loadu_si512((const void *) buf),
y0;
- __m128i x1 = _mm_loadu_si128((const __m128i *) (buf + 16)),
- y1;
- __m128i x2 = _mm_loadu_si128((const __m128i *) (buf + 32)),
- y2;
- __m128i x3 = _mm_loadu_si128((const __m128i *) (buf + 48)),
- y3;
- __m128i k;
-
- k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
- x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ __m512i k;
+
+ k = _mm512_broadcast_i32x4(_mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0));
+ x0 = _mm512_xor_si512(_mm512_castsi128_si512(_mm_cvtsi32_si128(crc0)), x0);
buf += 64;
+
/* Main loop. */
while (buf <= limit)
{
y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
- y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
- y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
- y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
- y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i *) buf)), x0 = _mm_xor_si128(x0, y0);
- y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i *) (buf + 16))), x1 = _mm_xor_si128(x1, y1);
- y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i *) (buf + 32))), x2 = _mm_xor_si128(x2, y2);
- y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i *) (buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ x0 = _mm512_ternarylogic_epi64(x0, y0, _mm512_loadu_si512((const void *) buf), 0x96);
buf += 64;
}
- /* Reduce x0 ... x3 to just x0. */
- k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
- y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
- y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
- y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
- y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
- k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
- y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
- y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+ /* Reduce 512 bits to 128 bits. */
+ k = _mm512_setr_epi32(0x1c291d04, 0, 0xddc0152b, 0, 0x3da6d0cb, 0, 0xba4fc28e, 0, 0xf20c0dfe, 0, 0x493c7d27, 0, 0, 0, 0, 0);
+ y0 = clmul_lo(x0, k), k = clmul_hi(x0, k);
+ y0 = _mm512_xor_si512(y0, k);
+ z0 = _mm_ternarylogic_epi64(_mm512_castsi512_si128(y0), _mm512_extracti32x4_epi32(y0, 1), _mm512_extracti32x4_epi32(y0, 2), 0x96);
+ z0 = _mm_xor_si128(z0, _mm512_extracti32x4_epi32(x0, 3));
/* Reduce 128 bits to 32 bits, and multiply by x^32. */
- crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
- crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(z0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(z0, 1));
len = end - buf;
}
--
2.48.1
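The main loop in the patch above passes 0x96 as the immediate to _mm512_ternarylogic_epi64, which folds the two XORs of the SSE version into one instruction. A scalar model of one 64-bit lane, offered as a sketch only (ternlog64 is a hypothetical helper, not part of the patch), shows why 0x96 means "XOR of all three inputs": the immediate is an 8-entry truth table indexed by the three input bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scalar model of a three-input ternary-logic operation on one 64-bit lane.
 * For every bit position, the three input bits form an index (a is the most
 * significant bit, c the least) into the 8-bit truth table "imm".
 */
static uint64_t
ternlog64(uint64_t a, uint64_t b, uint64_t c, uint8_t imm)
{
	uint64_t	result = 0;

	for (int bit = 0; bit < 64; bit++)
	{
		unsigned	idx = (unsigned) ((((a >> bit) & 1) << 2) |
									  (((b >> bit) & 1) << 1) |
									  ((c >> bit) & 1));

		result |= (uint64_t) ((imm >> idx) & 1) << bit;
	}
	return result;
}
```

With imm = 0x96 (binary 10010110) the table is 1 exactly for the indexes with an odd number of set bits (1, 2, 4, 7), so ternlog64(a, b, c, 0x96) equals a ^ b ^ c for any inputs.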
v14-0002-Inline-CRC-computation-for-fixed-length-input.patch (text/x-patch; charset=US-ASCII)
From 9c61eb35c4f104c208e22502b7e10fb2a8efdc14 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Fri, 28 Feb 2025 16:27:30 +0700
Subject: [PATCH v14 2/8] Inline CRC computation for fixed-length input
Use a simplified copy of the loop in pg_crc32c_sse42.c to avoid
moving code to a separate header.
---
src/include/port/pg_crc32c.h | 33 ++++++++++++++++++++++++++++++++-
1 file changed, 32 insertions(+), 1 deletion(-)
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 65ebeacf4b1..b9f0a8c7cca 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -43,12 +43,43 @@ typedef uint32 pg_crc32c;
#if defined(USE_SSE42_CRC32C)
/* Use Intel SSE4.2 instructions. */
+
+#include <nmmintrin.h>
+
#define COMP_CRC32C(crc, data, len) \
- ((crc) = pg_comp_crc32c_sse42((crc), (data), (len)))
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+pg_attribute_no_sanitize_alignment()
+static inline
+pg_crc32c
+pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
+{
+ if (__builtin_constant_p(len))
+ {
+ const unsigned char *p = data;
+
+ /*
+ * For constant inputs, inline the computation to avoid the
+ * indirect function call. This also allows the compiler to unroll
+ * loops for small inputs.
+ */
+#if SIZEOF_VOID_P >= 8
+ for (; len >= 8; p += 8, len -= 8)
+ crc = _mm_crc32_u64(crc, *(const uint64 *) p);
+#endif
+ for (; len >= 4; p += 4, len -= 4)
+ crc = _mm_crc32_u32(crc, *(const uint32 *) p);
+ for (; len > 0; --len)
+ crc = _mm_crc32_u8(crc, *p++);
+ return crc;
+ }
+ else
+ return pg_comp_crc32c_sse42(crc, data, len);
+}
+
#elif defined(USE_ARMV8_CRC32C)
/* Use ARMv8 CRC Extension instructions. */
--
2.48.1
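The shape of pg_comp_crc32c_dispatch() in the patch above can be tried out without SSE 4.2 hardware by substituting a bitwise software CRC-32C for the crc32 instruction. The following is a portable sketch under that assumption; crc32c_bytes, crc32c_chunked, crc32c_out_of_line and crc32c_dispatch are hypothetical names, and the software update is far slower than the intrinsics it stands in for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C update (reflected polynomial 0x82F63B78) over n bytes. */
static uint32_t
crc32c_bytes(uint32_t crc, const unsigned char *p, size_t n)
{
	while (n--)
	{
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (UINT32_C(0x82F63B78) & (0u - (crc & 1)));
	}
	return crc;
}

/* Same 8/4/1-byte loop structure as the inlined path in the patch. */
static inline uint32_t
crc32c_chunked(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	for (; len >= 8; p += 8, len -= 8)
		crc = crc32c_bytes(crc, p, 8);	/* stands in for _mm_crc32_u64 */
	for (; len >= 4; p += 4, len -= 4)
		crc = crc32c_bytes(crc, p, 4);	/* stands in for _mm_crc32_u32 */
	for (; len > 0; --len)
		crc = crc32c_bytes(crc, p++, 1);	/* stands in for _mm_crc32_u8 */
	return crc;
}

/* Stand-in for the out-of-line implementation. */
static uint32_t
crc32c_out_of_line(uint32_t crc, const void *data, size_t len)
{
	return crc32c_bytes(crc, data, len);
}

static inline uint32_t
crc32c_dispatch(uint32_t crc, const void *data, size_t len)
{
#if defined(__GNUC__)
	/* Only taken when the compiler can prove len is a constant. */
	if (__builtin_constant_p(len))
		return crc32c_chunked(crc, data, len);
#endif
	return crc32c_out_of_line(crc, data, len);
}
```

Both paths return the same value, so the choice only affects code generation. Starting from 0xFFFFFFFF and inverting the result, as INIT_CRC32C and FIN_CRC32C do, the string "123456789" gives the standard CRC-32C check value 0xE3069283.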
v14-0003-Improve-CRC32C-performance-on-x86_64.patch (text/x-patch; charset=US-ASCII)
From 98e0d9cf96cc9a5eb8ce1eac9517693bb078bfe2 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Wed, 12 Feb 2025 15:27:16 +0700
Subject: [PATCH v14 3/8] Improve CRC32C performance on x86_64
The current SSE4.2 implementation of CRC32C relies on the native
CRC32 instruction, which operates on 8 bytes at a time. We can get a
substantial speedup on longer inputs by using carryless multiplication
on SIMD registers, processing 64 bytes per loop iteration.
The PCLMULQDQ instruction has been widely available since 2011 (almost
as old as SSE 4.2), so this commit now requires that, as well as SSE
4.2, to build pg_crc32c_sse42.c.
The MIT-licensed implementation was generated with the "generate"
program from
https://github.com/corsix/fast-crc32/
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009
Author: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Author: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
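
For readers unfamiliar with carryless multiplication, the operation that PCLMULQDQ performs on a pair of 64-bit operands can be modelled in portable C. This is an illustrative sketch only and not the algorithm from the patch: u128 and clmul64 are hypothetical names, and the loop is far slower than the instruction. The operands are treated as polynomials over GF(2), so partial products are combined with XOR instead of addition.

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit product, split into two 64-bit halves. */
typedef struct
{
	uint64_t	lo;
	uint64_t	hi;
} u128;

/*
 * Carryless (polynomial) multiplication of two 64-bit values: for every set
 * bit i of b, XOR a shifted left by i into the 128-bit result.
 */
static u128
clmul64(uint64_t a, uint64_t b)
{
	u128		r = {0, 0};

	for (int i = 0; i < 64; i++)
	{
		if ((b >> i) & 1)
		{
			r.lo ^= a << i;
			if (i > 0)			/* avoid an undefined shift by 64 */
				r.hi ^= a >> (64 - i);
		}
	}
	return r;
}
```

As a small check, clmul64(3, 3) gives 5 rather than 9, because (x + 1)^2 = x^2 + 1 when coefficients are reduced modulo 2. The clmul_lo and clmul_hi macros in the patch select which 64-bit halves of the two vector operands are multiplied this way.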
---
src/include/port/pg_crc32c.h | 30 ++++++++---
src/port/pg_crc32c_sse42.c | 88 +++++++++++++++++++++++++++++++
src/port/pg_crc32c_sse42_choose.c | 26 ++++-----
3 files changed, 124 insertions(+), 20 deletions(-)
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index b9f0a8c7cca..229f4f6a65a 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -37,6 +37,11 @@
typedef uint32 pg_crc32c;
+/* WIP: configure checks */
+#ifdef __x86_64__
+#define USE_PCLMUL_WITH_RUNTIME_CHECK
+#endif
+
/* The INIT and EQ macros are the same for all implementations. */
#define INIT_CRC32C(crc) ((crc) = 0xFFFFFFFF)
#define EQ_CRC32C(c1, c2) ((c1) == (c2))
@@ -80,6 +85,23 @@ pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
return pg_comp_crc32c_sse42(crc, data, len);
}
+#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
+
+/*
+ * Use Intel SSE 4.2 or PCLMUL instructions, but perform a runtime check first
+ * to check that they are available.
+ */
+#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c((crc), (data), (len)))
+#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
+extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+extern pg_crc32c pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t len);
+#endif
+
#elif defined(USE_ARMV8_CRC32C)
/* Use ARMv8 CRC Extension instructions. */
@@ -98,7 +120,7 @@ extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t le
extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
+#elif defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
/*
* Use Intel SSE 4.2 or ARMv8 instructions, but perform a runtime check first
@@ -110,13 +132,7 @@ extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
-
-#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#endif
-#ifdef USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK
extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
-#endif
#else
/*
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df31..2001e69850b 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,6 +15,7 @@
#include "c.h"
#include <nmmintrin.h>
+#include <wmmintrin.h>
#include "port/pg_crc32c.h"
@@ -68,3 +69,90 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
return crc;
}
+
+#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+
+/* Generated by https://github.com/corsix/fast-crc32/ using: */
+/* ./generate -i sse -p crc32c -a v4e */
+/* MIT licensed */
+
+#define clmul_lo(a, b) (_mm_clmulepi64_si128((a), (b), 0))
+#define clmul_hi(a, b) (_mm_clmulepi64_si128((a), (b), 17))
+
+pg_attribute_target("sse4.2,pclmul")
+pg_crc32c
+pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
+{
+ /* adjust names to match generated code */
+ pg_crc32c crc0 = crc;
+ size_t len = length;
+ const char *buf = data;
+
+ // This prolog is trying to avoid loads straddling
+ // cache lines, but it doesn't seem worth it if
+ // we're trying to be fast on small inputs as well
+#if 0
+ for (; len && ((uintptr_t) buf & 7); --len)
+ {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+ if (((uintptr_t) buf & 8) && len >= 8)
+ {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t *) buf);
+ buf += 8;
+ len -= 8;
+ }
+#endif
+ if (len >= 64)
+ {
+ const char *end = buf + len;
+ const char *limit = buf + len - 64;
+
+ /* First vector chunk. */
+ __m128i x0 = _mm_loadu_si128((const __m128i *) buf),
+ y0;
+ __m128i x1 = _mm_loadu_si128((const __m128i *) (buf + 16)),
+ y1;
+ __m128i x2 = _mm_loadu_si128((const __m128i *) (buf + 32)),
+ y2;
+ __m128i x3 = _mm_loadu_si128((const __m128i *) (buf + 48)),
+ y3;
+ __m128i k;
+
+ k = _mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0);
+ x0 = _mm_xor_si128(_mm_cvtsi32_si128(crc0), x0);
+ buf += 64;
+ /* Main loop. */
+ while (buf <= limit)
+ {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y1 = clmul_lo(x1, k), x1 = clmul_hi(x1, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y3 = clmul_lo(x3, k), x3 = clmul_hi(x3, k);
+ y0 = _mm_xor_si128(y0, _mm_loadu_si128((const __m128i *) buf)), x0 = _mm_xor_si128(x0, y0);
+ y1 = _mm_xor_si128(y1, _mm_loadu_si128((const __m128i *) (buf + 16))), x1 = _mm_xor_si128(x1, y1);
+ y2 = _mm_xor_si128(y2, _mm_loadu_si128((const __m128i *) (buf + 32))), x2 = _mm_xor_si128(x2, y2);
+ y3 = _mm_xor_si128(y3, _mm_loadu_si128((const __m128i *) (buf + 48))), x3 = _mm_xor_si128(x3, y3);
+ buf += 64;
+ }
+
+ /* Reduce x0 ... x3 to just x0. */
+ k = _mm_setr_epi32(0xf20c0dfe, 0, 0x493c7d27, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y2 = clmul_lo(x2, k), x2 = clmul_hi(x2, k);
+ y0 = _mm_xor_si128(y0, x1), x0 = _mm_xor_si128(x0, y0);
+ y2 = _mm_xor_si128(y2, x3), x2 = _mm_xor_si128(x2, y2);
+ k = _mm_setr_epi32(0x3da6d0cb, 0, 0xba4fc28e, 0);
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ y0 = _mm_xor_si128(y0, x2), x0 = _mm_xor_si128(x0, y0);
+
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(x0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(x0, 1));
+ len = end - buf;
+ }
+
+ return pg_comp_crc32c_sse42_inline(crc0, buf, len);
+}
+
+#endif
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 65dbc4d4249..abea0f90eb3 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -30,8 +30,12 @@
#include "port/pg_crc32c.h"
-static bool
-pg_crc32c_sse42_available(void)
+/*
+ * This gets called on the first call. It replaces the function pointer
+ * so that subsequent calls are routed directly to the chosen implementation.
+ */
+static pg_crc32c
+pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
{
unsigned int exx[4] = {0, 0, 0, 0};
@@ -43,18 +47,14 @@ pg_crc32c_sse42_available(void)
#error cpuid instruction not available
#endif
- return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
-}
-
-/*
- * This gets called on the first call. It replaces the function pointer
- * so that subsequent calls are routed directly to the chosen implementation.
- */
-static pg_crc32c
-pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
-{
- if (pg_crc32c_sse42_available())
+ if ((exx[2] & (1 << 20)) != 0) /* SSE 4.2 */
+ {
pg_comp_crc32c = pg_comp_crc32c_sse42;
+#ifdef USE_PCLMUL_WITH_RUNTIME_CHECK
+ if ((exx[2] & (1 << 1)) != 0) /* PCLMUL */
+ pg_comp_crc32c = pg_comp_crc32c_pclmul;
+#endif
+ }
else
pg_comp_crc32c = pg_comp_crc32c_sb8;
--
2.48.1
v14-0001-Add-a-Postgres-SQL-function-for-crc32c-benchmark.patch (text/x-patch; charset=US-ASCII)
From 0b978f21bba12d62477c5fa982f99a76563cfaf4 Mon Sep 17 00:00:00 2001
From: Paul Amonson <paul.d.amonson@intel.com>
Date: Mon, 6 May 2024 08:34:17 -0700
Subject: [PATCH v14 1/8] Add a Postgres SQL function for crc32c benchmarking
Add a drive_crc32c() function to use for benchmarking crc32c
computation. The function takes 2 arguments:
(1) count: num of times CRC32C is computed in a loop.
(2) num: #bytes in the buffer to calculate crc over.
XXX not for commit
Extracted from a patch by Raghuveer Devulapalli
---
contrib/meson.build | 1 +
contrib/test_crc32c/Makefile | 20 +++++++
contrib/test_crc32c/expected/test_crc32c.out | 57 ++++++++++++++++++++
contrib/test_crc32c/meson.build | 34 ++++++++++++
contrib/test_crc32c/sql/test_crc32c.sql | 3 ++
contrib/test_crc32c/test_crc32c--1.0.sql | 1 +
contrib/test_crc32c/test_crc32c.c | 47 ++++++++++++++++
contrib/test_crc32c/test_crc32c.control | 4 ++
8 files changed, 167 insertions(+)
create mode 100644 contrib/test_crc32c/Makefile
create mode 100644 contrib/test_crc32c/expected/test_crc32c.out
create mode 100644 contrib/test_crc32c/meson.build
create mode 100644 contrib/test_crc32c/sql/test_crc32c.sql
create mode 100644 contrib/test_crc32c/test_crc32c--1.0.sql
create mode 100644 contrib/test_crc32c/test_crc32c.c
create mode 100644 contrib/test_crc32c/test_crc32c.control
diff --git a/contrib/meson.build b/contrib/meson.build
index 1ba73ebd67a..06673db0625 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('test_crc32c')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/test_crc32c/Makefile b/contrib/test_crc32c/Makefile
new file mode 100644
index 00000000000..5b747c6184a
--- /dev/null
+++ b/contrib/test_crc32c/Makefile
@@ -0,0 +1,20 @@
+MODULE_big = test_crc32c
+OBJS = test_crc32c.o
+PGFILEDESC = "test"
+EXTENSION = test_crc32c
+DATA = test_crc32c--1.0.sql
+
+first: all
+
+# test_crc32c.o: CFLAGS+=-g
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/test_crc32c
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/test_crc32c/expected/test_crc32c.out b/contrib/test_crc32c/expected/test_crc32c.out
new file mode 100644
index 00000000000..dff6bb3133b
--- /dev/null
+++ b/contrib/test_crc32c/expected/test_crc32c.out
@@ -0,0 +1,57 @@
+CREATE EXTENSION test_crc32c;
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
+ drive_crc32c
+--------------
+ 532139994
+ 2103623867
+ 785984197
+ 2686825890
+ 3213049059
+ 3819630168
+ 1389234603
+ 534072900
+ 2930108140
+ 2496889855
+ 1475239611
+ 136366931
+ 3067402116
+ 2012717871
+ 3682416023
+ 2054270645
+ 1817339875
+ 4100939569
+ 1192727539
+ 3636976218
+ 369764421
+ 3161609879
+ 1067984880
+ 1235066769
+ 3138425899
+ 648132037
+ 4203750233
+ 1330187888
+ 2683521348
+ 1951644495
+ 2574090107
+ 3904902018
+ 3772697795
+ 1644686344
+ 2868962106
+ 3369218491
+ 3902689890
+ 3456411865
+ 141004025
+ 1504497996
+ 3782655204
+ 3544797610
+ 3429174879
+ 2524728016
+ 3935861181
+ 25498897
+ 692684159
+ 345705535
+ 2761600287
+ 2654632420
+ 3945991399
+(51 rows)
+
diff --git a/contrib/test_crc32c/meson.build b/contrib/test_crc32c/meson.build
new file mode 100644
index 00000000000..d7bec4ba1cb
--- /dev/null
+++ b/contrib/test_crc32c/meson.build
@@ -0,0 +1,34 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+test_crc32c_sources = files(
+ 'test_crc32c.c',
+)
+
+if host_system == 'windows'
+ test_crc32c_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_crc32c',
+ '--FILEDESC', 'test_crc32c - test code for crc32c library',])
+endif
+
+test_crc32c = shared_module('test_crc32c',
+ test_crc32c_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += test_crc32c
+
+install_data(
+ 'test_crc32c.control',
+ 'test_crc32c--1.0.sql',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'test_crc32c',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'test_crc32c',
+ ],
+ },
+}
diff --git a/contrib/test_crc32c/sql/test_crc32c.sql b/contrib/test_crc32c/sql/test_crc32c.sql
new file mode 100644
index 00000000000..95c6dfe4488
--- /dev/null
+++ b/contrib/test_crc32c/sql/test_crc32c.sql
@@ -0,0 +1,3 @@
+CREATE EXTENSION test_crc32c;
+
+select drive_crc32c(1, i) from generate_series(100, 300, 4) i;
diff --git a/contrib/test_crc32c/test_crc32c--1.0.sql b/contrib/test_crc32c/test_crc32c--1.0.sql
new file mode 100644
index 00000000000..52b9772f908
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c--1.0.sql
@@ -0,0 +1 @@
+CREATE FUNCTION drive_crc32c (count int, num int) RETURNS bigint AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/test_crc32c/test_crc32c.c b/contrib/test_crc32c/test_crc32c.c
new file mode 100644
index 00000000000..28bc42de314
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.c
@@ -0,0 +1,47 @@
+/* select drive_crc32c(1000000, 1024); */
+
+#include "postgres.h"
+#include "fmgr.h"
+#include "port/pg_crc32c.h"
+#include "common/pg_prng.h"
+
+PG_MODULE_MAGIC;
+
+/*
+ * drive_crc32c(count: int, num: int) returns bigint
+ *
+ * count is the number of loops to perform
+ *
+ * num is the number of bytes in the buffer to calculate
+ * crc32c over.
+ */
+PG_FUNCTION_INFO_V1(drive_crc32c);
+Datum
+drive_crc32c(PG_FUNCTION_ARGS)
+{
+ int64 count = PG_GETARG_INT32(0);
+ int64 num = PG_GETARG_INT32(1);
+ char* data = malloc((size_t)num);
+ pg_crc32c crc;
+ pg_prng_state state;
+ uint64 seed = 42;
+ pg_prng_seed(&state, seed);
+ /* set random data */
+ for (uint64 i = 0; i < num; i++)
+ {
+ data[i] = pg_prng_uint32(&state) % 255;
+ }
+
+ INIT_CRC32C(crc);
+
+ while(count--)
+ {
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, data, num);
+ FIN_CRC32C(crc);
+ }
+
+ free((void *)data);
+
+ PG_RETURN_INT64((int64_t)crc);
+}
diff --git a/contrib/test_crc32c/test_crc32c.control b/contrib/test_crc32c/test_crc32c.control
new file mode 100644
index 00000000000..878a077ee18
--- /dev/null
+++ b/contrib/test_crc32c/test_crc32c.control
@@ -0,0 +1,4 @@
+comment = 'test'
+default_version = '1.0'
+module_pathname = '$libdir/test_crc32c'
+relocatable = true
--
2.48.1
I've gone ahead and added the generated AVX-512 algorithm in v14-0005
+ pg_attribute_target("avx512vl,vpclmulqdq")
+ pg_crc32c
+ pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
I'm a little confused. Why is v14 missing the earlier version of the pclmul implementation that works without AVX-512?
Raghuveer
On Wed, Mar 12, 2025 at 3:57 AM Devulapalli, Raghuveer
<raghuveer.devulapalli@intel.com> wrote:
I've gone ahead and added the generated AVX-512 algorithm in v14-0005
I'm a little confused. Why is v14 missing the earlier version of the pclmul implementation that works without AVX-512?
As I mentioned upthread, the 128-bit implementation regresses on Zen 2
up to at least 256 bytes.
--
John Naylor
Amazon Web Services
I've gone ahead and added the generated AVX-512 algorithm in
v14-0005
Here are my benchmark numbers (master vs. v14) on TGL (time in ms):
Buf-size-bytes   Master      v14
            64    9.547    6.095
            80   10.999    6.248
            96   12.443    7.756
           112   14.129    9.62
           128   15.295    6.534
           144   16.756    8.226
           160   18.18     9.656
           176   19.836   11.556
           192   22.038    8.307
           208   24.303    9.692
           224   26.656   11.115
           240   28.981   13.119
           256   31.399   10.035
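To make the table easier to read at a glance, here is a minimal sketch that turns the two timing columns into speedup factors (master time divided by v14 time). The timings are copied from the table; the function and array names are made up for this illustration and are not part of any patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Timings in ms, copied from the table above (master vs. v14 on TGL). */
static const int buf_sizes[] = {64, 80, 96, 112, 128, 144, 160,
								176, 192, 208, 224, 240, 256};
static const double master_ms[] = {9.547, 10.999, 12.443, 14.129, 15.295,
								   16.756, 18.18, 19.836, 22.038, 24.303,
								   26.656, 28.981, 31.399};
static const double v14_ms[] = {6.095, 6.248, 7.756, 9.62, 6.534,
								8.226, 9.656, 11.556, 8.307, 9.692,
								11.115, 13.119, 10.035};

#define NUM_ROWS (sizeof(buf_sizes) / sizeof(buf_sizes[0]))

/* Speedup factor for row i: how many times faster v14 is than master. */
double
speedup(size_t i)
{
	assert(i < NUM_ROWS);
	return master_ms[i] / v14_ms[i];
}

/* Print one line per buffer size (hypothetical helper). */
void
print_speedups(void)
{
	for (size_t i = 0; i < NUM_ROWS; i++)
		printf("%4d bytes: %.2fx\n", buf_sizes[i], speedup(i));
}
```

Calling print_speedups() from a main() shows that the ratio is not monotonic in buffer size: it ranges from roughly 1.5x to roughly 3.1x, and the largest gains are at sizes that are multiples of 64 bytes.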
Raghuveer
On Thu, Mar 13, 2025 at 11:38 PM Devulapalli, Raghuveer
<raghuveer.devulapalli@intel.com> wrote:
I've gone ahead and added the generated AVX-512 algorithm in
v14-0005
Here are my benchmark numbers (master vs. v14) on TGL (time in ms):
Buf-size-bytes Master v14
64 9.547 6.095
...
256 31.399 10.035
Thanks for testing! Looks good. I'll take a look at the configure
checks soon, since I had some questions there.
--
John Naylor
Amazon Web Services
On Mon, Mar 24, 2025 at 6:37 PM John Naylor <johncnaylorls@gmail.com> wrote:
I'll take a look at the configure
checks soon, since I had some questions there.
I'm leaning towards a length limit for v15-0001 so that inlined
instructions are likely to be unrolled. Aside from lack of commit
message, I think that one is ready for commit soon-ish.
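For readers who have not seen the technique, here is a minimal, portable sketch of the idea behind the length limit: when the length is a compile-time constant below some threshold, compute the CRC inline so the compiler can unroll the loop; otherwise call the out-of-line function. This is an illustration only, not the code in v15-0001. It uses a bitwise software CRC-32C step in place of the SSE 4.2 intrinsics so that it runs anywhere, and the names crc32c_byte, crc32c_out_of_line, crc32c_dispatch, crc32c_of and INLINE_LEN_LIMIT are all hypothetical. __builtin_constant_p is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold for this sketch. */
#define INLINE_LEN_LIMIT 32

/* Portable stand-in for the hardware CRC step: one byte, bit by bit. */
static inline uint32_t
crc32c_byte(uint32_t crc, unsigned char b)
{
	crc ^= b;
	for (int i = 0; i < 8; i++)
		crc = (crc & 1) ? (crc >> 1) ^ 0x82F63B78u : crc >> 1;
	return crc;
}

/* Out-of-line path, standing in for the real non-inlined function. */
uint32_t
crc32c_out_of_line(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	assert(data != NULL || len == 0);
	while (len-- > 0)
		crc = crc32c_byte(crc, *p++);
	return crc;
}

/* Inline for small constant lengths, otherwise make the function call. */
static inline uint32_t
crc32c_dispatch(uint32_t crc, const void *data, size_t len)
{
#if defined(__GNUC__) || defined(__clang__)
	if (__builtin_constant_p(len) && len < INLINE_LEN_LIMIT)
	{
		const unsigned char *p = data;

		for (; len > 0; --len)
			crc = crc32c_byte(crc, *p++);
		return crc;
	}
#endif
	return crc32c_out_of_line(crc, data, len);
}

/* Full CRC-32C: initial value and final xor are both 0xFFFFFFFF. */
static inline uint32_t
crc32c_of(const void *data, size_t len)
{
	return crc32c_dispatch(0xFFFFFFFFu, data, len) ^ 0xFFFFFFFFu;
}
```

Both branches compute the same value, so the limit is purely a performance knob: below it the loop bound is known and can be fully unrolled, above it the call overhead is amortized over more bytes.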
I'm feeling pretty good about 0002, but since there is still room for
cosmetic fiddling, I want to let it sit for a bit longer.
I felt the previous proposals for configure.ac were unnecessarily
invasive, and the message looked out of place, so I made configure.ac
more similar to master, using the AVX popcount stuff as a model. I
also went the extra step and added a separate AC_MSG_CHECKING for
vectorized CRC. I'm not sure we really need that, but this algorithm
is trivially adaptable to Arm so it might be welcome for visibility.
For Meson, I just made the CRC checking comment a bit more general,
since keeping up this level of detail would result in a loss of
readability.
0003 is just to demonstrate on CI that we are in fact computing the
same answer as master. An earlier patch had some additional tests in
strings.sql but I have yet to dig those out.
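The kind of cross-check that 0003 performs can be sketched outside the tree as follows. This is a rough standalone illustration, not the debug patch itself: it compares a table-driven CRC-32C (standing in for the optimized path) against a bit-at-a-time reference over every length up to a bound. All function names here are hypothetical, and the test data comes from a throwaway LCG rather than pg_prng.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC32C_POLY_REFLECTED 0x82F63B78u

/* Reference: bit-at-a-time CRC-32C update (no init or final xor). */
uint32_t
crc32c_reference(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len-- > 0)
	{
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc & 1) ? (crc >> 1) ^ CRC32C_POLY_REFLECTED : crc >> 1;
	}
	return crc;
}

/* Stand-in for the optimized path: byte-at-a-time table lookup. */
uint32_t
crc32c_table(uint32_t crc, const unsigned char *p, size_t len)
{
	static uint32_t table[256];
	static int	table_ready = 0;

	if (!table_ready)
	{
		/* Entry i is the CRC state after feeding byte i into state 0. */
		for (uint32_t i = 0; i < 256; i++)
		{
			unsigned char b = (unsigned char) i;

			table[i] = crc32c_reference(0, &b, 1);
		}
		table_ready = 1;
	}
	while (len-- > 0)
		crc = table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
	return crc;
}

/* Count the lengths in [0, max_len] where the two implementations differ. */
size_t
crc32c_count_mismatches(size_t max_len)
{
	unsigned char buf[512];
	uint32_t	lcg = 42;
	size_t		mismatches = 0;

	assert(max_len <= sizeof(buf));
	for (size_t i = 0; i < sizeof(buf); i++)
	{
		lcg = lcg * 1664525u + 1013904223u; /* simple LCG, test data only */
		buf[i] = (unsigned char) (lcg >> 24);
	}
	for (size_t len = 0; len <= max_len; len++)
	{
		if (crc32c_table(0xFFFFFFFFu, buf, len) !=
			crc32c_reference(0xFFFFFFFFu, buf, len))
			mismatches++;
	}
	return mismatches;
}
```

Sweeping every length, rather than a few round sizes, matters for a vectorized implementation because the bugs tend to hide in the transitions between the 64-byte main loop and the scalar tail.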
--
John Naylor
Amazon Web Services
Attachments:
v15-0001-Inline-CRC-computation-for-fixed-length-input.patch (text/x-patch; charset=US-ASCII)
From cebf6a4b6ecec7fdc30678828ad9883149d9378b Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Fri, 28 Feb 2025 16:27:30 +0700
Subject: [PATCH v15 1/3] Inline CRC computation for fixed-length input
Use a simplified copy of the loop in pg_crc32c_sse42.c to avoid
moving code to a separate header.
---
src/include/port/pg_crc32c.h | 32 +++++++++++++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 65ebeacf4b1..0ab7513f523 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -43,12 +43,42 @@ typedef uint32 pg_crc32c;
#if defined(USE_SSE42_CRC32C)
/* Use Intel SSE4.2 instructions. */
+
+#include <nmmintrin.h>
+
#define COMP_CRC32C(crc, data, len) \
- ((crc) = pg_comp_crc32c_sse42((crc), (data), (len)))
+ ((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+pg_attribute_no_sanitize_alignment()
+static inline
+pg_crc32c
+pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
+{
+ if (__builtin_constant_p(len) && len < 32)
+ {
+ const unsigned char *p = data;
+
+ /*
+ * For small constant inputs, inline the computation to avoid a
+ * function call and allow the compiler to unroll loops.
+ */
+#if SIZEOF_VOID_P >= 8
+ for (; len >= 8; p += 8, len -= 8)
+ crc = _mm_crc32_u64(crc, *(const uint64 *) p);
+#endif
+ for (; len >= 4; p += 4, len -= 4)
+ crc = _mm_crc32_u32(crc, *(const uint32 *) p);
+ for (; len > 0; --len)
+ crc = _mm_crc32_u8(crc, *p++);
+ return crc;
+ }
+ else
+ return pg_comp_crc32c_sse42(crc, data, len);
+}
+
#elif defined(USE_ARMV8_CRC32C)
/* Use ARMv8 CRC Extension instructions. */
--
2.48.1
v15-0003-Add-debug-for-CI-XXX-not-for-commit.patch (text/x-patch; charset=US-ASCII)
From c519bd870fbb719fe147984b1a2aeff81316172c Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Tue, 25 Mar 2025 15:19:16 +0700
Subject: [PATCH v15 3/3] Add debug for CI XXX not for commit
---
src/port/pg_crc32c_sse42.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index f392eb5b236..3b9c448486f 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -19,6 +19,8 @@
#include "port/pg_crc32c.h"
+#define DEBUG_CRC /* XXX not for commit, or at least comment out */
+
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
pg_crc32c
@@ -87,6 +89,9 @@ pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
pg_crc32c crc0 = crc;
size_t len = length;
const char *buf = data;
+#ifdef DEBUG_CRC
+ const size_t orig_len PG_USED_FOR_ASSERTS_ONLY = len;
+#endif
/* Align on cacheline boundary. WIP: The threshold needs testing. */
if (unlikely(len > 256))
@@ -139,7 +144,13 @@ pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
len = end - buf;
}
- return pg_comp_crc32c_sse42(crc0, buf, len);
+ crc0 = pg_comp_crc32c_sse42(crc0, buf, len);
+
+#ifdef DEBUG_CRC
+ Assert(crc0 == pg_comp_crc32c_sse42(crc, data, orig_len));
+#endif
+
+ return crc0;
}
#endif
--
2.48.1
v15-0002-Improve-CRC32C-performance-on-x86_64.patch (text/x-patch; charset=US-ASCII)
From 1fc0fd60062446307bda6c60619344d8588c8125 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Tue, 25 Mar 2025 19:22:32 +0700
Subject: [PATCH v15 2/3] Improve CRC32C performance on x86_64
The current SSE4.2 implementation of CRC32C relies on the native
CRC32 instruction, which operates on 8 bytes at a time. We can get a
substantial speedup on longer inputs by using carryless multiplication
on SIMD registers, processing 64 bytes per loop iteration.
The VPCLMULQDQ instruction on 512-bit registers has been available
on Intel hardware since 2019 and AMD since 2022. There is an older
variant for 128-bit registers, but at least on Zen 2 it performs
worse than normal CRC instructions for short inputs. (Thanks to
David Rowley for testing on that platform.)
We must now do a runtime check, even for builds that target SSE
4.2. This doesn't matter in practice for WAL (arguably the most
critical case), because with commit XXXYYYZZZ the final computation
with the 20-byte WAL header is inlined. Compared with two function
calls, testing showed equal or slightly faster performance in
performing an indirect function call on several dozen bytes followed
by inlined instructions on constant input of 20 bytes.
The MIT-licensed implementation was generated with the "generate"
program from
https://github.com/corsix/fast-crc32/
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009
Author: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Author: Paul Amonson <paul.d.amonson@intel.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Reviewed-by: Matthew Sterrett <matthewsterrett2@gmail.com> (earlier version)
Tested-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Discussion: https://postgr.es/m/BL1PR11MB530401FA7E9B1CA432CF9DC3DC192@BL1PR11MB5304.namprd11.prod.outlook.com
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
---
config/c-compiler.m4 | 37 +++++++++++++++
configure | 72 +++++++++++++++++++++++++++--
configure.ac | 20 +++++++--
meson.build | 55 ++++++++++++++++++-----
src/include/pg_config.h.in | 3 ++
src/include/port/pg_crc32c.h | 39 +++++++++++-----
src/port/meson.build | 1 +
src/port/pg_crc32c_sse42.c | 75 +++++++++++++++++++++++++++++++
src/port/pg_crc32c_sse42_choose.c | 75 ++++++++++++++++++++++++-------
9 files changed, 333 insertions(+), 44 deletions(-)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 3712e81e38c..52b65406f88 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -581,6 +581,43 @@ fi
undefine([Ac_cachevar])dnl
])# PGAC_SSE42_CRC32_INTRINSICS
+# PGAC_AVX512_CRC32_INTRINSICS
+# ---------------------------
+# Check if the compiler supports the carryless multiplication
+# and AVX-512VL instructions used for computing CRC32C with
+# 512-bit registers. AVX-512F is assumed to be supported.
+#
+# If the intrinsics are supported, sets pgac_avx512_crc32_intrinsics.
+AC_DEFUN([PGAC_AVX512_CRC32_INTRINSICS],
+[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx512_crc32_intrinsics])])dnl
+AC_CACHE_CHECK([for _mm512_clmulepi64_epi128], [Ac_cachevar],
+[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>
+ __m512i x;
+ __m512i y;
+
+ #if defined(__has_attribute) && __has_attribute (target)
+ __attribute__((target("vpclmulqdq,avx512vl")))
+ #endif
+ static int crc32_avx512_test(void)
+ {
+ __m128i z;
+
+ y = _mm512_clmulepi64_epi128(x, y, 0);
+ z = _mm_ternarylogic_epi64(
+ _mm512_castsi512_si128(y),
+ _mm512_extracti32x4_epi32(y, 1),
+ _mm512_extracti32x4_epi32(y, 2),
+ 0x96);
+ return _mm_crc32_u64(0, _mm_extract_epi64(z, 0));
+ }],
+ [return crc32_avx512_test();])],
+ [Ac_cachevar=yes],
+ [Ac_cachevar=no])])
+if test x"$Ac_cachevar" = x"yes"; then
+ pgac_avx512_crc32_intrinsics=yes
+fi
+undefine([Ac_cachevar])dnl
+])# PGAC_AVX512_CRC32_INTRINSICS
# PGAC_ARMV8_CRC32C_INTRINSICS
# ----------------------------
diff --git a/configure b/configure
index fac1e9a4e39..abbb151549c 100755
--- a/configure
+++ b/configure
@@ -17378,6 +17378,59 @@ $as_echo "#define USE_AVX512_POPCNT_WITH_RUNTIME_CHECK 1" >>confdefs.h
fi
fi
+# Check for intrinsics to do vectorized CRC calculations.
+#
+if test x"$host_cpu" = x"x86_64"; then
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for _mm512_clmulepi64_epi128" >&5
+$as_echo_n "checking for _mm512_clmulepi64_epi128... " >&6; }
+if ${pgac_cv_avx512_crc32_intrinsics+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+#include <immintrin.h>
+ __m512i x;
+ __m512i y;
+
+ #if defined(__has_attribute) && __has_attribute (target)
+ __attribute__((target("vpclmulqdq,avx512vl")))
+ #endif
+ static int crc32_avx512_test(void)
+ {
+ __m128i z;
+
+ y = _mm512_clmulepi64_epi128(x, y, 0);
+ z = _mm_ternarylogic_epi64(
+ _mm512_castsi512_si128(y),
+ _mm512_extracti32x4_epi32(y, 1),
+ _mm512_extracti32x4_epi32(y, 2),
+ 0x96);
+ return _mm_crc32_u64(0, _mm_extract_epi64(z, 0));
+ }
+int
+main ()
+{
+return crc32_avx512_test();
+ ;
+ return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+ pgac_cv_avx512_crc32_intrinsics=yes
+else
+ pgac_cv_avx512_crc32_intrinsics=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+ conftest$ac_exeext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_avx512_crc32_intrinsics" >&5
+$as_echo "$pgac_cv_avx512_crc32_intrinsics" >&6; }
+if test x"$pgac_cv_avx512_crc32_intrinsics" = x"yes"; then
+ pgac_avx512_crc32_intrinsics=yes
+fi
+
+fi
+
# Check for Intel SSE 4.2 intrinsics to do CRC calculations.
#
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for _mm_crc32_u8 and _mm_crc32_u32" >&5
@@ -17622,9 +17675,8 @@ fi
# If we are targeting a processor that has Intel SSE 4.2 instructions, we can
# use the special CRC instructions for calculating CRC-32C. If we're not
# targeting such a processor, but we can nevertheless produce code that uses
-# the SSE intrinsics, compile both implementations and select which one to use
-# at runtime, depending on whether SSE 4.2 is supported by the processor we're
-# running on.
+# the SSE/AVX-512 intrinsics, compile both implementations and select which one
+# to use at runtime, depending on runtime cpuid information.
#
# Similarly, if we are targeting an ARM processor that has the CRC
# instructions that are part of the ARMv8 CRC Extension, use them. And if
@@ -17680,7 +17732,7 @@ if test x"$USE_SSE42_CRC32C" = x"1"; then
$as_echo "#define USE_SSE42_CRC32C 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2" >&5
$as_echo "SSE 4.2" >&6; }
else
@@ -17729,6 +17781,18 @@ $as_echo "slicing-by-8" >&6; }
fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking which vectorized CRC-32C implementation to use" >&5
+$as_echo_n "checking which vectorized CRC-32C implementation to use... " >&6; }
+if test x"$pgac_avx512_crc32_intrinsics" = x"yes"; then
+
+$as_echo "#define USE_AVX512_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
+
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: AVX-512 with runtime check" >&5
+$as_echo "AVX-512 with runtime check" >&6; }
+else
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: none" >&5
+$as_echo "none" >&6; }
+fi
# Select semaphore implementation type.
if test "$PORTNAME" != "win32"; then
diff --git a/configure.ac b/configure.ac
index b6d02f5ecc7..a037b8cd566 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2057,6 +2057,12 @@ if test x"$host_cpu" = x"x86_64"; then
fi
fi
+# Check for intrinsics to do vectorized CRC calculations.
+#
+if test x"$host_cpu" = x"x86_64"; then
+ PGAC_AVX512_CRC32_INTRINSICS()
+fi
+
# Check for Intel SSE 4.2 intrinsics to do CRC calculations.
#
PGAC_SSE42_CRC32_INTRINSICS()
@@ -2096,9 +2102,8 @@ AC_SUBST(CFLAGS_CRC)
# If we are targeting a processor that has Intel SSE 4.2 instructions, we can
# use the special CRC instructions for calculating CRC-32C. If we're not
# targeting such a processor, but we can nevertheless produce code that uses
-# the SSE intrinsics, compile both implementations and select which one to use
-# at runtime, depending on whether SSE 4.2 is supported by the processor we're
-# running on.
+# the SSE/AVX-512 intrinsics, compile both implementations and select which one
+# to use at runtime, depending on runtime cpuid information.
#
# Similarly, if we are targeting an ARM processor that has the CRC
# instructions that are part of the ARMv8 CRC Extension, use them. And if
@@ -2151,7 +2156,7 @@ fi
AC_MSG_CHECKING([which CRC-32C implementation to use])
if test x"$USE_SSE42_CRC32C" = x"1"; then
AC_DEFINE(USE_SSE42_CRC32C, 1, [Define to 1 use Intel SSE 4.2 CRC instructions.])
- PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
AC_MSG_RESULT(SSE 4.2)
else
if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
@@ -2184,6 +2189,13 @@ else
fi
AC_SUBST(PG_CRC32C_OBJS)
+AC_MSG_CHECKING([which vectorized CRC-32C implementation to use])
+if test x"$pgac_avx512_crc32_intrinsics" = x"yes"; then
+ AC_DEFINE(USE_AVX512_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use AVX-512 CRC instructions with a runtime check.])
+ AC_MSG_RESULT(AVX-512 with runtime check)
+else
+ AC_MSG_RESULT(none)
+fi
# Select semaphore implementation type.
if test "$PORTNAME" != "win32"; then
diff --git a/meson.build b/meson.build
index 7cf518a2765..38736bcf853 100644
--- a/meson.build
+++ b/meson.build
@@ -2288,17 +2288,22 @@ endif
###############################################################
# Select CRC-32C implementation.
#
-# If we are targeting a processor that has Intel SSE 4.2 instructions, we can
-# use the special CRC instructions for calculating CRC-32C. If we're not
-# targeting such a processor, but we can nevertheless produce code that uses
-# the SSE intrinsics, compile both implementations and select which one to use
-# at runtime, depending on whether SSE 4.2 is supported by the processor we're
-# running on.
+# There are three methods of calculating CRC, in order of increasing
+# performance:
#
-# Similarly, if we are targeting an ARM processor that has the CRC
-# instructions that are part of the ARMv8 CRC Extension, use them. And if
-# we're not targeting such a processor, but can nevertheless produce code that
-# uses the CRC instructions, compile both, and select at runtime.
+# 1. The fallback using a lookup table, called slicing-by-8
+# 2. CRC-32C instructions on word-sized registers (e.g. Intel SSE 4.2
+# and ARMv8 CRC Extension)
+# 3. Algorithms that have at their core carryless multiplication
+# instructions (called PCLMUL or PMULL) on vector registers.
+#
+# If we can produce code (via function attributes or additional compiler
+# flags) that uses #2 (and possibly #3), we compile all implementations
+# and select which one to use at runtime, depending on what is supported
+# by the processor we're running on.
+#
+# If we are targeting a processor that has #2, we can use that without
+# runtime selection.
#
# Note that we do not use __attribute__((target("..."))) for the ARM CRC
# instructions because until clang 16, using the ARM intrinsics still requires
@@ -2347,6 +2352,36 @@ int main(void)
have_optimized_crc = true
endif
+ # Test for PCLMUL intrinsics on 512-bit registers
+ prog = '''
+#include <immintrin.h>
+__m512i x;
+__m512i y;
+
+#if defined(__has_attribute) && __has_attribute (target)
+__attribute__((target("avx512vl,vpclmulqdq")))
+#endif
+int main(void)
+{
+ __m128i z;
+
+ y = _mm512_clmulepi64_epi128(x, y, 0);
+ z = _mm_ternarylogic_epi64(
+ _mm512_castsi512_si128(y),
+ _mm512_extracti32x4_epi32(y, 1),
+ _mm512_extracti32x4_epi32(y, 2),
+ 0x96);
+ /* return computed value, to prevent the above being optimized away */
+ return _mm_crc32_u64(0, _mm_extract_epi64(z, 0));
+}
+'''
+
+ if cc.links(prog,
+ name: 'AVX-512 CRC32C',
+ args: test_c_args)
+ cdata.set('USE_AVX512_CRC32C_WITH_RUNTIME_CHECK', 1)
+ endif
+
endif
elif host_cpu == 'arm' or host_cpu == 'aarch64'
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index db6454090d2..5bd5a927742 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -651,6 +651,9 @@
/* Define to 1 to build with assertion checks. (--enable-cassert) */
#undef USE_ASSERT_CHECKING
+/* Define to 1 to use AVX-512 CRC instructions with a runtime check. */
+#undef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+
/* Define to 1 to use AVX-512 popcount instructions with a runtime check. */
#undef USE_AVX512_POPCNT_WITH_RUNTIME_CHECK
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 0ab7513f523..17dec5f6007 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -42,7 +42,10 @@ typedef uint32 pg_crc32c;
#define EQ_CRC32C(c1, c2) ((c1) == (c2))
#if defined(USE_SSE42_CRC32C)
-/* Use Intel SSE4.2 instructions. */
+/*
+ * Use either Intel SSE 4.2 or PCLMUL instructions. We don't need a runtime check
+ * for SSE 4.2, so we can inline those in some cases.
+ */
#include <nmmintrin.h>
@@ -50,7 +53,11 @@ typedef uint32 pg_crc32c;
((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+extern pg_crc32c pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t len);
+#endif
pg_attribute_no_sanitize_alignment()
static inline
@@ -76,9 +83,27 @@ pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
return crc;
}
else
- return pg_comp_crc32c_sse42(crc, data, len);
+ /* Otherwise, use a runtime check for PCLMUL instructions. */
+ return pg_comp_crc32c(crc, data, len);
}
+#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
+
+/*
+ * Use Intel SSE 4.2 or PCLMUL instructions, but perform a runtime check first
+ * to check that they are available.
+ */
+#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c((crc), (data), (len)))
+#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+extern PGDLLIMPORT pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
+extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+extern pg_crc32c pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t len);
+#endif
+
#elif defined(USE_ARMV8_CRC32C)
/* Use ARMv8 CRC Extension instructions. */
@@ -97,10 +122,10 @@ extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t le
extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
+#elif defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
/*
- * Use Intel SSE 4.2 or ARMv8 instructions, but perform a runtime check first
+ * Use ARMv8 instructions, but perform a runtime check first
* to check that they are available.
*/
#define COMP_CRC32C(crc, data, len) \
@@ -109,13 +134,7 @@ extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
-
-#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#endif
-#ifdef USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK
extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
-#endif
#else
/*
diff --git a/src/port/meson.build b/src/port/meson.build
index 7fcfa728d43..8d70a4d510e 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -83,6 +83,7 @@ replace_funcs_pos = [
# x86/x64
['pg_crc32c_sse42', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
+ ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df31..f392eb5b236 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -15,6 +15,7 @@
#include "c.h"
#include <nmmintrin.h>
+#include <immintrin.h>
#include "port/pg_crc32c.h"
@@ -68,3 +69,77 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
return crc;
}
+
+#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+
+/* Generated by https://github.com/corsix/fast-crc32/ using: */
+/* ./generate -i avx512_vpclmulqdq -p crc32c -a v1e */
+/* MIT licensed */
+
+#define clmul_lo(a, b) (_mm512_clmulepi64_epi128((a), (b), 0))
+#define clmul_hi(a, b) (_mm512_clmulepi64_epi128((a), (b), 17))
+
+pg_attribute_target("avx512vl,vpclmulqdq")
+pg_crc32c
+pg_comp_crc32c_pclmul(pg_crc32c crc, const void *data, size_t length)
+{
+ /* adjust names to match generated code */
+ pg_crc32c crc0 = crc;
+ size_t len = length;
+ const char *buf = data;
+
+ /* Align on cacheline boundary. WIP: The threshold needs testing. */
+ if (unlikely(len > 256))
+ {
+ for (; len && ((uintptr_t) buf & 7); --len)
+ {
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ }
+ while (((uintptr_t) buf & 56) && len >= 8)
+ {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t *) buf);
+ buf += 8;
+ len -= 8;
+ }
+ }
+
+ if (len >= 64)
+ {
+ const char *end = buf + len;
+ const char *limit = buf + len - 64;
+ __m128i z0;
+
+ /* First vector chunk. */
+ __m512i x0 = _mm512_loadu_si512((const void *) buf),
+ y0;
+ __m512i k;
+
+ k = _mm512_broadcast_i32x4(_mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0));
+ x0 = _mm512_xor_si512(_mm512_castsi128_si512(_mm_cvtsi32_si128(crc0)), x0);
+ buf += 64;
+
+ /* Main loop. */
+ while (buf <= limit)
+ {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ x0 = _mm512_ternarylogic_epi64(x0, y0, _mm512_loadu_si512((const void *) buf), 0x96);
+ buf += 64;
+ }
+
+ /* Reduce 512 bits to 128 bits. */
+ k = _mm512_setr_epi32(0x1c291d04, 0, 0xddc0152b, 0, 0x3da6d0cb, 0, 0xba4fc28e, 0, 0xf20c0dfe, 0, 0x493c7d27, 0, 0, 0, 0, 0);
+ y0 = clmul_lo(x0, k), k = clmul_hi(x0, k);
+ y0 = _mm512_xor_si512(y0, k);
+ z0 = _mm_ternarylogic_epi64(_mm512_castsi512_si128(y0), _mm512_extracti32x4_epi32(y0, 1), _mm512_extracti32x4_epi32(y0, 2), 0x96);
+ z0 = _mm_xor_si128(z0, _mm512_extracti32x4_epi32(x0, 3));
+
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(z0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(z0, 1));
+ len = end - buf;
+ }
+
+ return pg_comp_crc32c_sse42(crc0, buf, len);
+}
+
+#endif
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 65dbc4d4249..c2d25253c2c 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -20,30 +20,35 @@
#include "c.h"
-#ifdef HAVE__GET_CPUID
+#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT)
#include <cpuid.h>
#endif
-#ifdef HAVE__CPUID
+#include <immintrin.h>
+
+#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX)
#include <intrin.h>
#endif
#include "port/pg_crc32c.h"
-static bool
-pg_crc32c_sse42_available(void)
+/*
+ * Does XGETBV say the ZMM registers are enabled?
+ *
+ * NB: Caller is responsible for verifying that osxsave is available
+ * before calling this.
+ */
+#ifdef HAVE_XSAVE_INTRINSICS
+pg_attribute_target("xsave")
+#endif
+static inline bool
+zmm_regs_available(void)
{
- unsigned int exx[4] = {0, 0, 0, 0};
-
-#if defined(HAVE__GET_CPUID)
- __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
-#elif defined(HAVE__CPUID)
- __cpuid(exx, 1);
+#ifdef HAVE_XSAVE_INTRINSICS
+ return (_xgetbv(0) & 0xe6) == 0xe6;
#else
-#error cpuid instruction not available
+ return false;
#endif
-
- return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
/*
@@ -53,10 +58,48 @@ pg_crc32c_sse42_available(void)
static pg_crc32c
pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
{
- if (pg_crc32c_sse42_available())
+ unsigned int exx[4] = {0, 0, 0, 0};
+
+#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
+
+ /*
+ * Set fallback. We must guard since slicing-by-8 is not visible
+ * everywhere.
+ */
+ pg_comp_crc32c = pg_comp_crc32c_sb8;
+#endif
+
+#if defined(HAVE__GET_CPUID)
+ __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+ __cpuid(exx, 1);
+#else
+#error cpuid instruction not available
+#endif
+
+ if ((exx[2] & (1 << 20)) != 0) /* SSE 4.2 */
+ {
pg_comp_crc32c = pg_comp_crc32c_sse42;
- else
- pg_comp_crc32c = pg_comp_crc32c_sb8;
+
+ if (exx[2] & (1 << 27) && /* OSXSAVE */
+ zmm_regs_available())
+ {
+#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+ /* second cpuid call on leaf 7 to check extended avx512 support */
+
+ memset(exx, 0, 4 * sizeof(exx[0]));
+
+#if defined(HAVE__GET_CPUID_COUNT)
+ __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUIDEX)
+ __cpuidex(exx, 7, 0);
+#endif
+ if (exx[2] & (1 << 10) && /* VPCLMULQDQ */
+ exx[1] & (1 << 31)) /* AVX512-VL */
+ pg_comp_crc32c = pg_comp_crc32c_pclmul;
+#endif
+ }
+ }
return pg_comp_crc32c(crc, data, len);
}
--
2.48.1
On Mon, Mar 24, 2025 at 6:37 PM John Naylor <johncnaylorls@gmail.com> wrote:
I'll take a look at the configure
checks soon, since I had some questions there.
One other thing I forgot to mention: The previous test function had
local constants that the compiler was able to fold, resulting in no
actual vector instructions being emitted:
movabs rdx, 12884901891
xor eax, eax
crc32 rax, rdx
crc32 rax, rdx
ret
That may be okay for practical purposes, but in the spirit of commit
fdb5dd6331e30 I changed it in v15 to use global variables and made
sure it emits what the function attributes are intended for:
vmovdqu64 zmm3, ZMMWORD PTR x[rip]
xor eax, eax
vpclmulqdq zmm0, zmm3, ZMMWORD PTR y[rip], 0
vextracti32x4 xmm2, zmm0, 1
vmovdqa64 xmm1, xmm0
vmovdqu64 ZMMWORD PTR y[rip], zmm0
vextracti32x4 xmm0, zmm0, 2
vpternlogq xmm1, xmm2, xmm0, 150
vmovq rdx, xmm1
crc32 rax, rdx
vzeroupper
ret
--
John Naylor
Amazon Web Services
Hello John,
v15 LGTM. Couple of minor comments:
I'm leaning towards a length limit for v15-0001 so that inlined instructions are
likely to be unrolled. Aside from lack of commit message, I think that one is ready
for commit soon-ish.
+1
I'm feeling pretty good about 0002, but since there is still room for cosmetic
fiddling, I want to let it sit for a bit longer.
(1) zmm_regs_available() in pg_crc32c_sse42_choose.c is a duplicate of pg_popcount_avx512.c but perhaps that’s fine for now. I will propose a consolidated SIMD runtime check in a follow up patch.
(2) Might be apt to rename pg_crc32c_sse42*.c to pg_crc32c_x86*.c since they contain both sse42 and avx512 versions.
I felt the previous proposals for configure.ac were unnecessarily invasive, and the
message looked out of place, so I made configure.ac more similar to master, using
the AVX popcount stuff as a model.
Looks good to me.
Raghuveer
On Thu, Mar 27, 2025 at 2:55 AM Devulapalli, Raghuveer
<raghuveer.devulapalli@intel.com> wrote:
Hello John,
v15 LGTM. Couple of minor comments:
I'm leaning towards a length limit for v15-0001 so that inlined instructions are
likely to be unrolled. Aside from lack of commit message, I think that one is ready
for commit soon-ish.
+1
Thanks for looking! This has been committed.
I'm feeling pretty good about 0002, but since there is still room for cosmetic
fiddling, I want to let it sit for a bit longer.
(1) zmm_regs_available() in pg_crc32c_sse42_choose.c is a duplicate of pg_popcount_avx512.c but perhaps that’s fine for now. I will propose a consolidated SIMD runtime check in a follow up patch.
Yeah, I was thinking a small amount of duplication is tolerable.
As for a possibly better abstraction for next cycle, I went looking
and found this, which has some features we've talked about:
https://zolk3ri.name/cgit/cpudetect/tree/cpudetect.h
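For readers following along, the `0xe6` literal that zmm_regs_available() compares against XCR0 can be decoded as below. This is only an illustrative sketch: the macro names and `zmm_state_enabled()` are made up here and do not appear in the patch, the bit meanings follow my reading of the Intel SDM, and the function takes the XCR0 value as a parameter instead of executing XGETBV so that it compiles on any platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* XCR0 state-component bits (illustrative names, not from the patch) */
#define XCR0_SSE        (UINT64_C(1) << 1)  /* XMM registers */
#define XCR0_AVX        (UINT64_C(1) << 2)  /* upper halves of YMM0-15 */
#define XCR0_OPMASK     (UINT64_C(1) << 5)  /* k0-k7 mask registers */
#define XCR0_ZMM_HI256  (UINT64_C(1) << 6)  /* upper halves of ZMM0-15 */
#define XCR0_HI16_ZMM   (UINT64_C(1) << 7)  /* ZMM16-31 */

#define XCR0_ZMM_MASK \
    (XCR0_SSE | XCR0_AVX | XCR0_OPMASK | XCR0_ZMM_HI256 | XCR0_HI16_ZMM)

/* The named bits must add up to the literal used in the patch. */
static_assert(XCR0_ZMM_MASK == 0xe6, "mask does not match 0xe6");

/* True only if the OS has enabled every state component ZMM code needs. */
static bool
zmm_state_enabled(uint64_t xcr0)
{
    return (xcr0 & XCR0_ZMM_MASK) == XCR0_ZMM_MASK;
}
```

A shared runtime-check module could expose something of this shape once and feed it the result of `_xgetbv(0)`, which would remove the duplication between the popcount and CRC files.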
(2) Might be apt to rename pg_crc32c_sse42*.c to pg_crc32c_x86*.c since they contain both sse42 and avx512 versions.
The name is now not quite accurate, but it's not exactly misleading
either. I'm leaning towards keeping it the same, so for now I've just
updated the header comment.
For v16, I made another pass through and made some more mostly superficial
adjustments:
- copied rewritten Meson comment to configure.ac
- added some more #include guards out of paranoia
- added tests with longer lengths that exercise the new paths
- adjusted configure / meson messages for consistency
- changed not-quite-accurate wording about "AVX-512 CRC instructions"
- used "PCLMUL" only when talking about specific intrinsics and prefer
"AVX-512" elsewhere, to head off potential future confusion with Arm
PMULL.
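To see which code paths a given test length can reach, the control flow of pg_comp_crc32c_avx512() can be mirrored without any intrinsics. The sketch below is hypothetical (`crc_plan` and `crc_plan_for()` are names invented for illustration) and assumes the thresholds in the v16 patch: the alignment stanza runs only for len > 256, and the vector code handles whole 64-byte chunks once at least 64 bytes remain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    size_t      head;       /* bytes consumed by the alignment stanza */
    size_t      blocks;     /* 64-byte chunks handled by the vector code */
    size_t      tail;       /* bytes left for the plain CRC instructions */
} crc_plan;

/* Hypothetical helper: splits an input the way the v16 function would. */
static crc_plan
crc_plan_for(uintptr_t addr, size_t len)
{
    crc_plan    p = {0, 0, 0};
    size_t      total = len;

    if (len > 256)
    {
        /* single-byte steps up to an 8-byte boundary */
        for (; len && (addr & 7); --len, ++addr)
            p.head++;
        /* 8-byte steps up to a 64-byte boundary */
        while ((addr & 56) && len >= 8)
        {
            addr += 8;
            len -= 8;
            p.head += 8;
        }
    }

    if (len >= 64)
    {
        /* first chunk plus every loop iteration: floor(len / 64) chunks */
        p.blocks = len / 64;
        len %= 64;
    }

    p.tail = len;
    assert(p.head + p.blocks * 64 + p.tail == total);
    return p;
}
```

Under these assumptions, lengths 127, 128 and 129 give one or two vector chunks with tails of 63, 0 and 1 bytes, and 800 is the only new length that can enter the alignment stanza. How many head bytes it consumes depends on where the buffer happens to be allocated.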
--
John Naylor
Amazon Web Services
Attachments:
v16-0001-Improve-CRC32C-performance-on-recent-x86_64-plat.patch (application/x-patch)
From b3af802cf28cdc0937e163dbba005f823d74e0d0 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Tue, 25 Mar 2025 19:22:32 +0700
Subject: [PATCH v16 1/2] Improve CRC32C performance on recent x86_64 platforms
The previous implementation of CRC32C on x86 relied on the native
CRC32 instruction from the SSE 4.2 extension, which operates on
up to 8 bytes at a time. We can get a substantial speedup by using
carryless multiplication on SIMD registers, processing 64 bytes per
loop iteration. Shorter inputs fall back to ordinary CRC instructions.
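
As background on the term, "carryless multiplication" multiplies two polynomials over GF(2): partial products are combined with XOR instead of addition, so no carries propagate. A minimal portable sketch of what a single 64x64 -> 128 bit lane computes is shown below. `clmul64()` and `clmul_result` are illustrative names, not part of the patch, and the loop is a slow bit-at-a-time model rather than anything the hardware does.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
    uint64_t    lo;         /* bits 0..63 of the product */
    uint64_t    hi;         /* bits 64..127 of the product */
} clmul_result;

/* Software model of one carryless 64x64 -> 128 bit multiplication. */
static clmul_result
clmul64(uint64_t a, uint64_t b)
{
    clmul_result r = {0, 0};

    for (int i = 0; i < 64; i++)
    {
        if ((b >> i) & 1)
        {
            /* XOR in "a" shifted left by i, split across both halves */
            r.lo ^= a << i;
            if (i > 0)
                r.hi ^= a >> (64 - i);
        }
    }
    return r;
}

/* (x + 1)^2 = x^2 + 1 over GF(2), so 3 times 3 gives 5 rather than 9. */
void
clmul64_demo(void)
{
    clmul_result r = clmul64(3, 3);

    assert(r.lo == 5 && r.hi == 0);
}
```

The vectorized code performs four such multiplications per instruction on a 512-bit register, which is what lets it fold 64 bytes of input per loop iteration.
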
The VPCLMULQDQ instruction on 512-bit registers has been available
on Intel hardware since 2019 and AMD since 2022. There is an older
variant for 128-bit registers, but at least on Zen 2 it performs worse
than normal CRC instructions for short inputs. (Thanks to David Rowley
for testing on that platform.)
We must now do a runtime check, even for builds that target SSE
4.2. This doesn't matter in practice for WAL (arguably the most
critical case), because since commit af0c24855 the final computation
with the 20-byte WAL header is inlined and unrolled. Compared with
two direct function calls, testing showed equal or slightly faster
performance in performing an indirect function call on several dozen
bytes followed by inlined instructions on constant input of 20 bytes.
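
The runtime check mentioned above follows a self-replacing function pointer pattern, which can be sketched portably as below. Everything here is a stand-in under stated assumptions: `crc_impl`, `crc_choose()`, `crc_accelerated()` and `cpu_has_acceleration()` are hypothetical names, the probe is hard-wired instead of using CPUID, and the "accelerated" routine simply reuses a bit-at-a-time CRC-32C so that the example runs anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*crc_fn) (uint32_t crc, const void *data, size_t len);

static uint32_t crc_choose(uint32_t crc, const void *data, size_t len);

/* Starts out pointing at the chooser; overwritten on the first call. */
static crc_fn crc_impl = crc_choose;

/* Bit-at-a-time CRC-32C (reflected polynomial 0x82F63B78). */
static uint32_t
crc_portable(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len--)
    {
        crc ^= *p++;
        for (int i = 0; i < 8; i++)
            crc = (crc >> 1) ^ (0x82F63B78 & (0U - (crc & 1)));
    }
    return crc;
}

/* Placeholder for an accelerated routine; it only reuses the portable one. */
static uint32_t
crc_accelerated(uint32_t crc, const void *data, size_t len)
{
    return crc_portable(crc, data, len);
}

/* Hypothetical probe; the patch uses CPUID and XGETBV here. */
static int
cpu_has_acceleration(void)
{
    return 0;
}

/* First call: pick an implementation, remember it, then forward to it. */
static uint32_t
crc_choose(uint32_t crc, const void *data, size_t len)
{
    crc_impl = cpu_has_acceleration() ? crc_accelerated : crc_portable;
    assert(crc_impl != crc_choose);
    return crc_impl(crc, data, len);
}
```

Callers always go through `crc_impl`, so only the very first call pays for feature detection; every later call is a single indirect call.
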
The MIT-licensed implementation was generated with the "generate"
program from
https://github.com/corsix/fast-crc32/
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009
Co-authored-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Co-authored-by: Paul Amonson <paul.d.amonson@intel.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Tested-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Tested-by: Matthew Sterrett <matthewsterrett2@gmail.com> (earlier version)
Discussion: https://postgr.es/m/BL1PR11MB530401FA7E9B1CA432CF9DC3DC192@BL1PR11MB5304.namprd11.prod.outlook.com
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
---
config/c-compiler.m4 | 37 +++++++++++
configure | 91 ++++++++++++++++++++++----
configure.ac | 41 ++++++++----
meson.build | 58 +++++++++++++----
src/include/pg_config.h.in | 3 +
src/include/port/pg_crc32c.h | 39 ++++++++---
src/port/meson.build | 1 +
src/port/pg_crc32c_sse42.c | 94 ++++++++++++++++++++++++++-
src/port/pg_crc32c_sse42_choose.c | 75 ++++++++++++++++-----
src/test/regress/expected/strings.out | 24 +++++++
src/test/regress/sql/strings.sql | 5 ++
11 files changed, 408 insertions(+), 60 deletions(-)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index e9e54470e66..4c4863f6de0 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -581,6 +581,43 @@ fi
undefine([Ac_cachevar])dnl
])# PGAC_SSE42_CRC32_INTRINSICS
+# PGAC_AVX512_PCLMUL_INTRINSICS
+# ---------------------------
+# Check if the compiler supports AVX-512 carryless multiplication
+# and AVX-512VL instructions used for computing CRC. AVX-512F is
+# assumed to be supported if the above are.
+#
+# If the intrinsics are supported, sets pgac_avx512_pclmul_intrinsics.
+AC_DEFUN([PGAC_AVX512_PCLMUL_INTRINSICS],
+[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx512_pclmul_intrinsics])])dnl
+AC_CACHE_CHECK([for _mm512_clmulepi64_epi128], [Ac_cachevar],
+[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>
+ __m512i x;
+ __m512i y;
+
+ #if defined(__has_attribute) && __has_attribute (target)
+ __attribute__((target("vpclmulqdq,avx512vl")))
+ #endif
+ static int avx512_pclmul_test(void)
+ {
+ __m128i z;
+
+ y = _mm512_clmulepi64_epi128(x, y, 0);
+ z = _mm_ternarylogic_epi64(
+ _mm512_castsi512_si128(y),
+ _mm512_extracti32x4_epi32(y, 1),
+ _mm512_extracti32x4_epi32(y, 2),
+ 0x96);
+ return _mm_crc32_u64(0, _mm_extract_epi64(z, 0));
+ }],
+ [return avx512_pclmul_test();])],
+ [Ac_cachevar=yes],
+ [Ac_cachevar=no])])
+if test x"$Ac_cachevar" = x"yes"; then
+ pgac_avx512_pclmul_intrinsics=yes
+fi
+undefine([Ac_cachevar])dnl
+])# PGAC_AVX512_PCLMUL_INTRINSICS
# PGAC_ARMV8_CRC32C_INTRINSICS
# ----------------------------
diff --git a/configure b/configure
index 30d949c3c46..03cdae7423c 100755
--- a/configure
+++ b/configure
@@ -17829,17 +17829,21 @@ fi
# Select CRC-32C implementation.
#
-# If we are targeting a processor that has Intel SSE 4.2 instructions, we can
-# use the special CRC instructions for calculating CRC-32C. If we're not
-# targeting such a processor, but we can nevertheless produce code that uses
-# the SSE intrinsics, compile both implementations and select which one to use
-# at runtime, depending on whether SSE 4.2 is supported by the processor we're
-# running on.
+# There are three methods of calculating CRC, in order of increasing
+# performance:
#
-# Similarly, if we are targeting an ARM processor that has the CRC
-# instructions that are part of the ARMv8 CRC Extension, use them. And if
-# we're not targeting such a processor, but can nevertheless produce code that
-# uses the CRC instructions, compile both, and select at runtime.
+# 1. The fallback using a lookup table, called slicing-by-8
+# 2. CRC-32C instructions (found in e.g. Intel SSE 4.2 and ARMv8 CRC Extension)
+# 3. Algorithms using carryless multiplication instructions
+# (e.g. Intel PCLMUL and Arm PMULL)
+#
+# If we can produce code (via function attributes or additional compiler
+# flags) that uses #2 (and possibly #3), we compile all implementations
+# and select which one to use at runtime, depending on what is supported
+# by the processor we're running on.
+#
+# If we are targeting a processor that has #2, we can use that without
+# runtime selection.
#
# Note that we do not use __attribute__((target("..."))) for the ARM CRC
# instructions because until clang 16, using the ARM intrinsics still requires
@@ -17890,7 +17894,7 @@ if test x"$USE_SSE42_CRC32C" = x"1"; then
$as_echo "#define USE_SSE42_CRC32C 1" >>confdefs.h
- PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2" >&5
$as_echo "SSE 4.2" >&6; }
else
@@ -17939,6 +17943,71 @@ $as_echo "slicing-by-8" >&6; }
fi
+# Check for carryless multiplication intrinsics to do vectorized CRC calculations.
+#
+if test x"$host_cpu" = x"x86_64"; then
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for _mm512_clmulepi64_epi128" >&5
+$as_echo_n "checking for _mm512_clmulepi64_epi128... " >&6; }
+if ${pgac_cv_avx512_pclmul_intrinsics+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+#include <immintrin.h>
+ __m512i x;
+ __m512i y;
+
+ #if defined(__has_attribute) && __has_attribute (target)
+ __attribute__((target("vpclmulqdq,avx512vl")))
+ #endif
+ static int avx512_pclmul_test(void)
+ {
+ __m128i z;
+
+ y = _mm512_clmulepi64_epi128(x, y, 0);
+ z = _mm_ternarylogic_epi64(
+ _mm512_castsi512_si128(y),
+ _mm512_extracti32x4_epi32(y, 1),
+ _mm512_extracti32x4_epi32(y, 2),
+ 0x96);
+ return _mm_crc32_u64(0, _mm_extract_epi64(z, 0));
+ }
+int
+main ()
+{
+return avx512_pclmul_test();
+ ;
+ return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+ pgac_cv_avx512_pclmul_intrinsics=yes
+else
+ pgac_cv_avx512_pclmul_intrinsics=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+ conftest$ac_exeext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_avx512_pclmul_intrinsics" >&5
+$as_echo "$pgac_cv_avx512_pclmul_intrinsics" >&6; }
+if test x"$pgac_cv_avx512_pclmul_intrinsics" = x"yes"; then
+ pgac_avx512_pclmul_intrinsics=yes
+fi
+
+fi
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for vectorized CRC-32C" >&5
+$as_echo_n "checking for vectorized CRC-32C... " >&6; }
+if test x"$pgac_avx512_pclmul_intrinsics" = x"yes"; then
+
+$as_echo "#define USE_AVX512_CRC32C_WITH_RUNTIME_CHECK 1" >>confdefs.h
+
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: AVX-512 with runtime check" >&5
+$as_echo "AVX-512 with runtime check" >&6; }
+else
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: none" >&5
+$as_echo "none" >&6; }
+fi
# Select semaphore implementation type.
if test "$PORTNAME" != "win32"; then
diff --git a/configure.ac b/configure.ac
index 25cdfcf65af..0bce658f296 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2115,17 +2115,21 @@ AC_SUBST(CFLAGS_CRC)
# Select CRC-32C implementation.
#
-# If we are targeting a processor that has Intel SSE 4.2 instructions, we can
-# use the special CRC instructions for calculating CRC-32C. If we're not
-# targeting such a processor, but we can nevertheless produce code that uses
-# the SSE intrinsics, compile both implementations and select which one to use
-# at runtime, depending on whether SSE 4.2 is supported by the processor we're
-# running on.
-#
-# Similarly, if we are targeting an ARM processor that has the CRC
-# instructions that are part of the ARMv8 CRC Extension, use them. And if
-# we're not targeting such a processor, but can nevertheless produce code that
-# uses the CRC instructions, compile both, and select at runtime.
+# There are three methods of calculating CRC, in order of increasing
+# performance:
+#
+# 1. The fallback using a lookup table, called slicing-by-8
+# 2. CRC-32C instructions (found in e.g. Intel SSE 4.2 and ARMv8 CRC Extension)
+# 3. Algorithms using carryless multiplication instructions
+# (e.g. Intel PCLMUL and Arm PMULL)
+#
+# If we can produce code (via function attributes or additional compiler
+# flags) that uses #2 (and possibly #3), we compile all implementations
+# and select which one to use at runtime, depending on what is supported
+# by the processor we're running on.
+#
+# If we are targeting a processor that has #2, we can use that without
+# runtime selection.
#
# Note that we do not use __attribute__((target("..."))) for the ARM CRC
# instructions because until clang 16, using the ARM intrinsics still requires
@@ -2173,7 +2177,7 @@ fi
AC_MSG_CHECKING([which CRC-32C implementation to use])
if test x"$USE_SSE42_CRC32C" = x"1"; then
AC_DEFINE(USE_SSE42_CRC32C, 1, [Define to 1 use Intel SSE 4.2 CRC instructions.])
- PG_CRC32C_OBJS="pg_crc32c_sse42.o"
+ PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
AC_MSG_RESULT(SSE 4.2)
else
if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
@@ -2206,6 +2210,19 @@ else
fi
AC_SUBST(PG_CRC32C_OBJS)
+# Check for carryless multiplication intrinsics to do vectorized CRC calculations.
+#
+if test x"$host_cpu" = x"x86_64"; then
+ PGAC_AVX512_PCLMUL_INTRINSICS()
+fi
+
+AC_MSG_CHECKING([for vectorized CRC-32C])
+if test x"$pgac_avx512_pclmul_intrinsics" = x"yes"; then
+ AC_DEFINE(USE_AVX512_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use AVX-512 CRC algorithms with a runtime check.])
+ AC_MSG_RESULT(AVX-512 with runtime check)
+else
+ AC_MSG_RESULT(none)
+fi
# Select semaphore implementation type.
if test "$PORTNAME" != "win32"; then
diff --git a/meson.build b/meson.build
index b8da4966297..a42f54bf493 100644
--- a/meson.build
+++ b/meson.build
@@ -2348,17 +2348,21 @@ endif
###############################################################
# Select CRC-32C implementation.
#
-# If we are targeting a processor that has Intel SSE 4.2 instructions, we can
-# use the special CRC instructions for calculating CRC-32C. If we're not
-# targeting such a processor, but we can nevertheless produce code that uses
-# the SSE intrinsics, compile both implementations and select which one to use
-# at runtime, depending on whether SSE 4.2 is supported by the processor we're
-# running on.
+# There are three methods of calculating CRC, in order of increasing
+# performance:
#
-# Similarly, if we are targeting an ARM processor that has the CRC
-# instructions that are part of the ARMv8 CRC Extension, use them. And if
-# we're not targeting such a processor, but can nevertheless produce code that
-# uses the CRC instructions, compile both, and select at runtime.
+# 1. The fallback using a lookup table, called slicing-by-8
+# 2. CRC-32C instructions (found in e.g. Intel SSE 4.2 and ARMv8 CRC Extension)
+# 3. Algorithms using carryless multiplication instructions
+# (e.g. Intel PCLMUL and Arm PMULL)
+#
+# If we can produce code (via function attributes or additional compiler
+# flags) that uses #2 (and possibly #3), we compile all implementations
+# and select which one to use at runtime, depending on what is supported
+# by the processor we're running on.
+#
+# If we are targeting a processor that has #2, we can use that without
+# runtime selection.
#
# Note that we do not use __attribute__((target("..."))) for the ARM CRC
# instructions because until clang 16, using the ARM intrinsics still requires
@@ -2392,7 +2396,7 @@ int main(void)
}
'''
- if not cc.links(prog, name: '_mm_crc32_u8 and _mm_crc32_u32',
+ if not cc.links(prog, name: 'SSE 4.2 CRC32C',
args: test_c_args)
# Do not use Intel SSE 4.2
elif (cc.get_define('__SSE4_2__') != '')
@@ -2407,6 +2411,38 @@ int main(void)
have_optimized_crc = true
endif
+ # Check if the compiler supports AVX-512 carryless multiplication
+ # and AVX-512VL instructions used for computing CRC. AVX-512F is
+ # assumed to be supported if the above are.
+ prog = '''
+#include <immintrin.h>
+__m512i x;
+__m512i y;
+
+#if defined(__has_attribute) && __has_attribute (target)
+__attribute__((target("vpclmulqdq,avx512vl")))
+#endif
+int main(void)
+{
+ __m128i z;
+
+ y = _mm512_clmulepi64_epi128(x, y, 0);
+ z = _mm_ternarylogic_epi64(
+ _mm512_castsi512_si128(y),
+ _mm512_extracti32x4_epi32(y, 1),
+ _mm512_extracti32x4_epi32(y, 2),
+ 0x96);
+ /* return computed value, to prevent the above being optimized away */
+ return _mm_crc32_u64(0, _mm_extract_epi64(z, 0));
+}
+'''
+
+ if cc.links(prog,
+ name: 'AVX-512 CRC32C',
+ args: test_c_args)
+ cdata.set('USE_AVX512_CRC32C_WITH_RUNTIME_CHECK', 1)
+ endif
+
endif
elif host_cpu == 'arm' or host_cpu == 'aarch64'
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 92f0616c400..97f2f8c19cc 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -654,6 +654,9 @@
/* Define to 1 to build with assertion checks. (--enable-cassert) */
#undef USE_ASSERT_CHECKING
+/* Define to 1 to use AVX-512 CRC algorithms with a runtime check. */
+#undef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+
/* Define to 1 to use AVX-512 popcount instructions with a runtime check. */
#undef USE_AVX512_POPCNT_WITH_RUNTIME_CHECK
diff --git a/src/include/port/pg_crc32c.h b/src/include/port/pg_crc32c.h
index 9376d223fef..82313bb7fcf 100644
--- a/src/include/port/pg_crc32c.h
+++ b/src/include/port/pg_crc32c.h
@@ -42,7 +42,10 @@ typedef uint32 pg_crc32c;
#define EQ_CRC32C(c1, c2) ((c1) == (c2))
#if defined(USE_SSE42_CRC32C)
-/* Use Intel SSE4.2 instructions. */
+/*
+ * Use either Intel SSE 4.2 or AVX-512 instructions. We don't need a runtime check
+ * for SSE 4.2, so we can inline those in some cases.
+ */
#include <nmmintrin.h>
@@ -50,7 +53,11 @@ typedef uint32 pg_crc32c;
((crc) = pg_comp_crc32c_dispatch((crc), (data), (len)))
#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+extern pg_crc32c pg_comp_crc32c_avx512(pg_crc32c crc, const void *data, size_t len);
+#endif
/*
* We can only get here if the host compiler targets SSE 4.2, but on some
@@ -82,9 +89,27 @@ pg_comp_crc32c_dispatch(pg_crc32c crc, const void *data, size_t len)
return crc;
}
else
- return pg_comp_crc32c_sse42(crc, data, len);
+ /* Otherwise, use a runtime check for AVX-512 instructions. */
+ return pg_comp_crc32c(crc, data, len);
}
+#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK)
+
+/*
+ * Use Intel SSE 4.2 or AVX-512 instructions, but perform a runtime check first
+ * to check that they are available.
+ */
+#define COMP_CRC32C(crc, data, len) \
+ ((crc) = pg_comp_crc32c((crc), (data), (len)))
+#define FIN_CRC32C(crc) ((crc) ^= 0xFFFFFFFF)
+
+extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
+extern PGDLLIMPORT pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
+extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
+#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+extern pg_crc32c pg_comp_crc32c_avx512(pg_crc32c crc, const void *data, size_t len);
+#endif
+
#elif defined(USE_ARMV8_CRC32C)
/* Use ARMv8 CRC Extension instructions. */
@@ -103,10 +128,10 @@ extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t le
extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_t len);
-#elif defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK) || defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
+#elif defined(USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK)
/*
- * Use Intel SSE 4.2 or ARMv8 instructions, but perform a runtime check first
+ * Use ARMv8 instructions, but perform a runtime check first
* to check that they are available.
*/
#define COMP_CRC32C(crc, data, len) \
@@ -115,13 +140,7 @@ extern pg_crc32c pg_comp_crc32c_loongarch(pg_crc32c crc, const void *data, size_
extern pg_crc32c pg_comp_crc32c_sb8(pg_crc32c crc, const void *data, size_t len);
extern pg_crc32c (*pg_comp_crc32c) (pg_crc32c crc, const void *data, size_t len);
-
-#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
-extern pg_crc32c pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len);
-#endif
-#ifdef USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK
extern pg_crc32c pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len);
-#endif
#else
/*
diff --git a/src/port/meson.build b/src/port/meson.build
index cf7f07644b9..2d2bf1b70f7 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -85,6 +85,7 @@ replace_funcs_pos = [
# x86/x64
['pg_crc32c_sse42', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
+ ['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C'],
['pg_crc32c_sse42_choose', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
['pg_crc32c_sb8', 'USE_SSE42_CRC32C_WITH_RUNTIME_CHECK'],
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 22c2137df31..4d8dec004b1 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -1,7 +1,7 @@
/*-------------------------------------------------------------------------
*
* pg_crc32c_sse42.c
- * Compute CRC-32C checksum using Intel SSE 4.2 instructions.
+ * Compute CRC-32C checksum using Intel SSE 4.2 or AVX-512 instructions.
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -15,6 +15,9 @@
#include "c.h"
#include <nmmintrin.h>
+#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+#include <immintrin.h>
+#endif
#include "port/pg_crc32c.h"
@@ -68,3 +71,92 @@ pg_comp_crc32c_sse42(pg_crc32c crc, const void *data, size_t len)
return crc;
}
+
+#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+
+/*
+ * Note: There is no copyright notice in the following generated code.
+ *
+ * We have modified the output to
+ * - match our function declaration
+ * - make whitespace match project style
+ * - add a threshold for the alignment stanza
+ */
+
+/* Generated by https://github.com/corsix/fast-crc32/ using: */
+/* ./generate -i avx512_vpclmulqdq -p crc32c -a v1e */
+/* MIT licensed */
+
+#define clmul_lo(a, b) (_mm512_clmulepi64_epi128((a), (b), 0))
+#define clmul_hi(a, b) (_mm512_clmulepi64_epi128((a), (b), 17))
+
+pg_attribute_target("vpclmulqdq,avx512vl")
+pg_crc32c
+pg_comp_crc32c_avx512(pg_crc32c crc, const void *data, size_t length)
+{
+ /* adjust names to match generated code */
+ pg_crc32c crc0 = crc;
+ size_t len = length;
+ const char *buf = data;
+
+ /* Align on cacheline boundary. The threshold is somewhat arbitrary. */
+ if (unlikely(len > 256))
+ {
+ for (; len && ((uintptr_t) buf & 7); --len)
+ crc0 = _mm_crc32_u8(crc0, *buf++);
+ while (((uintptr_t) buf & 56) && len >= 8)
+ {
+ crc0 = _mm_crc32_u64(crc0, *(const uint64_t *) buf);
+ buf += 8;
+ len -= 8;
+ }
+ }
+
+ if (len >= 64)
+ {
+ const char *end = buf + len;
+ const char *limit = buf + len - 64;
+ __m128i z0;
+
+ /* First vector chunk. */
+ __m512i x0 = _mm512_loadu_si512((const void *) buf),
+ y0;
+ __m512i k;
+
+ k = _mm512_broadcast_i32x4(_mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0));
+ x0 = _mm512_xor_si512(_mm512_castsi128_si512(_mm_cvtsi32_si128(crc0)), x0);
+ buf += 64;
+
+ /* Main loop. */
+ while (buf <= limit)
+ {
+ y0 = clmul_lo(x0, k), x0 = clmul_hi(x0, k);
+ x0 = _mm512_ternarylogic_epi64(x0, y0,
+ _mm512_loadu_si512((const void *) buf),
+ 0x96);
+ buf += 64;
+ }
+
+ /* Reduce 512 bits to 128 bits. */
+ k = _mm512_setr_epi32(0x1c291d04, 0, 0xddc0152b, 0,
+ 0x3da6d0cb, 0, 0xba4fc28e, 0,
+ 0xf20c0dfe, 0, 0x493c7d27, 0,
+ 0, 0, 0, 0);
+ y0 = clmul_lo(x0, k), k = clmul_hi(x0, k);
+ y0 = _mm512_xor_si512(y0, k);
+ z0 = _mm_ternarylogic_epi64(_mm512_castsi512_si128(y0),
+ _mm512_extracti32x4_epi32(y0, 1),
+ _mm512_extracti32x4_epi32(y0, 2),
+ 0x96);
+ z0 = _mm_xor_si128(z0, _mm512_extracti32x4_epi32(x0, 3));
+
+ /* Reduce 128 bits to 32 bits, and multiply by x^32. */
+ crc0 = _mm_crc32_u64(0, _mm_extract_epi64(z0, 0));
+ crc0 = _mm_crc32_u64(crc0, _mm_extract_epi64(z0, 1));
+ len = end - buf;
+ }
+
+ return pg_comp_crc32c_sse42(crc0, buf, len);
+}
+
+#endif
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 65dbc4d4249..74d2421ba2b 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -20,30 +20,37 @@
#include "c.h"
-#ifdef HAVE__GET_CPUID
+#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT)
#include <cpuid.h>
#endif
-#ifdef HAVE__CPUID
+#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX)
#include <intrin.h>
#endif
+#ifdef HAVE_XSAVE_INTRINSICS
+#include <immintrin.h>
+#endif
+
#include "port/pg_crc32c.h"
+/*
+ * Does XGETBV say the ZMM registers are enabled?
+ *
+ * NB: Caller is responsible for verifying that osxsave is available
+ * before calling this.
+ */
+#ifdef HAVE_XSAVE_INTRINSICS
+pg_attribute_target("xsave")
+#endif
static bool
-pg_crc32c_sse42_available(void)
+zmm_regs_available(void)
{
- unsigned int exx[4] = {0, 0, 0, 0};
-
-#if defined(HAVE__GET_CPUID)
- __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
-#elif defined(HAVE__CPUID)
- __cpuid(exx, 1);
+#ifdef HAVE_XSAVE_INTRINSICS
+ return (_xgetbv(0) & 0xe6) == 0xe6;
#else
-#error cpuid instruction not available
+ return false;
#endif
-
- return (exx[2] & (1 << 20)) != 0; /* SSE 4.2 */
}
/*
@@ -53,10 +60,48 @@ pg_crc32c_sse42_available(void)
static pg_crc32c
pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
{
- if (pg_crc32c_sse42_available())
+ unsigned int exx[4] = {0, 0, 0, 0};
+
+ /*
+ * Set fallback. We must guard since slicing-by-8 is not visible
+ * everywhere.
+ */
+#ifdef USE_SSE42_CRC32C_WITH_RUNTIME_CHECK
+ pg_comp_crc32c = pg_comp_crc32c_sb8;
+#endif
+
+#if defined(HAVE__GET_CPUID)
+ __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+ __cpuid(exx, 1);
+#else
+#error cpuid instruction not available
+#endif
+
+ if ((exx[2] & (1 << 20)) != 0) /* SSE 4.2 */
+ {
pg_comp_crc32c = pg_comp_crc32c_sse42;
- else
- pg_comp_crc32c = pg_comp_crc32c_sb8;
+
+ if (exx[2] & (1 << 27) && /* OSXSAVE */
+ zmm_regs_available())
+ {
+ /* second cpuid call on leaf 7 to check extended AVX-512 support */
+
+ memset(exx, 0, 4 * sizeof(exx[0]));
+
+#if defined(HAVE__GET_CPUID_COUNT)
+ __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUIDEX)
+ __cpuidex(exx, 7, 0);
+#endif
+
+#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+ if (exx[2] & (1 << 10) && /* VPCLMULQDQ */
+ exx[1] & (1 << 31)) /* AVX512-VL */
+ pg_comp_crc32c = pg_comp_crc32c_avx512;
+#endif
+ }
+ }
return pg_comp_crc32c(crc, data, len);
}
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index dc485735aa4..174f0a68331 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -2330,6 +2330,30 @@ SELECT crc32c('The quick brown fox jumps over the lazy dog.');
419469235
(1 row)
+SELECT crc32c(repeat('A', 127)::bytea);
+ crc32c
+-----------
+ 291820082
+(1 row)
+
+SELECT crc32c(repeat('A', 128)::bytea);
+ crc32c
+-----------
+ 816091258
+(1 row)
+
+SELECT crc32c(repeat('A', 129)::bytea);
+ crc32c
+------------
+ 4213642571
+(1 row)
+
+SELECT crc32c(repeat('A', 800)::bytea);
+ crc32c
+------------
+ 3134039419
+(1 row)
+
--
-- encode/decode
--
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index aeba798dac1..f7b325baadf 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -738,6 +738,11 @@ SELECT crc32('The quick brown fox jumps over the lazy dog.');
SELECT crc32c('');
SELECT crc32c('The quick brown fox jumps over the lazy dog.');
+SELECT crc32c(repeat('A', 127)::bytea);
+SELECT crc32c(repeat('A', 128)::bytea);
+SELECT crc32c(repeat('A', 129)::bytea);
+SELECT crc32c(repeat('A', 800)::bytea);
+
--
-- encode/decode
--
--
2.48.1
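The new regression queries hash 127, 128, 129 and 800 copies of 'A' (described later in the thread as "longer lengths that exercise the new paths"). To cross-check such expected values without relying on any accelerated code, a bit-at-a-time reference can be used. The following is a minimal sketch and not part of the patch: `crc32c_ref` and `crc32c_repeat` are hypothetical names, and it assumes the SQL-level crc32c() uses the standard CRC-32C parameters (reflected polynomial 0x82F63B78, initial value and final XOR of 0xFFFFFFFF).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Bit-at-a-time CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t
crc32c_ref(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t	crc = 0xFFFFFFFFu;	/* standard initial value */

	while (len--)
	{
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc ^ 0xFFFFFFFFu;	/* standard final XOR */
}

/* CRC of a buffer holding 'len' copies of byte 'c' (hypothetical helper). */
static uint32_t
crc32c_repeat(unsigned char c, size_t len)
{
	unsigned char *buf = malloc(len ? len : 1);
	uint32_t	crc;

	assert(buf != NULL);
	memset(buf, c, len);
	crc = crc32c_ref(buf, len);
	free(buf);
	return crc;
}
```

Under the stated assumption, calling `crc32c_repeat('A', n)` for n = 127, 128, 129 and 800 and comparing the results with the values in strings.out would confirm the expected output independently of the SSE 4.2 and AVX-512 code.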
v16-0002-Add-debug-for-CI-XXX-not-for-commit.patch (application/x-patch)
From 2254421337665b2b31e36d8aa85c336e2db25901 Mon Sep 17 00:00:00 2001
From: John Naylor <john.naylor@postgresql.org>
Date: Tue, 25 Mar 2025 15:19:16 +0700
Subject: [PATCH v16 2/2] Add debug for CI XXX not for commit
---
src/port/pg_crc32c_sse42.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 4d8dec004b1..e2b1246f01d 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -21,6 +21,8 @@
#include "port/pg_crc32c.h"
+#define DEBUG_CRC /* XXX not for commit, or at least comment out */
+
pg_attribute_no_sanitize_alignment()
pg_attribute_target("sse4.2")
pg_crc32c
@@ -98,6 +100,9 @@ pg_comp_crc32c_avx512(pg_crc32c crc, const void *data, size_t length)
pg_crc32c crc0 = crc;
size_t len = length;
const char *buf = data;
+#ifdef DEBUG_CRC
+ const size_t orig_len PG_USED_FOR_ASSERTS_ONLY = len;
+#endif
/* Align on cacheline boundary. The threshold is somewhat arbitrary. */
if (unlikely(len > 256))
@@ -156,7 +161,13 @@ pg_comp_crc32c_avx512(pg_crc32c crc, const void *data, size_t length)
len = end - buf;
}
- return pg_comp_crc32c_sse42(crc0, buf, len);
+ crc0 = pg_comp_crc32c_sse42(crc0, buf, len);
+
+#ifdef DEBUG_CRC
+ Assert(crc0 == pg_comp_crc32c_sse42(crc, data, orig_len));
+#endif
+
+ return crc0;
}
#endif
--
2.48.1
On Tue, Apr 01, 2025 at 05:33:02PM +0700, John Naylor wrote:
On Thu, Mar 27, 2025 at 2:55 AM Devulapalli, Raghuveer <raghuveer.devulapalli@intel.com> wrote:
(1) zmm_regs_available() in pg_crc32c_sse42_choose.c is a duplicate of
pg_popcount_avx512.c but perhaps that’s fine for now. I will propose a
consolidated SIMD runtime check in a follow up patch.
Yeah, I was thinking a small amount of duplication is tolerable.
+1. This is easy enough to change in the future if/when we add some sort
of centralized CPU feature detection.
(2) Might be apt to rename pg_crc32c_sse42*.c to pg_crc32c_x86*.c since
they contain both sse42 and avx512 versions.
The name is now not quite accurate, but it's not exactly misleading
either. I'm leaning towards keeping it the same, so for now I've just
updated the header comment.
I'm not too worried about this one either. FWIW I'm likely going to look
into moving all the x86_64 popcount stuff into pg_popcount_avx512.c and
renaming it to pg_popcount_x86_64.c for v19. This would parallel
pg_popcount_aarch64.c a bit better, and a file per architecture seems like
a logical way to neatly organize things.
For v16, I made another pass through and made some more mostly superficial
adjustments:
- copied rewritten Meson comment to configure.ac
- added some more #include guards out of paranoia
- added tests with longer lengths that exercise the new paths
- adjusted configure / meson messages for consistency
- changed not-quite-accurate wording about "AVX-512 CRC instructions"
- used "PCLMUL" only when talking about specific intrinsics and prefer
"AVX-512" elsewhere, to head off potential future confusion with Arm
PMULL.
I read through the code a couple of times and nothing stood out to me.
--
nathan
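The runtime check discussed under point (1) can be sketched as a standalone function. This is an illustration only, assuming GCC or Clang on x86; `x86_crc_features` and `detect_x86_crc_features` are hypothetical names, and XGETBV is issued through inline assembly here rather than the `_xgetbv` intrinsic used in the patch, so that no target attribute is needed. The bit positions are the ones tested in pg_crc32c_sse42_choose.c above. On other architectures the sketch simply reports nothing available.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical result struct; the field names are illustrative only. */
typedef struct
{
	bool		sse42;
	bool		avx512_crc;	/* ZMM state enabled + VPCLMULQDQ + AVX512-VL */
} x86_crc_features;

static x86_crc_features
detect_x86_crc_features(void)
{
	x86_crc_features f = {false, false};

#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* Leaf 1: ECX bit 20 is SSE 4.2, ECX bit 27 is OSXSAVE. */
	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return f;
	f.sse42 = (ecx & (1u << 20)) != 0;

	if (f.sse42 && (ecx & (1u << 27)) != 0)
	{
		unsigned int xcr0_lo, xcr0_hi;

		/* XGETBV is only safe to execute once OSXSAVE has been confirmed. */
		__asm__ volatile ("xgetbv" : "=a" (xcr0_lo), "=d" (xcr0_hi) : "c" (0));
		(void) xcr0_hi;

		/* 0xe6: SSE, AVX, opmask, ZMM_Hi256 and Hi16_ZMM state enabled. */
		if ((xcr0_lo & 0xe6) == 0xe6 &&
			__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
			f.avx512_crc = (ecx & (1u << 10)) != 0 &&	/* VPCLMULQDQ */
				(ebx & (1u << 31)) != 0;	/* AVX512-VL */
	}
#endif
	return f;
}
```

Keeping the whole sequence in one function is roughly what a centralized CPU feature detection module would offer; the duplication mentioned above comes from each caller repeating the XGETBV step on its own.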
On Tue, Apr 1, 2025 at 11:25 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Tue, Apr 01, 2025 at 05:33:02PM +0700, John Naylor wrote:
On Thu, Mar 27, 2025 at 2:55 AM Devulapalli, Raghuveer <raghuveer.devulapalli@intel.com> wrote:
(2) Might be apt to rename pg_crc32c_sse42*.c to pg_crc32c_x86*.c since
they contain both sse42 and avx512 versions.
The name is now not quite accurate, but it's not exactly misleading
either. I'm leaning towards keeping it the same, so for now I've just
updated the header comment.
I'm not too worried about this one either. FWIW I'm likely going to look
into moving all the x86_64 popcount stuff into pg_popcount_avx512.c and
renaming it to pg_popcount_x86_64.c for v19. This would parallel
pg_popcount_aarch64.c a bit better, and a file per architecture seems like
a logical way to neatly organize things.
Seems like a good idea.
I read through the code a couple of times and nothing stood out to me.
Thanks for looking, I plan to commit this over the weekend unless
there are objections.
--
John Naylor
Amazon Web Services
Thanks for looking, I plan to commit this over the weekend unless there are
objections.
LGTM.
Raghuveer
On Wed, Apr 02, 2025 at 02:10:40PM +0700, John Naylor wrote:
Thanks for looking, I plan to commit this over the weekend unless
there are objections.
I noticed that autoconf is defining USE_AVX512_CRC_WITH_RUNTIME_CHECK, but
everywhere else expects USE_AVX512_CRC32C_WITH_RUNTIME_CHECK (with the
"32C" included). I tested the v16 patches (with the macro fixed and
assertions enabled) on a machine with AVX-512 (verified with some extra
debug logging), and everything passed.
--
nathan
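The macro mismatch reported above is the kind of error that fails silently, which is presumably why extra debug logging was needed to confirm the AVX-512 path was taken. A minimal sketch of the effect, using the two macro names from the report; `crc_impl` and `selected_crc_impl` are hypothetical stand-ins for the real choose function:

```c
/* What the autoconf build defined in v16, per the report above. */
#define USE_AVX512_CRC_WITH_RUNTIME_CHECK 1

/* Hypothetical tag for the implementation that ends up selected. */
typedef enum
{
	CRC_IMPL_SSE42,
	CRC_IMPL_AVX512
} crc_impl;

static crc_impl
selected_crc_impl(int cpu_has_avx512)
{
	crc_impl	impl = CRC_IMPL_SSE42;

	/*
	 * The C sources test the name with "32C" in it.  Since only the other
	 * spelling is defined, this branch is compiled out without any warning.
	 */
#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
	if (cpu_has_avx512)
		impl = CRC_IMPL_AVX512;
#else
	(void) cpu_has_avx512;
#endif
	return impl;
}
```

With the misspelled definition, `selected_crc_impl(1)` still returns `CRC_IMPL_SSE42`: the build succeeds and results stay correct, but the faster path is never chosen.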
On Sat, Apr 5, 2025 at 5:15 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
I noticed that autoconf is defining USE_AVX512_CRC_WITH_RUNTIME_CHECK, but
everywhere else expects USE_AVX512_CRC32C_WITH_RUNTIME_CHECK (with the
"32C" included). I tested the v16 patches (with the macro fixed and
assertions enabled) on a machine with AVX-512 (verified with some extra
debug logging), and everything passed.
Yikes, I even made a mental note to verify the relevant object file
still had the same contents as v14, and I clearly didn't do that on
the autoconf side. Thanks for running the smoke test! I fixed that,
made a couple more tiny comment changes and pushed.
--
John Naylor
Amazon Web Services
Hi,
Recently I always get the error below during initdb.
"""
UTC [1358059] FATAL: incorrect checksum in control file
"""
The command is "initdb -D tmp". git bisect shows that the related commit is
3c6e8c123896584f1be1fe69aaf68dcb5eb094d5. After reverting this commit on
the current master, everything is fine. Does anyone know the reason?
My config.log is attached.
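For context, the message quoted above is raised when the CRC-32C stored in the control file does not match the one recomputed over the preceding bytes. The sketch below shows that kind of check on a toy structure. It is an assumption-laden illustration: `ToyControlFile`, `toy_crc32c`, `toy_control_seal` and `toy_control_is_valid` are hypothetical, and the struct is not the real control file layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for the real control file struct. */
typedef struct
{
	uint64_t	system_identifier;
	uint32_t	version;
	uint32_t	crc;		/* CRC-32C of all preceding bytes */
} ToyControlFile;

/* Bit-at-a-time CRC-32C, reflected polynomial 0x82F63B78. */
static uint32_t
toy_crc32c(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t	crc = 0xFFFFFFFFu;

	while (len--)
	{
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc ^ 0xFFFFFFFFu;
}

/* Writer side: store the CRC of everything before the crc field. */
static void
toy_control_seal(ToyControlFile *cf)
{
	cf->crc = toy_crc32c(cf, offsetof(ToyControlFile, crc));
}

/* Reader side: recompute and compare; a mismatch is the failure case. */
static bool
toy_control_is_valid(const ToyControlFile *cf)
{
	return cf->crc == toy_crc32c(cf, offsetof(ToyControlFile, crc));
}
```

If the code that writes the file and the code that reads it ever compute different CRCs for the same bytes, the reader fails in exactly this way. Whether that is what happened here cannot be determined from the report alone.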
Attachments:
config.log (application/octet-stream)
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by PostgreSQL configure 18beta1, which was
generated by GNU Autoconf 2.69. Invocation command line was
$ /u01/yizhi/github/postgres/configure --with-pgport=7432 --prefix=/u01/yizhi/bin/postgres/ --with-libxml --enable-debug --with-libxml --enable-tap-tests --enable-cassert
## --------- ##
## Platform. ##
## --------- ##
hostname = lovely-coding
uname -m = x86_64
uname -r = 5.15.0-75-generic
uname -s = Linux
uname -v = #82-Ubuntu SMP Tue Jun 6 23:10:23 UTC 2023
/usr/bin/uname -p = x86_64
/bin/uname -X = unknown
/bin/arch = x86_64
/usr/bin/arch -k = unknown
/usr/convex/getsysinfo = unknown
/usr/bin/hostinfo = unknown
/bin/machine = unknown
/usr/bin/oslevel = unknown
/bin/universe = unknown
PATH: /u01/yizhi/bin/postgres/bin/
PATH: /usr/local/sbin
PATH: /usr/local/bin
PATH: /usr/sbin
PATH: /usr/bin
PATH: /sbin
PATH: /bin
PATH: /snap/bin
PATH: /snap/bin
PATH: /home/andy/.cargo/bin
PATH: /u01/yizhi/bin/go/bin
PATH: /home/yizhi.fzh/.emacs.d/scripts
PATH: /home/yizhi.fzh/.local/bin
PATH: /home/yizhi.fzh/.emacs.d/.cache/lsp/ltex-ls/bin
## ----------- ##
## Core tests. ##
## ----------- ##
configure:2917: checking build system type
configure:2931: result: x86_64-pc-linux-gnu
configure:2951: checking host system type
configure:2964: result: x86_64-pc-linux-gnu
configure:2986: checking which template to use
configure:3052: result: linux
configure:3180: checking whether NLS is wanted
configure:3212: result: no
configure:3220: checking for default port number
configure:3245: result: 7432
configure:3699: checking for block size
configure:3733: result: 8kB
configure:3800: checking for segment size
configure:3807: result: 1GB
configure:3824: checking for WAL block size
configure:3859: result: 8kB
configure:3989: checking for C compiler version
configure:3998: clang --version >&5
clang version 18.1.6 (https://gitee.com/mirrors/llvm-project.git 1118c2e05e67a36ed8ca250524525cdb66a55256)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
configure:4009: $? = 0
configure:3998: clang -v >&5
clang version 18.1.6 (https://gitee.com/mirrors/llvm-project.git 1118c2e05e67a36ed8ca250524525cdb66a55256)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
Found candidate GCC installation: /usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.3.0
Selected GCC installation: /usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.3.0
Candidate multilib: .;@m64
Selected multilib: .;@m64
configure:4009: $? = 0
configure:3998: clang -V >&5
clang: error: argument to '-V' is missing (expected 1 value)
clang: error: no input files
configure:4009: $? = 1
configure:3998: clang -qversion >&5
clang: error: unknown argument '-qversion'; did you mean '--version'?
clang: error: no input files
configure:4009: $? = 1
configure:4029: checking whether the C compiler works
configure:4051: clang -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds conftest.c >&5
configure:4055: $? = 0
configure:4103: result: yes
configure:4106: checking for C compiler default output file name
configure:4108: result: a.out
configure:4114: checking for suffix of executables
configure:4121: clang -o conftest -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds conftest.c >&5
configure:4125: $? = 0
configure:4147: result:
configure:4169: checking whether we are cross compiling
configure:4177: clang -o conftest -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds conftest.c >&5
configure:4181: $? = 0
configure:4188: ./conftest
configure:4192: $? = 0
configure:4180: result: no
configure:4185: checking for suffix of object files
configure:4207: clang -c -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds conftest.c >&5
configure:4211: $? = 0
configure:4232: result: o
configure:4236: checking whether we are using the GNU C compiler
configure:4255: clang -c -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds conftest.c >&5
configure:4255: $? = 0
configure:4264: result: yes
configure:4273: checking whether clang accepts -g
configure:4293: clang -c -g conftest.c >&5
configure:4293: $? = 0
configure:4334: result: yes
configure:4351: checking for clang option to accept ISO C89
configure:4414: clang -c -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds conftest.c >&5
conftest.c:25:14: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C23 [-Wdeprecated-non-prototype]
25 | static char *e (p, i)
| ^
1 warning generated.
configure:4414: $? = 0
configure:4427: result: none needed
configure:4447: checking for clang option to accept ISO C99
configure:4596: clang -c -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds conftest.c >&5
conftest.c:88:15: warning: variable 'str' set but not used [-Wunused-but-set-variable]
88 | const char *str;
| ^
conftest.c:89:7: warning: variable 'number' set but not used [-Wunused-but-set-variable]
89 | int number;
| ^
conftest.c:90:9: warning: variable 'fnumber' set but not used [-Wunused-but-set-variable]
90 | float fnumber;
| ^
3 warnings generated.
configure:4596: $? = 0
configure:4609: result: none needed
configure:4743: checking for C++ compiler version
configure:4752: clang++ --version >&5
clang version 18.1.6 (https://gitee.com/mirrors/llvm-project.git 1118c2e05e67a36ed8ca250524525cdb66a55256)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
configure:4763: $? = 0
configure:4752: clang++ -v >&5
clang version 18.1.6 (https://gitee.com/mirrors/llvm-project.git 1118c2e05e67a36ed8ca250524525cdb66a55256)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
Found candidate GCC installation: /usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.3.0
Selected GCC installation: /usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.3.0
Candidate multilib: .;@m64
Selected multilib: .;@m64
configure:4763: $? = 0
configure:4752: clang++ -V >&5
clang++: error: argument to '-V' is missing (expected 1 value)
clang++: error: no input files
configure:4763: $? = 1
configure:4752: clang++ -qversion >&5
clang++: error: unknown argument '-qversion'; did you mean '--version'?
clang++: error: no input files
configure:4763: $? = 1
configure:4767: checking whether we are using the GNU C++ compiler
configure:4786: clang++ -c conftest.cpp >&5
configure:4786: $? = 0
configure:4795: result: yes
configure:4804: checking whether clang++ accepts -g
configure:4824: clang++ -c -g conftest.cpp >&5
configure:4824: $? = 0
configure:4865: result: yes
configure:4906: clang -c -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds conftest.c >&5
conftest.c:24:1: error: use of undeclared identifier 'choke'
24 | choke me
| ^
1 error generated.
configure:4906: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| /* end confdefs.h. */
|
| int
| main ()
| {
| #ifndef __INTEL_COMPILER
| choke me
| #endif
| ;
| return 0;
| }
configure:4928: clang -c -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds conftest.c >&5
conftest.c:24:1: error: use of undeclared identifier 'choke'
24 | choke me
| ^
1 error generated.
configure:4928: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| /* end confdefs.h. */
|
| int
| main ()
| {
| #ifndef __SUNPRO_C
| choke me
| #endif
| ;
| return 0;
| }
configure:4973: checking for gawk
configure:5003: result: no
configure:4973: checking for mawk
configure:4989: found /usr/bin/mawk
configure:5000: result: mawk
configure:5306: checking whether clang supports -Wdeclaration-after-statement, for CFLAGS
configure:5328: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -D_GNU_SOURCE conftest.c >&5
configure:5328: $? = 0
configure:5338: result: yes
configure:5354: checking whether clang supports -Werror=vla, for CFLAGS
configure:5376: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -D_GNU_SOURCE conftest.c >&5
configure:5376: $? = 0
configure:5386: result: yes
configure:5395: checking whether clang supports -Werror=unguarded-availability-new, for CFLAGS
configure:5417: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -D_GNU_SOURCE conftest.c >&5
configure:5417: $? = 0
configure:5427: result: yes
configure:5434: checking whether clang++ supports -Werror=unguarded-availability-new, for CXXFLAGS
configure:5462: clang++ -c -Wall -Wpointer-arith -Werror=unguarded-availability-new -D_GNU_SOURCE conftest.cpp >&5
configure:5462: $? = 0
configure:5478: result: yes
configure:5487: checking whether clang supports -Wendif-labels, for CFLAGS
configure:5509: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -D_GNU_SOURCE conftest.c >&5
configure:5509: $? = 0
configure:5519: result: yes
configure:5526: checking whether clang++ supports -Wendif-labels, for CXXFLAGS
configure:5554: clang++ -c -Wall -Wpointer-arith -Werror=unguarded-availability-new -Wendif-labels -D_GNU_SOURCE conftest.cpp >&5
configure:5554: $? = 0
configure:5570: result: yes
configure:5578: checking whether clang supports -Wmissing-format-attribute, for CFLAGS
configure:5600: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -D_GNU_SOURCE conftest.c >&5
configure:5600: $? = 0
configure:5610: result: yes
configure:5617: checking whether clang++ supports -Wmissing-format-attribute, for CXXFLAGS
configure:5645: clang++ -c -Wall -Wpointer-arith -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -D_GNU_SOURCE conftest.cpp >&5
configure:5645: $? = 0
configure:5661: result: yes
configure:5669: checking whether clang supports -Wimplicit-fallthrough=3, for CFLAGS
configure:5691: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -D_GNU_SOURCE conftest.c >&5
warning: unknown warning option '-Wimplicit-fallthrough=3'; did you mean '-Wimplicit-fallthrough'? [-Wunknown-warning-option]
1 warning generated.
configure:5691: $? = 0
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| /* end confdefs.h. */
|
| int
| main ()
| {
|
| ;
| return 0;
| }
configure:5701: result: no
configure:5708: checking whether clang++ supports -Wimplicit-fallthrough=3, for CXXFLAGS
configure:5736: clang++ -c -Wall -Wpointer-arith -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -D_GNU_SOURCE conftest.cpp >&5
warning: unknown warning option '-Wimplicit-fallthrough=3'; did you mean '-Wimplicit-fallthrough'? [-Wunknown-warning-option]
1 warning generated.
configure:5736: $? = 0
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| /* end confdefs.h. */
|
| int
| main ()
| {
|
| ;
| return 0;
| }
configure:5752: result: no
configure:5760: checking whether clang supports -Wcast-function-type, for CFLAGS
configure:5782: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -D_GNU_SOURCE conftest.c >&5
configure:5782: $? = 0
configure:5792: result: yes
configure:5799: checking whether clang++ supports -Wcast-function-type, for CXXFLAGS
configure:5827: clang++ -c -Wall -Wpointer-arith -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -D_GNU_SOURCE conftest.cpp >&5
configure:5827: $? = 0
configure:5843: result: yes
configure:5851: checking whether clang supports -Wshadow=compatible-local, for CFLAGS
configure:5873: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wshadow=compatible-local -D_GNU_SOURCE conftest.c >&5
warning: unknown warning option '-Wshadow=compatible-local'; did you mean '-Wshadow-uncaptured-local'? [-Wunknown-warning-option]
1 warning generated.
configure:5873: $? = 0
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| /* end confdefs.h. */
|
| int
| main ()
| {
|
| ;
| return 0;
| }
configure:5883: result: no
configure:5890: checking whether clang++ supports -Wshadow=compatible-local, for CXXFLAGS
configure:5918: clang++ -c -Wall -Wpointer-arith -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wshadow=compatible-local -D_GNU_SOURCE conftest.cpp >&5
warning: unknown warning option '-Wshadow=compatible-local'; did you mean '-Wshadow-uncaptured-local'? [-Wunknown-warning-option]
1 warning generated.
configure:5918: $? = 0
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| /* end confdefs.h. */
|
| int
| main ()
| {
|
| ;
| return 0;
| }
configure:5934: result: no
configure:5943: checking whether clang supports -Wformat-security, for CFLAGS
configure:5965: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -D_GNU_SOURCE conftest.c >&5
configure:5965: $? = 0
configure:5975: result: yes
configure:5982: checking whether clang++ supports -Wformat-security, for CXXFLAGS
configure:6010: clang++ -c -Wall -Wpointer-arith -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -D_GNU_SOURCE conftest.cpp >&5
configure:6010: $? = 0
configure:6026: result: yes
configure:6037: checking whether clang supports -Wmissing-variable-declarations, for CFLAGS
configure:6059: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -D_GNU_SOURCE conftest.c >&5
configure:6059: $? = 0
configure:6069: result: yes
configure:6083: checking whether clang supports -fno-strict-aliasing, for CFLAGS
configure:6105: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -D_GNU_SOURCE conftest.c >&5
configure:6105: $? = 0
configure:6115: result: yes
configure:6122: checking whether clang++ supports -fno-strict-aliasing, for CXXFLAGS
configure:6150: clang++ -c -Wall -Wpointer-arith -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -fno-strict-aliasing -D_GNU_SOURCE conftest.cpp >&5
configure:6150: $? = 0
configure:6166: result: yes
configure:6175: checking whether clang supports -fwrapv, for CFLAGS
configure:6197: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -D_GNU_SOURCE conftest.c >&5
configure:6197: $? = 0
configure:6207: result: yes
configure:6214: checking whether clang++ supports -fwrapv, for CXXFLAGS
configure:6242: clang++ -c -Wall -Wpointer-arith -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -fno-strict-aliasing -fwrapv -D_GNU_SOURCE conftest.cpp >&5
configure:6242: $? = 0
configure:6258: result: yes
configure:6267: checking whether clang supports -fexcess-precision=standard, for CFLAGS
configure:6289: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -D_GNU_SOURCE conftest.c >&5
configure:6289: $? = 0
configure:6299: result: yes
configure:6306: checking whether clang++ supports -fexcess-precision=standard, for CXXFLAGS
configure:6334: clang++ -c -Wall -Wpointer-arith -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -D_GNU_SOURCE conftest.cpp >&5
configure:6334: $? = 0
configure:6350: result: yes
configure:6358: checking whether clang supports -funroll-loops, for CFLAGS_UNROLL_LOOPS
configure:6380: clang -c -funroll-loops -D_GNU_SOURCE conftest.c >&5
configure:6380: $? = 0
configure:6390: result: yes
configure:6398: checking whether clang supports -ftree-vectorize, for CFLAGS_VECTORIZE
configure:6420: clang -c -ftree-vectorize -D_GNU_SOURCE conftest.c >&5
configure:6420: $? = 0
configure:6430: result: yes
configure:6446: checking whether clang supports -Wunused-command-line-argument, for NOT_THE_CFLAGS
configure:6468: clang -c -Wunused-command-line-argument -D_GNU_SOURCE conftest.c >&5
configure:6468: $? = 0
configure:6478: result: yes
configure:6491: checking whether clang supports -Wcompound-token-split-by-macro, for NOT_THE_CFLAGS
configure:6513: clang -c -Wcompound-token-split-by-macro -D_GNU_SOURCE conftest.c >&5
configure:6513: $? = 0
configure:6523: result: yes
configure:6535: checking whether clang supports -Wformat-truncation, for NOT_THE_CFLAGS
configure:6557: clang -c -Wformat-truncation -D_GNU_SOURCE conftest.c >&5
configure:6557: $? = 0
configure:6567: result: yes
configure:6578: checking whether clang supports -Wstringop-truncation, for NOT_THE_CFLAGS
configure:6600: clang -c -Wstringop-truncation -D_GNU_SOURCE conftest.c >&5
warning: unknown warning option '-Wstringop-truncation'; did you mean '-Wformat-truncation'? [-Wunknown-warning-option]
1 warning generated.
configure:6600: $? = 0
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| /* end confdefs.h. */
|
| int
| main ()
| {
|
| ;
| return 0;
| }
configure:6610: result: no
configure:6622: checking whether clang supports -Wcast-function-type-strict, for NOT_THE_CFLAGS
configure:6644: clang -c -Wcast-function-type-strict -D_GNU_SOURCE conftest.c >&5
configure:6644: $? = 0
configure:6654: result: yes
configure:6863: checking whether clang supports -fvisibility=hidden, for CFLAGS_SL_MODULE
configure:6885: clang -c -fvisibility=hidden -D_GNU_SOURCE conftest.c >&5
configure:6885: $? = 0
configure:6895: result: yes
configure:6903: checking whether clang++ supports -fvisibility=hidden, for CXXFLAGS_SL_MODULE
configure:6931: clang++ -c -fvisibility=hidden -D_GNU_SOURCE conftest.cpp >&5
configure:6931: $? = 0
configure:6947: result: yes
configure:6953: checking whether clang++ supports -fvisibility-inlines-hidden, for CXXFLAGS_SL_MODULE
configure:6981: clang++ -c -fvisibility=hidden -fvisibility-inlines-hidden -D_GNU_SOURCE conftest.cpp >&5
configure:6981: $? = 0
configure:6997: result: yes
configure:7625: checking whether the C compiler still works
configure:7638: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE conftest.c >&5
configure:7638: $? = 0
configure:7639: result: yes
configure:7664: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE conftest.c >&5
configure:7664: $? = 0
configure:7710: checking how to run the C preprocessor
configure:7741: clang -E -D_GNU_SOURCE conftest.c
configure:7741: $? = 0
configure:7755: clang -E -D_GNU_SOURCE conftest.c
conftest.c:20:10: fatal error: 'ac_nonexistent.h' file not found
20 | #include <ac_nonexistent.h>
| ^~~~~~~~~~~~~~~~~~
1 error generated.
configure:7755: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| /* end confdefs.h. */
| #include <ac_nonexistent.h>
configure:7780: result: clang -E
configure:7800: clang -E -D_GNU_SOURCE conftest.c
configure:7800: $? = 0
configure:7814: clang -E -D_GNU_SOURCE conftest.c
conftest.c:20:10: fatal error: 'ac_nonexistent.h' file not found
20 | #include <ac_nonexistent.h>
| ^~~~~~~~~~~~~~~~~~
1 error generated.
configure:7814: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| /* end confdefs.h. */
| #include <ac_nonexistent.h>
configure:7922: checking for pkg-config
configure:7940: found /usr/bin/pkg-config
configure:7952: result: /usr/bin/pkg-config
configure:7977: checking pkg-config is at least version 0.9.0
configure:7980: result: yes
configure:8084: checking whether to build with ICU support
configure:8114: result: yes
configure:8121: checking for icu-uc icu-i18n
configure:8128: $PKG_CONFIG --exists --print-errors "icu-uc icu-i18n"
configure:8131: $? = 0
configure:8145: $PKG_CONFIG --exists --print-errors "icu-uc icu-i18n"
configure:8148: $? = 0
configure:8206: result: yes
configure:8215: checking whether to build with Tcl
configure:8241: result: no
configure:8273: checking whether to build Perl modules
configure:8299: result: no
configure:8306: checking whether to build Python modules
configure:8332: result: no
configure:8339: checking whether to build with GSSAPI support
configure:8370: result: no
configure:8415: checking whether to build with PAM support
configure:8443: result: no
configure:8450: checking whether to build with BSD Authentication support
configure:8478: result: no
configure:8485: checking whether to build with LDAP support
configure:8513: result: no
configure:8521: checking whether to build with Bonjour support
configure:8549: result: no
configure:8556: checking whether to build with SELinux support
configure:8583: result: no
configure:8589: checking whether to build with systemd support
configure:8618: result: no
configure:8692: checking whether to build with liburing support
configure:8720: result: no
configure:8901: checking whether to build with libcurl support
configure:8929: result: no
configure:9060: checking whether to build with libnuma support
configure:9088: result: no
configure:9236: checking whether to build with XML support
configure:9264: result: yes
configure:9275: checking for libxml-2.0 >= 2.6.23
configure:9282: $PKG_CONFIG --exists --print-errors "libxml-2.0 >= 2.6.23"
configure:9285: $? = 0
configure:9299: $PKG_CONFIG --exists --print-errors "libxml-2.0 >= 2.6.23"
configure:9302: $? = 0
configure:9340: result: yes
configure:9510: checking whether to build with LZ4 support
configure:9538: result: no
configure:9651: checking whether to build with ZSTD support
configure:9679: result: no
configure:9842: checking for strip
configure:9858: found /usr/bin/strip
configure:9869: result: strip
configure:9892: checking whether it is possible to strip libraries
configure:9897: result: yes
configure:9962: checking for ar
configure:9978: found /usr/bin/ar
configure:9989: result: ar
configure:10120: checking for a BSD-compatible install
configure:10188: result: /usr/bin/install -c
configure:10213: checking for tar
configure:10231: found /usr/bin/tar
configure:10243: result: /usr/bin/tar
configure:10262: checking whether ln -s works
configure:10266: result: yes
configure:10273: checking for a thread-safe mkdir -p
configure:10312: result: /usr/bin/mkdir -p
configure:10327: checking for bison
configure:10345: found /usr/bin/bison
configure:10357: result: /usr/bin/bison
configure:10379: using bison (GNU Bison) 3.5.1
configure:10407: checking for flex
configure:10425: found /usr/bin/flex
configure:10437: result: /usr/bin/flex
configure:10461: using flex 2.6.4
configure:10473: checking for perl
configure:10491: found /usr/bin/perl
configure:10503: result: /usr/bin/perl
configure:10525: using perl 5.30.0
configure:10883: checking for a sed that does not truncate output
configure:10947: result: /usr/bin/sed
configure:10953: checking for grep that handles long lines and -e
configure:11011: result: /usr/bin/grep
configure:11016: checking for egrep
configure:11078: result: /usr/bin/grep -E
configure:11083: checking for ANSI C header files
configure:11103: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11103: $? = 0
configure:11176: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11176: $? = 0
configure:11176: ./conftest
configure:11176: $? = 0
configure:11187: result: yes
configure:11200: checking for sys/types.h
configure:11200: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11200: $? = 0
configure:11200: result: yes
configure:11200: checking for sys/stat.h
configure:11200: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11200: $? = 0
configure:11200: result: yes
configure:11200: checking for stdlib.h
configure:11200: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11200: $? = 0
configure:11200: result: yes
configure:11200: checking for string.h
configure:11200: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11200: $? = 0
configure:11200: result: yes
configure:11200: checking for memory.h
configure:11200: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11200: $? = 0
configure:11200: result: yes
configure:11200: checking for strings.h
configure:11200: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11200: $? = 0
configure:11200: result: yes
configure:11200: checking for inttypes.h
configure:11200: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11200: $? = 0
configure:11200: result: yes
configure:11200: checking for stdint.h
configure:11200: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11200: $? = 0
configure:11200: result: yes
configure:11200: checking for unistd.h
configure:11200: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11200: $? = 0
configure:11200: result: yes
configure:11397: checking whether clang is Clang
configure:11422: result: yes
configure:11469: checking whether Clang needs flag to prevent "argument unused" warning when linking with -pthread
configure:11494: clang -o conftest -Werror -Wunknown-warning-option -pthread -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11494: $? = 0
configure:11500: (clang -c -Werror -Wunknown-warning-option -pthread -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5) && (echo ==== >&5) && (clang -o conftest -Werror -Wunknown-warning-option -pthread -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.o >&5)
====
configure:11500: $? = 0
configure:11518: result: no
configure:11667: checking for joinable pthread attribute
configure:11685: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -pthread -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11685: $? = 0
configure:11693: result: PTHREAD_CREATE_JOINABLE
configure:11707: checking whether more special flags are required for pthreads
configure:11720: result: no
configure:11728: checking for PTHREAD_PRIO_INHERIT
configure:11744: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -pthread -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:39:5: warning: unused variable 'i' [-Wunused-variable]
39 | int i = PTHREAD_PRIO_INHERIT;
| ^
1 warning generated.
configure:11744: $? = 0
configure:11753: result: yes
configure:11866: checking pthread.h usability
configure:11866: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -pthread -D_REENTRANT -D_THREAD_SAFE -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11866: $? = 0
configure:11866: result: yes
configure:11866: checking pthread.h presence
configure:11866: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:11866: $? = 0
configure:11866: result: yes
configure:11866: checking for pthread.h
configure:11866: result: yes
configure:11878: checking for strerror_r
configure:11878: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -pthread -D_REENTRANT -D_THREAD_SAFE -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:11878: $? = 0
configure:11878: result: yes
configure:11889: checking whether strerror_r returns int
configure:11908: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -pthread -D_REENTRANT -D_THREAD_SAFE -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:43:3: error: statement requires expression of integer type ('char *' invalid)
43 | switch (strerror_r(1, buf, sizeof(buf)))
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
configure:11908: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| /* end confdefs.h. */
| #include <string.h>
| int
| main ()
| {
| char buf[100];
| switch (strerror_r(1, buf, sizeof(buf)))
| { case 0: break; default: break; }
|
| ;
| return 0;
| }
configure:11915: result: no
configure:11949: checking for main in -lm
configure:11968: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lm >&5
conftest.c:42:1: warning: all paths through this function will call itself [-Winfinite-recursion]
42 | {
| ^
1 warning generated.
configure:11968: $? = 0
configure:11977: result: yes
configure:11988: checking for library containing setproctitle
configure:12019: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-95081a.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `setproctitle'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:12019: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| /* end confdefs.h. */
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char setproctitle ();
| int
| main ()
| {
| return setproctitle ();
| ;
| return 0;
| }
configure:12019: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lutil -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-0b3cfb.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `setproctitle'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:12019: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| /* end confdefs.h. */
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char setproctitle ();
| int
| main ()
| {
| return setproctitle ();
| ;
| return 0;
| }
configure:12036: result: no
configure:12047: checking for library containing dlsym
configure:12078: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-20110f.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `dlsym'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:12078: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| /* end confdefs.h. */
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char dlsym ();
| int
| main ()
| {
| return dlsym ();
| ;
| return 0;
| }
configure:12078: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -ldl -lm >&5
configure:12078: $? = 0
configure:12095: result: -ldl
configure:12103: checking for library containing socket
configure:12134: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -ldl -lm >&5
configure:12134: $? = 0
configure:12151: result: none required
configure:12159: checking for library containing getopt_long
configure:12190: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -ldl -lm >&5
configure:12190: $? = 0
configure:12207: result: none required
configure:12215: checking for library containing shm_open
configure:12246: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-5fbcfa.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `shm_open'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:12246: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| /* end confdefs.h. */
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char shm_open ();
| int
| main ()
| {
| return shm_open ();
| ;
| return 0;
| }
configure:12246: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lrt -ldl -lm >&5
configure:12246: $? = 0
configure:12263: result: -lrt
configure:12271: checking for library containing shm_unlink
configure:12302: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lrt -ldl -lm >&5
configure:12302: $? = 0
configure:12319: result: none required
configure:12327: checking for library containing clock_gettime
configure:12358: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lrt -ldl -lm >&5
configure:12358: $? = 0
configure:12375: result: none required
configure:12384: checking for library containing shmget
configure:12415: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lrt -ldl -lm >&5
configure:12415: $? = 0
configure:12432: result: none required
configure:12441: checking for library containing backtrace_symbols
configure:12472: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lrt -ldl -lm >&5
configure:12472: $? = 0
configure:12489: result: none required
configure:12498: checking for library containing pthread_barrier_wait
configure:12529: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lrt -ldl -lm >&5
/usr/bin/ld: /tmp/conftest-5ec22c.o: undefined reference to symbol 'pthread_barrier_wait@@GLIBC_2.2.5'
/usr/bin/ld: /lib/x86_64-linux-gnu/libpthread.so.0: error adding symbols: DSO missing from command line
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:12529: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| /* end confdefs.h. */
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char pthread_barrier_wait ();
| int
| main ()
| {
| return pthread_barrier_wait ();
| ;
| return 0;
| }
configure:12529: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lpthread -lrt -ldl -lm >&5
configure:12529: $? = 0
configure:12546: result: -lpthread
configure:12558: checking for library containing readline
configure:12590: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lreadline -lpthread -lrt -ldl -lm >&5
configure:12590: $? = 0
configure:12616: result: -lreadline
configure:12635: checking for inflate in -lz
configure:12660: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:12660: $? = 0
configure:12669: result: yes
configure:13182: checking for xmlSaveToBuffer in -lxml2
configure:13207: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:13207: $? = 0
configure:13216: result: yes
configure:13767: checking atomic.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:75:10: fatal error: 'atomic.h' file not found
   75 | #include <atomic.h>
      |          ^~~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| /* end confdefs.h. */
| #include <stdio.h>
| #ifdef HAVE_SYS_TYPES_H
| # include <sys/types.h>
| #endif
| #ifdef HAVE_SYS_STAT_H
| # include <sys/stat.h>
| #endif
| #ifdef STDC_HEADERS
| # include <stdlib.h>
| # include <stddef.h>
| #else
| # ifdef HAVE_STDLIB_H
| # include <stdlib.h>
| # endif
| #endif
| #ifdef HAVE_STRING_H
| # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
| # include <memory.h>
| # endif
| # include <string.h>
| #endif
| #ifdef HAVE_STRINGS_H
| # include <strings.h>
| #endif
| #ifdef HAVE_INTTYPES_H
| # include <inttypes.h>
| #endif
| #ifdef HAVE_STDINT_H
| # include <stdint.h>
| #endif
| #ifdef HAVE_UNISTD_H
| # include <unistd.h>
| #endif
| #include <atomic.h>
configure:13767: result: no
configure:13767: checking atomic.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
conftest.c:42:10: fatal error: 'atomic.h' file not found
   42 | #include <atomic.h>
      |          ^~~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| /* end confdefs.h. */
| #include <atomic.h>
configure:13767: result: no
configure:13767: checking for atomic.h
configure:13767: result: no
configure:13767: checking copyfile.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:75:10: fatal error: 'copyfile.h' file not found
   75 | #include <copyfile.h>
      |          ^~~~~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| /* end confdefs.h. */
| #include <stdio.h>
| #ifdef HAVE_SYS_TYPES_H
| # include <sys/types.h>
| #endif
| #ifdef HAVE_SYS_STAT_H
| # include <sys/stat.h>
| #endif
| #ifdef STDC_HEADERS
| # include <stdlib.h>
| # include <stddef.h>
| #else
| # ifdef HAVE_STDLIB_H
| # include <stdlib.h>
| # endif
| #endif
| #ifdef HAVE_STRING_H
| # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
| # include <memory.h>
| # endif
| # include <string.h>
| #endif
| #ifdef HAVE_STRINGS_H
| # include <strings.h>
| #endif
| #ifdef HAVE_INTTYPES_H
| # include <inttypes.h>
| #endif
| #ifdef HAVE_STDINT_H
| # include <stdint.h>
| #endif
| #ifdef HAVE_UNISTD_H
| # include <unistd.h>
| #endif
| #include <copyfile.h>
configure:13767: result: no
configure:13767: checking copyfile.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
conftest.c:42:10: fatal error: 'copyfile.h' file not found
   42 | #include <copyfile.h>
      |          ^~~~~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| /* end confdefs.h. */
| #include <copyfile.h>
configure:13767: result: no
configure:13767: checking for copyfile.h
configure:13767: result: no
configure:13767: checking execinfo.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking execinfo.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking for execinfo.h
configure:13767: result: yes
configure:13767: checking getopt.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking getopt.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking for getopt.h
configure:13767: result: yes
configure:13767: checking ifaddrs.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking ifaddrs.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking for ifaddrs.h
configure:13767: result: yes
configure:13767: checking mbarrier.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:78:10: fatal error: 'mbarrier.h' file not found
   78 | #include <mbarrier.h>
      |          ^~~~~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| /* end confdefs.h. */
| #include <stdio.h>
| #ifdef HAVE_SYS_TYPES_H
| # include <sys/types.h>
| #endif
| #ifdef HAVE_SYS_STAT_H
| # include <sys/stat.h>
| #endif
| #ifdef STDC_HEADERS
| # include <stdlib.h>
| # include <stddef.h>
| #else
| # ifdef HAVE_STDLIB_H
| # include <stdlib.h>
| # endif
| #endif
| #ifdef HAVE_STRING_H
| # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
| # include <memory.h>
| # endif
| # include <string.h>
| #endif
| #ifdef HAVE_STRINGS_H
| # include <strings.h>
| #endif
| #ifdef HAVE_INTTYPES_H
| # include <inttypes.h>
| #endif
| #ifdef HAVE_STDINT_H
| # include <stdint.h>
| #endif
| #ifdef HAVE_UNISTD_H
| # include <unistd.h>
| #endif
| #include <mbarrier.h>
configure:13767: result: no
configure:13767: checking mbarrier.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
conftest.c:45:10: fatal error: 'mbarrier.h' file not found
45 | #include <mbarrier.h>
| ^~~~~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| /* end confdefs.h. */
| #include <mbarrier.h>
configure:13767: result: no
configure:13767: checking for mbarrier.h
configure:13767: result: no
configure:13767: checking sys/epoll.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking sys/epoll.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking for sys/epoll.h
configure:13767: result: yes
configure:13767: checking sys/event.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:79:10: fatal error: 'sys/event.h' file not found
79 | #include <sys/event.h>
| ^~~~~~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| /* end confdefs.h. */
| #include <stdio.h>
| #ifdef HAVE_SYS_TYPES_H
| # include <sys/types.h>
| #endif
| #ifdef HAVE_SYS_STAT_H
| # include <sys/stat.h>
| #endif
| #ifdef STDC_HEADERS
| # include <stdlib.h>
| # include <stddef.h>
| #else
| # ifdef HAVE_STDLIB_H
| # include <stdlib.h>
| # endif
| #endif
| #ifdef HAVE_STRING_H
| # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
| # include <memory.h>
| # endif
| # include <string.h>
| #endif
| #ifdef HAVE_STRINGS_H
| # include <strings.h>
| #endif
| #ifdef HAVE_INTTYPES_H
| # include <inttypes.h>
| #endif
| #ifdef HAVE_STDINT_H
| # include <stdint.h>
| #endif
| #ifdef HAVE_UNISTD_H
| # include <unistd.h>
| #endif
| #include <sys/event.h>
configure:13767: result: no
configure:13767: checking sys/event.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
conftest.c:46:10: fatal error: 'sys/event.h' file not found
46 | #include <sys/event.h>
| ^~~~~~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| /* end confdefs.h. */
| #include <sys/event.h>
configure:13767: result: no
configure:13767: checking for sys/event.h
configure:13767: result: no
configure:13767: checking sys/personality.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking sys/personality.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking for sys/personality.h
configure:13767: result: yes
configure:13767: checking sys/prctl.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking sys/prctl.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking for sys/prctl.h
configure:13767: result: yes
configure:13767: checking sys/procctl.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:81:10: fatal error: 'sys/procctl.h' file not found
81 | #include <sys/procctl.h>
| ^~~~~~~~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| /* end confdefs.h. */
| #include <stdio.h>
| #ifdef HAVE_SYS_TYPES_H
| # include <sys/types.h>
| #endif
| #ifdef HAVE_SYS_STAT_H
| # include <sys/stat.h>
| #endif
| #ifdef STDC_HEADERS
| # include <stdlib.h>
| # include <stddef.h>
| #else
| # ifdef HAVE_STDLIB_H
| # include <stdlib.h>
| # endif
| #endif
| #ifdef HAVE_STRING_H
| # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
| # include <memory.h>
| # endif
| # include <string.h>
| #endif
| #ifdef HAVE_STRINGS_H
| # include <strings.h>
| #endif
| #ifdef HAVE_INTTYPES_H
| # include <inttypes.h>
| #endif
| #ifdef HAVE_STDINT_H
| # include <stdint.h>
| #endif
| #ifdef HAVE_UNISTD_H
| # include <unistd.h>
| #endif
| #include <sys/procctl.h>
configure:13767: result: no
configure:13767: checking sys/procctl.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
conftest.c:48:10: fatal error: 'sys/procctl.h' file not found
48 | #include <sys/procctl.h>
| ^~~~~~~~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| /* end confdefs.h. */
| #include <sys/procctl.h>
configure:13767: result: no
configure:13767: checking for sys/procctl.h
configure:13767: result: no
configure:13767: checking sys/signalfd.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking sys/signalfd.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking for sys/signalfd.h
configure:13767: result: yes
configure:13767: checking sys/ucred.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:82:10: fatal error: 'sys/ucred.h' file not found
82 | #include <sys/ucred.h>
| ^~~~~~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| /* end confdefs.h. */
| #include <stdio.h>
| #ifdef HAVE_SYS_TYPES_H
| # include <sys/types.h>
| #endif
| #ifdef HAVE_SYS_STAT_H
| # include <sys/stat.h>
| #endif
| #ifdef STDC_HEADERS
| # include <stdlib.h>
| # include <stddef.h>
| #else
| # ifdef HAVE_STDLIB_H
| # include <stdlib.h>
| # endif
| #endif
| #ifdef HAVE_STRING_H
| # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
| # include <memory.h>
| # endif
| # include <string.h>
| #endif
| #ifdef HAVE_STRINGS_H
| # include <strings.h>
| #endif
| #ifdef HAVE_INTTYPES_H
| # include <inttypes.h>
| #endif
| #ifdef HAVE_STDINT_H
| # include <stdint.h>
| #endif
| #ifdef HAVE_UNISTD_H
| # include <unistd.h>
| #endif
| #include <sys/ucred.h>
configure:13767: result: no
configure:13767: checking sys/ucred.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
conftest.c:49:10: fatal error: 'sys/ucred.h' file not found
49 | #include <sys/ucred.h>
| ^~~~~~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| /* end confdefs.h. */
| #include <sys/ucred.h>
configure:13767: result: no
configure:13767: checking for sys/ucred.h
configure:13767: result: no
configure:13767: checking termios.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking termios.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:13767: $? = 0
configure:13767: result: yes
configure:13767: checking for termios.h
configure:13767: result: yes
configure:13767: checking ucred.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:83:10: fatal error: 'ucred.h' file not found
83 | #include <ucred.h>
| ^~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| /* end confdefs.h. */
| #include <stdio.h>
| #ifdef HAVE_SYS_TYPES_H
| # include <sys/types.h>
| #endif
| #ifdef HAVE_SYS_STAT_H
| # include <sys/stat.h>
| #endif
| #ifdef STDC_HEADERS
| # include <stdlib.h>
| # include <stddef.h>
| #else
| # ifdef HAVE_STDLIB_H
| # include <stdlib.h>
| # endif
| #endif
| #ifdef HAVE_STRING_H
| # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
| # include <memory.h>
| # endif
| # include <string.h>
| #endif
| #ifdef HAVE_STRINGS_H
| # include <strings.h>
| #endif
| #ifdef HAVE_INTTYPES_H
| # include <inttypes.h>
| #endif
| #ifdef HAVE_STDINT_H
| # include <stdint.h>
| #endif
| #ifdef HAVE_UNISTD_H
| # include <unistd.h>
| #endif
| #include <ucred.h>
configure:13767: result: no
configure:13767: checking ucred.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
conftest.c:50:10: fatal error: 'ucred.h' file not found
50 | #include <ucred.h>
| ^~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| /* end confdefs.h. */
| #include <ucred.h>
configure:13767: result: no
configure:13767: checking for ucred.h
configure:13767: result: no
configure:13767: checking xlocale.h usability
configure:13767: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:83:10: fatal error: 'xlocale.h' file not found
83 | #include <xlocale.h>
| ^~~~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| /* end confdefs.h. */
| #include <stdio.h>
| #ifdef HAVE_SYS_TYPES_H
| # include <sys/types.h>
| #endif
| #ifdef HAVE_SYS_STAT_H
| # include <sys/stat.h>
| #endif
| #ifdef STDC_HEADERS
| # include <stdlib.h>
| # include <stddef.h>
| #else
| # ifdef HAVE_STDLIB_H
| # include <stdlib.h>
| # endif
| #endif
| #ifdef HAVE_STRING_H
| # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
| # include <memory.h>
| # endif
| # include <string.h>
| #endif
| #ifdef HAVE_STRINGS_H
| # include <strings.h>
| #endif
| #ifdef HAVE_INTTYPES_H
| # include <inttypes.h>
| #endif
| #ifdef HAVE_STDINT_H
| # include <stdint.h>
| #endif
| #ifdef HAVE_UNISTD_H
| # include <unistd.h>
| #endif
| #include <xlocale.h>
configure:13767: result: no
configure:13767: checking xlocale.h presence
configure:13767: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
conftest.c:50:10: fatal error: 'xlocale.h' file not found
50 | #include <xlocale.h>
| ^~~~~~~~~~~
1 error generated.
configure:13767: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| /* end confdefs.h. */
| #include <xlocale.h>
configure:13767: result: no
configure:13767: checking for xlocale.h
configure:13767: result: no
configure:13781: checking readline/readline.h usability
configure:13781: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:13781: $? = 0
configure:13781: result: yes
configure:13781: checking readline/readline.h presence
configure:13781: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:13781: $? = 0
configure:13781: result: yes
configure:13781: checking for readline/readline.h
configure:13781: result: yes
configure:13811: checking readline/history.h usability
configure:13811: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:13811: $? = 0
configure:13811: result: yes
configure:13811: checking readline/history.h presence
configure:13811: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:13811: $? = 0
configure:13811: result: yes
configure:13811: checking for readline/history.h
configure:13811: result: yes
configure:13933: checking zlib.h usability
configure:13933: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:13933: $? = 0
configure:13933: result: yes
configure:13933: checking zlib.h presence
configure:13933: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:13933: $? = 0
configure:13933: result: yes
configure:13933: checking for zlib.h
configure:13933: result: yes
configure:13951: checking for lz4
configure:13984: result: no
configure:14016: checking for zstd
configure:14034: found /usr/bin/zstd
configure:14046: result: /usr/bin/zstd
configure:14138: checking for openssl
configure:14156: found /usr/bin/openssl
configure:14168: result: /usr/bin/openssl
configure:14188: using openssl: OpenSSL 1.1.1f 31 Mar 2020
configure:14262: checking libxml/parser.h usability
configure:14262: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:14262: $? = 0
configure:14262: result: yes
configure:14262: checking libxml/parser.h presence
configure:14262: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:14262: $? = 0
configure:14262: result: yes
configure:14262: checking for libxml/parser.h
configure:14262: result: yes
configure:14520: checking whether byte ordering is bigendian
configure:14535: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:53:9: error: unknown type name 'not'
53 | not a universal capable compiler
| ^
conftest.c:53:14: error: expected ';' after top level declarator
53 | not a universal capable compiler
| ^
| ;
2 errors generated.
configure:14535: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| /* end confdefs.h. */
| #ifndef __APPLE_CC__
| not a universal capable compiler
| #endif
| typedef int dummy;
|
configure:14580: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:14580: $? = 0
configure:14598: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:59:4: error: use of undeclared identifier 'not'
59 | not big endian
| ^
1 error generated.
configure:14598: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| /* end confdefs.h. */
| #include <sys/types.h>
| #include <sys/param.h>
|
| int
| main ()
| {
| #if BYTE_ORDER != BIG_ENDIAN
| not big endian
| #endif
|
| ;
| return 0;
| }
configure:14726: result: no
configure:14744: checking for inline
configure:14760: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:54:21: warning: unused function 'static_foo' [-Wunused-function]
54 | static inline foo_t static_foo () {return 0; }
| ^~~~~~~~~~
1 warning generated.
configure:14760: $? = 0
configure:14768: result: inline
configure:14786: checking for printf format archetype
configure:14806: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:53:16: warning: 'format' attribute argument not supported: gnu_printf [-Wignored-attributes]
53 | __attribute__((format(gnu_printf, 2, 3)));
| ^
1 warning generated.
configure:14806: $? = 0
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| /* end confdefs.h. */
| extern void pgac_write(int ignore, const char *fmt,...)
| __attribute__((format(gnu_printf, 2, 3)));
| int
| main ()
| {
| pgac_write(0, "error %s: %m", "foo");
| ;
| return 0;
| }
configure:14830: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:53:16: warning: 'format' attribute argument not supported: syslog [-Wignored-attributes]
53 | __attribute__((format(__syslog__, 2, 3)));
| ^
1 warning generated.
configure:14830: $? = 0
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| /* end confdefs.h. */
| extern void pgac_write(int ignore, const char *fmt,...)
| __attribute__((format(__syslog__, 2, 3)));
| int
| main ()
| {
| pgac_write(0, "error %s: %m", "foo");
| ;
| return 0;
| }
configure:14843: result: printf
configure:14851: checking for _Static_assert
configure:14867: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:14867: $? = 0
configure:14875: result: yes
configure:14882: checking for typeof
configure:14903: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:59:1: error: call to undeclared function 'typeof'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
59 | typeof(x) y;
| ^
conftest.c:59:10: error: expected ';' after expression
59 | typeof(x) y;
| ^
| ;
conftest.c:59:11: error: use of undeclared identifier 'y'
59 | typeof(x) y;
| ^
conftest.c:60:1: error: use of undeclared identifier 'y'
60 | y = x;
| ^
conftest.c:61:8: error: use of undeclared identifier 'y'
61 | return y;
| ^
5 errors generated.
configure:14903: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| /* end confdefs.h. */
|
| int
| main ()
| {
| int x = 0;
| typeof(x) y;
| y = x;
| return y;
| ;
| return 0;
| }
configure:14903: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:14903: $? = 0
configure:14910: result: __typeof__
configure:14924: checking for __builtin_types_compatible_p
configure:14940: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:60:20: warning: unused variable 'y' [-Wunused-variable]
60 | int x; static int y[__builtin_types_compatible_p(__typeof__(x), int)];
| ^
1 warning generated.
configure:14940: $? = 0
configure:14947: result: yes
configure:14954: checking for __builtin_constant_p
configure:14967: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:58:14: warning: unused variable 'y' [-Wunused-variable]
58 | static int y[__builtin_constant_p(x) ? x : 1];
| ^
conftest.c:59:14: warning: unused variable 'z' [-Wunused-variable]
59 | static int z[__builtin_constant_p("string literal") ? 1 : x];
| ^
2 warnings generated.
configure:14967: $? = 0
configure:14974: result: yes
configure:14981: checking for __builtin_mul_overflow
configure:15003: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
conftest.c:60:9: warning: no previous extern declaration for non-static variable 'a' [-Wmissing-variable-declarations]
60 | int64_t a = 1;
| ^
conftest.c:60:1: note: declare 'static' if the variable is not intended to be used outside of this translation unit
60 | int64_t a = 1;
| ^
conftest.c:61:9: warning: no previous extern declaration for non-static variable 'b' [-Wmissing-variable-declarations]
61 | int64_t b = 1;
| ^
conftest.c:61:1: note: declare 'static' if the variable is not intended to be used outside of this translation unit
61 | int64_t b = 1;
| ^
conftest.c:62:9: warning: no previous extern declaration for non-static variable 'result' [-Wmissing-variable-declarations]
62 | int64_t result;
| ^
conftest.c:62:1: note: declare 'static' if the variable is not intended to be used outside of this translation unit
62 | int64_t result;
| ^
conftest.c:63:5: warning: no previous extern declaration for non-static variable 'oflo' [-Wmissing-variable-declarations]
63 | int oflo;
| ^
conftest.c:63:1: note: declare 'static' if the variable is not intended to be used outside of this translation unit
63 | int oflo;
| ^
4 warnings generated.
configure:15003: $? = 0
configure:15011: result: yes
configure:15018: checking for __builtin_unreachable
configure:15034: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:15034: $? = 0
configure:15042: result: yes
configure:15049: checking for computed goto support
configure:15069: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:15069: $? = 0
configure:15076: result: yes
configure:15083: checking for struct tm.tm_zone
configure:15083: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:15083: $? = 0
configure:15083: result: yes
configure:15097: checking for union semun
configure:15097: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:70:5: error: invalid application of 'sizeof' to an incomplete type 'union semun'
70 | if (sizeof (union semun))
| ^ ~~~~~~~~~~~~~
conftest.c:70:19: note: forward declaration of 'union semun'
70 | if (sizeof (union semun))
| ^
1 error generated.
configure:15097: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| /* end confdefs.h. */
| #include <sys/types.h>
| #include <sys/ipc.h>
| #include <sys/sem.h>
|
|
| int
| main ()
| {
| if (sizeof (union semun))
| return 0;
| ;
| return 0;
| }
configure:15097: result: no
configure:15111: checking for socklen_t
configure:15111: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:15111: $? = 0
configure:15111: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:67:24: error: expected expression
67 | if (sizeof ((socklen_t)))
| ^
1 error generated.
configure:15111: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| /* end confdefs.h. */
| #include <sys/socket.h>
|
| int
| main ()
| {
| if (sizeof ((socklen_t)))
| return 0;
| ;
| return 0;
| }
configure:15111: result: yes
configure:15122: checking for struct sockaddr.sa_len
configure:15122: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:71:13: error: no member named 'sa_len' in 'struct sockaddr'
71 | if (ac_aggr.sa_len)
| ~~~~~~~ ^
1 error generated.
configure:15122: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| /* end confdefs.h. */
| #include <sys/types.h>
| #include <sys/socket.h>
|
|
| int
| main ()
| {
| static struct sockaddr ac_aggr;
| if (ac_aggr.sa_len)
| return 0;
| ;
| return 0;
| }
configure:15122: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:71:20: error: no member named 'sa_len' in 'struct sockaddr'
71 | if (sizeof ac_aggr.sa_len)
| ~~~~~~~ ^
1 error generated.
configure:15122: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| /* end confdefs.h. */
| #include <sys/types.h>
| #include <sys/socket.h>
|
|
| int
| main ()
| {
| static struct sockaddr ac_aggr;
| if (sizeof ac_aggr.sa_len)
| return 0;
| ;
| return 0;
| }
configure:15122: result: no
configure:15140: checking for C/C++ restrict keyword
configure:15165: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:64:6: warning: no previous prototype for function 'foo' [-Wmissing-prototypes]
64 | int foo (int_ptr __restrict ip) {
| ^
conftest.c:64:2: note: declare 'static' if the function is not intended to be used outside of this translation unit
64 | int foo (int_ptr __restrict ip) {
| ^
| static
1 warning generated.
configure:15165: $? = 0
configure:15173: result: __restrict
configure:15197: checking for struct option
configure:15197: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:15197: $? = 0
configure:15197: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:72:28: error: expected expression
72 | if (sizeof ((struct option)))
| ^
1 error generated.
configure:15197: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| /* end confdefs.h. */
| #ifdef HAVE_GETOPT_H
| #include <getopt.h>
| #endif
|
| int
| main ()
| {
| if (sizeof ((struct option)))
| return 0;
| ;
| return 0;
| }
configure:15197: result: yes
configure:15214: checking whether assembler supports x86_64 popcntq
configure:15231: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:15231: $? = 0
configure:15238: result: yes
configure:15302: checking for special C compiler options needed for large files
configure:15347: result: no
configure:15353: checking for _FILE_OFFSET_BITS value needed for large files
configure:15378: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:73:7: warning: no previous extern declaration for non-static variable 'off_t_is_large' [-Wmissing-variable-declarations]
73 | int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
| ^
conftest.c:73:3: note: declare 'static' if the variable is not intended to be used outside of this translation unit
73 | int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
| ^
1 warning generated.
configure:15378: $? = 0
configure:15410: result: no
configure:15502: checking size of off_t
configure:15507: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:15507: $? = 0
configure:15507: ./conftest
configure:15507: $? = 0
configure:15521: result: 8
configure:15544: checking for int timezone
configure:15565: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
conftest.c:69:5: warning: no previous extern declaration for non-static variable 'res' [-Wmissing-variable-declarations]
69 | int res;
| ^
conftest.c:69:1: note: declare 'static' if the variable is not intended to be used outside of this translation unit
69 | int res;
| ^
1 warning generated.
configure:15565: $? = 0
configure:15573: result: yes
configure:15591: checking for backtrace_symbols
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:15591: $? = 0
configure:15591: result: yes
configure:15591: checking for copyfile
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-646232.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `copyfile'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:15591: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| /* end confdefs.h. */
| /* Define copyfile to an innocuous variant, in case <limits.h> declares copyfile.
| For example, HP-UX 11i <limits.h> declares gettimeofday. */
| #define copyfile innocuous_copyfile
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char copyfile (); below.
| Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
| <limits.h> exists even on freestanding compilers. */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef copyfile
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char copyfile ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS. Some functions are actually named
| something starting with __ and the normal name is an alias. */
| #if defined __stub_copyfile || defined __stub___copyfile
| choke me
| #endif
|
| int
| main ()
| {
| return copyfile ();
| ;
| return 0;
| }
configure:15591: result: no
configure:15591: checking for copy_file_range
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:15591: $? = 0
configure:15591: result: yes
configure:15591: checking for elf_aux_info
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-44bf57.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `elf_aux_info'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:15591: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| /* end confdefs.h. */
| /* Define elf_aux_info to an innocuous variant, in case <limits.h> declares elf_aux_info.
| For example, HP-UX 11i <limits.h> declares gettimeofday. */
| #define elf_aux_info innocuous_elf_aux_info
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char elf_aux_info (); below.
| Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
| <limits.h> exists even on freestanding compilers. */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef elf_aux_info
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char elf_aux_info ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS. Some functions are actually named
| something starting with __ and the normal name is an alias. */
| #if defined __stub_elf_aux_info || defined __stub___elf_aux_info
| choke me
| #endif
|
| int
| main ()
| {
| return elf_aux_info ();
| ;
| return 0;
| }
configure:15591: result: no
configure:15591: checking for getauxval
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:15591: $? = 0
configure:15591: result: yes
configure:15591: checking for getifaddrs
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:15591: $? = 0
configure:15591: result: yes
configure:15591: checking for getpeerucred
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-ff70ac.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `getpeerucred'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:15591: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| /* end confdefs.h. */
| /* Define getpeerucred to an innocuous variant, in case <limits.h> declares getpeerucred.
| For example, HP-UX 11i <limits.h> declares gettimeofday. */
| #define getpeerucred innocuous_getpeerucred
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char getpeerucred (); below.
| Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
| <limits.h> exists even on freestanding compilers. */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef getpeerucred
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char getpeerucred ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS. Some functions are actually named
| something starting with __ and the normal name is an alias. */
| #if defined __stub_getpeerucred || defined __stub___getpeerucred
| choke me
| #endif
|
| int
| main ()
| {
| return getpeerucred ();
| ;
| return 0;
| }
configure:15591: result: no
configure:15591: checking for inet_pton
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:15591: $? = 0
configure:15591: result: yes
configure:15591: checking for kqueue
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-662591.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `kqueue'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:15591: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| /* end confdefs.h. */
| /* Define kqueue to an innocuous variant, in case <limits.h> declares kqueue.
| For example, HP-UX 11i <limits.h> declares gettimeofday. */
| #define kqueue innocuous_kqueue
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char kqueue (); below.
| Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
| <limits.h> exists even on freestanding compilers. */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef kqueue
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char kqueue ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS. Some functions are actually named
| something starting with __ and the normal name is an alias. */
| #if defined __stub_kqueue || defined __stub___kqueue
| choke me
| #endif
|
| int
| main ()
| {
| return kqueue ();
| ;
| return 0;
| }
configure:15591: result: no
configure:15591: checking for localeconv_l
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-bbc8fd.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `localeconv_l'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:15591: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| /* end confdefs.h. */
| /* Define localeconv_l to an innocuous variant, in case <limits.h> declares localeconv_l.
| For example, HP-UX 11i <limits.h> declares gettimeofday. */
| #define localeconv_l innocuous_localeconv_l
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char localeconv_l (); below.
| Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
| <limits.h> exists even on freestanding compilers. */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef localeconv_l
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char localeconv_l ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS. Some functions are actually named
| something starting with __ and the normal name is an alias. */
| #if defined __stub_localeconv_l || defined __stub___localeconv_l
| choke me
| #endif
|
| int
| main ()
| {
| return localeconv_l ();
| ;
| return 0;
| }
configure:15591: result: no
configure:15591: checking for mbstowcs_l
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-599ca2.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `mbstowcs_l'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:15591: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| /* end confdefs.h. */
| /* Define mbstowcs_l to an innocuous variant, in case <limits.h> declares mbstowcs_l.
| For example, HP-UX 11i <limits.h> declares gettimeofday. */
| #define mbstowcs_l innocuous_mbstowcs_l
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char mbstowcs_l (); below.
| Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
| <limits.h> exists even on freestanding compilers. */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef mbstowcs_l
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char mbstowcs_l ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS. Some functions are actually named
| something starting with __ and the normal name is an alias. */
| #if defined __stub_mbstowcs_l || defined __stub___mbstowcs_l
| choke me
| #endif
|
| int
| main ()
| {
| return mbstowcs_l ();
| ;
| return 0;
| }
configure:15591: result: no
configure:15591: checking for posix_fallocate
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:15591: $? = 0
configure:15591: result: yes
configure:15591: checking for ppoll
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:15591: $? = 0
configure:15591: result: yes
configure:15591: checking for pthread_is_threaded_np
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-5d0a93.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `pthread_is_threaded_np'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:15591: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| /* end confdefs.h. */
| /* Define pthread_is_threaded_np to an innocuous variant, in case <limits.h> declares pthread_is_threaded_np.
| For example, HP-UX 11i <limits.h> declares gettimeofday. */
| #define pthread_is_threaded_np innocuous_pthread_is_threaded_np
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char pthread_is_threaded_np (); below.
| Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
| <limits.h> exists even on freestanding compilers. */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef pthread_is_threaded_np
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char pthread_is_threaded_np ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS. Some functions are actually named
| something starting with __ and the normal name is an alias. */
| #if defined __stub_pthread_is_threaded_np || defined __stub___pthread_is_threaded_np
| choke me
| #endif
|
| int
| main ()
| {
| return pthread_is_threaded_np ();
| ;
| return 0;
| }
configure:15591: result: no
configure:15591: checking for setproctitle
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-d250e6.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `setproctitle'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:15591: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| /* end confdefs.h. */
| /* Define setproctitle to an innocuous variant, in case <limits.h> declares setproctitle.
| For example, HP-UX 11i <limits.h> declares gettimeofday. */
| #define setproctitle innocuous_setproctitle
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char setproctitle (); below.
| Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
| <limits.h> exists even on freestanding compilers. */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef setproctitle
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char setproctitle ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS. Some functions are actually named
| something starting with __ and the normal name is an alias. */
| #if defined __stub_setproctitle || defined __stub___setproctitle
| choke me
| #endif
|
| int
| main ()
| {
| return setproctitle ();
| ;
| return 0;
| }
configure:15591: result: no
configure:15591: checking for setproctitle_fast
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-fe0671.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `setproctitle_fast'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:15591: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| /* end confdefs.h. */
| /* Define setproctitle_fast to an innocuous variant, in case <limits.h> declares setproctitle_fast.
| For example, HP-UX 11i <limits.h> declares gettimeofday. */
| #define setproctitle_fast innocuous_setproctitle_fast
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char setproctitle_fast (); below.
| Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
| <limits.h> exists even on freestanding compilers. */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef setproctitle_fast
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char setproctitle_fast ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS. Some functions are actually named
| something starting with __ and the normal name is an alias. */
| #if defined __stub_setproctitle_fast || defined __stub___setproctitle_fast
| choke me
| #endif
|
| int
| main ()
| {
| return setproctitle_fast ();
| ;
| return 0;
| }
configure:15591: result: no
configure:15591: checking for strsignal
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:15591: $? = 0
configure:15591: result: yes
configure:15591: checking for syncfs
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:15591: $? = 0
configure:15591: result: yes
configure:15591: checking for sync_file_range
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:15591: $? = 0
configure:15591: result: yes
configure:15591: checking for uselocale
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:15591: $? = 0
configure:15591: result: yes
configure:15591: checking for wcstombs_l
configure:15591: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-467721.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `wcstombs_l'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:15591: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| /* end confdefs.h. */
| /* Define wcstombs_l to an innocuous variant, in case <limits.h> declares wcstombs_l.
| For example, HP-UX 11i <limits.h> declares gettimeofday. */
| #define wcstombs_l innocuous_wcstombs_l
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char wcstombs_l (); below.
| Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
| <limits.h> exists even on freestanding compilers. */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef wcstombs_l
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char wcstombs_l ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS. Some functions are actually named
| something starting with __ and the normal name is an alias. */
| #if defined __stub_wcstombs_l || defined __stub___wcstombs_l
| choke me
| #endif
|
| int
| main ()
| {
| return wcstombs_l ();
| ;
| return 0;
| }
configure:15591: result: no
configure:15602: checking for __builtin_bswap16
configure:15623: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
conftest.c:82:1: warning: no previous prototype for function 'call__builtin_bswap16' [-Wmissing-prototypes]
82 | call__builtin_bswap16(int x)
| ^
conftest.c:81:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
81 | int
| ^
| static
1 warning generated.
configure:15623: $? = 0
configure:15631: result: yes
configure:15640: checking for __builtin_bswap32
configure:15661: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
conftest.c:83:1: warning: no previous prototype for function 'call__builtin_bswap32' [-Wmissing-prototypes]
83 | call__builtin_bswap32(int x)
| ^
conftest.c:82:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
82 | int
| ^
| static
1 warning generated.
configure:15661: $? = 0
configure:15669: result: yes
configure:15678: checking for __builtin_bswap64
configure:15699: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
conftest.c:84:1: warning: no previous prototype for function 'call__builtin_bswap64' [-Wmissing-prototypes]
84 | call__builtin_bswap64(long int x)
| ^
conftest.c:83:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
83 | int
| ^
| static
1 warning generated.
configure:15699: $? = 0
configure:15707: result: yes
configure:15717: checking for __builtin_clz
configure:15738: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
conftest.c:85:1: warning: no previous prototype for function 'call__builtin_clz' [-Wmissing-prototypes]
85 | call__builtin_clz(unsigned int x)
| ^
conftest.c:84:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
84 | int
| ^
| static
1 warning generated.
configure:15738: $? = 0
configure:15746: result: yes
configure:15755: checking for __builtin_ctz
configure:15776: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
conftest.c:86:1: warning: no previous prototype for function 'call__builtin_ctz' [-Wmissing-prototypes]
86 | call__builtin_ctz(unsigned int x)
| ^
conftest.c:85:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
85 | int
| ^
| static
1 warning generated.
configure:15776: $? = 0
configure:15784: result: yes
configure:15793: checking for __builtin_popcount
configure:15814: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
conftest.c:87:1: warning: no previous prototype for function 'call__builtin_popcount' [-Wmissing-prototypes]
87 | call__builtin_popcount(unsigned int x)
| ^
conftest.c:86:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
86 | int
| ^
| static
1 warning generated.
configure:15814: $? = 0
configure:15822: result: yes
configure:15833: checking for __builtin_frame_address
configure:15854: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
conftest.c:88:1: warning: no previous prototype for function 'call__builtin_frame_address' [-Wmissing-prototypes]
88 | call__builtin_frame_address(void)
| ^
conftest.c:87:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
87 | void *
| ^
| static
1 warning generated.
configure:15854: $? = 0
configure:15862: result: yes
configure:15874: checking for _LARGEFILE_SOURCE value needed for large files
configure:15893: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:15893: $? = 0
configure:15921: result: no
configure:15955: checking how clang reports undeclared, standard C functions
configure:15971: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:92:8: error: call to undeclared library function 'strchr' with type 'char *(const char *, int)'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
92 | (void) strchr;
| ^
conftest.c:92:8: note: include the header <string.h> or explicitly provide a declaration for 'strchr'
1 error generated.
configure:15971: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| /* end confdefs.h. */
|
| int
| main ()
| {
| (void) strchr;
| ;
| return 0;
| }
configure:16026: result: error
configure:16038: checking for posix_fadvise
configure:16038: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:16038: $? = 0
configure:16038: result: yes
configure:16047: checking whether posix_fadvise is declared
configure:16047: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:16047: $? = 0
configure:16047: result: yes
configure:16062: checking whether fdatasync is declared
configure:16062: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:16062: $? = 0
configure:16062: result: yes
configure:16074: checking whether strlcat is declared
configure:16074: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:131:10: error: use of undeclared identifier 'strlcat'; did you mean 'struct'?
131 | (void) strlcat;
| ^~~~~~~
| struct
conftest.c:131:10: error: expected expression
2 errors generated.
configure:16074: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| #define HAVE_POSIX_FADVISE 1
| #define HAVE_DECL_POSIX_FADVISE 1
| #define HAVE_DECL_FDATASYNC 1
| /* end confdefs.h. */
| #include <stdio.h>
| #ifdef HAVE_SYS_TYPES_H
| # include <sys/types.h>
| #endif
| #ifdef HAVE_SYS_STAT_H
| # include <sys/stat.h>
| #endif
| #ifdef STDC_HEADERS
| # include <stdlib.h>
| # include <stddef.h>
| #else
| # ifdef HAVE_STDLIB_H
| # include <stdlib.h>
| # endif
| #endif
| #ifdef HAVE_STRING_H
| # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
| # include <memory.h>
| # endif
| # include <string.h>
| #endif
| #ifdef HAVE_STRINGS_H
| # include <strings.h>
| #endif
| #ifdef HAVE_INTTYPES_H
| # include <inttypes.h>
| #endif
| #ifdef HAVE_STDINT_H
| # include <stdint.h>
| #endif
| #ifdef HAVE_UNISTD_H
| # include <unistd.h>
| #endif
| int
| main ()
| {
| #ifndef strlcat
| #ifdef __cplusplus
| (void) strlcat;
| #else
| (void) strlcat;
| #endif
| #endif
|
| ;
| return 0;
| }
configure:16074: result: no
configure:16084: checking whether strlcpy is declared
configure:16084: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:132:10: error: use of undeclared identifier 'strlcpy'; did you mean 'struct'?
132 | (void) strlcpy;
| ^~~~~~~
| struct
conftest.c:132:10: error: expected expression
2 errors generated.
configure:16084: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| #define HAVE_POSIX_FADVISE 1
| #define HAVE_DECL_POSIX_FADVISE 1
| #define HAVE_DECL_FDATASYNC 1
| #define HAVE_DECL_STRLCAT 0
| /* end confdefs.h. */
| #include <stdio.h>
| #ifdef HAVE_SYS_TYPES_H
| # include <sys/types.h>
| #endif
| #ifdef HAVE_SYS_STAT_H
| # include <sys/stat.h>
| #endif
| #ifdef STDC_HEADERS
| # include <stdlib.h>
| # include <stddef.h>
| #else
| # ifdef HAVE_STDLIB_H
| # include <stdlib.h>
| # endif
| #endif
| #ifdef HAVE_STRING_H
| # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
| # include <memory.h>
| # endif
| # include <string.h>
| #endif
| #ifdef HAVE_STRINGS_H
| # include <strings.h>
| #endif
| #ifdef HAVE_INTTYPES_H
| # include <inttypes.h>
| #endif
| #ifdef HAVE_STDINT_H
| # include <stdint.h>
| #endif
| #ifdef HAVE_UNISTD_H
| # include <unistd.h>
| #endif
| int
| main ()
| {
| #ifndef strlcpy
| #ifdef __cplusplus
| (void) strlcpy;
| #else
| (void) strlcpy;
| #endif
| #endif
|
| ;
| return 0;
| }
configure:16084: result: no
configure:16094: checking whether strnlen is declared
configure:16094: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:16094: $? = 0
configure:16094: result: yes
configure:16104: checking whether strsep is declared
configure:16104: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:16104: $? = 0
configure:16104: result: yes
configure:16114: checking whether timingsafe_bcmp is declared
configure:16114: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:135:10: error: use of undeclared identifier 'timingsafe_bcmp'
135 | (void) timingsafe_bcmp;
| ^
1 error generated.
configure:16114: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| #define HAVE_POSIX_FADVISE 1
| #define HAVE_DECL_POSIX_FADVISE 1
| #define HAVE_DECL_FDATASYNC 1
| #define HAVE_DECL_STRLCAT 0
| #define HAVE_DECL_STRLCPY 0
| #define HAVE_DECL_STRNLEN 1
| #define HAVE_DECL_STRSEP 1
| /* end confdefs.h. */
| #include <stdio.h>
| #ifdef HAVE_SYS_TYPES_H
| # include <sys/types.h>
| #endif
| #ifdef HAVE_SYS_STAT_H
| # include <sys/stat.h>
| #endif
| #ifdef STDC_HEADERS
| # include <stdlib.h>
| # include <stddef.h>
| #else
| # ifdef HAVE_STDLIB_H
| # include <stdlib.h>
| # endif
| #endif
| #ifdef HAVE_STRING_H
| # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
| # include <memory.h>
| # endif
| # include <string.h>
| #endif
| #ifdef HAVE_STRINGS_H
| # include <strings.h>
| #endif
| #ifdef HAVE_INTTYPES_H
| # include <inttypes.h>
| #endif
| #ifdef HAVE_STDINT_H
| # include <stdint.h>
| #endif
| #ifdef HAVE_UNISTD_H
| # include <unistd.h>
| #endif
| int
| main ()
| {
| #ifndef timingsafe_bcmp
| #ifdef __cplusplus
| (void) timingsafe_bcmp;
| #else
| (void) timingsafe_bcmp;
| #endif
| #endif
|
| ;
| return 0;
| }
configure:16114: result: no
configure:16128: checking whether preadv is declared
configure:16128: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:16128: $? = 0
configure:16128: result: yes
configure:16140: checking whether pwritev is declared
configure:16140: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:16140: $? = 0
configure:16140: result: yes
configure:16152: checking whether strchrnul is declared
configure:16152: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:16152: $? = 0
configure:16152: result: yes
configure:16164: checking whether memset_s is declared
configure:16164: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:109:10: error: use of undeclared identifier 'memset_s'
109 | (void) memset_s;
| ^
1 error generated.
configure:16164: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| #define HAVE_POSIX_FADVISE 1
| #define HAVE_DECL_POSIX_FADVISE 1
| #define HAVE_DECL_FDATASYNC 1
| #define HAVE_DECL_STRLCAT 0
| #define HAVE_DECL_STRLCPY 0
| #define HAVE_DECL_STRNLEN 1
| #define HAVE_DECL_STRSEP 1
| #define HAVE_DECL_TIMINGSAFE_BCMP 0
| #define HAVE_DECL_PREADV 1
| #define HAVE_DECL_PWRITEV 1
| #define HAVE_DECL_STRCHRNUL 1
| /* end confdefs.h. */
| #define __STDC_WANT_LIB_EXT1__ 1
| #include <string.h>
|
| int
| main ()
| {
| #ifndef memset_s
| #ifdef __cplusplus
| (void) memset_s;
| #else
| (void) memset_s;
| #endif
| #endif
|
| ;
| return 0;
| }
configure:16164: result: no
configure:16179: checking whether F_FULLFSYNC is declared
configure:16179: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:109:10: error: use of undeclared identifier 'F_FULLFSYNC'
109 | (void) F_FULLFSYNC;
| ^
1 error generated.
configure:16179: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| #define HAVE_POSIX_FADVISE 1
| #define HAVE_DECL_POSIX_FADVISE 1
| #define HAVE_DECL_FDATASYNC 1
| #define HAVE_DECL_STRLCAT 0
| #define HAVE_DECL_STRLCPY 0
| #define HAVE_DECL_STRNLEN 1
| #define HAVE_DECL_STRSEP 1
| #define HAVE_DECL_TIMINGSAFE_BCMP 0
| #define HAVE_DECL_PREADV 1
| #define HAVE_DECL_PWRITEV 1
| #define HAVE_DECL_STRCHRNUL 1
| #define HAVE_DECL_MEMSET_S 0
| /* end confdefs.h. */
| #include <fcntl.h>
|
| int
| main ()
| {
| #ifndef F_FULLFSYNC
| #ifdef __cplusplus
| (void) F_FULLFSYNC;
| #else
| (void) F_FULLFSYNC;
| #endif
| #endif
|
| ;
| return 0;
| }
configure:16179: result: no
configure:16192: checking for explicit_bzero
configure:16192: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:16192: $? = 0
configure:16192: result: yes
configure:16205: checking for getopt
configure:16205: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:16205: $? = 0
configure:16205: result: yes
configure:16218: checking for getpeereid
configure:16218: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-033287.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `getpeereid'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:16218: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| #define HAVE_POSIX_FADVISE 1
| #define HAVE_DECL_POSIX_FADVISE 1
| #define HAVE_DECL_FDATASYNC 1
| #define HAVE_DECL_STRLCAT 0
| #define HAVE_DECL_STRLCPY 0
| #define HAVE_DECL_STRNLEN 1
| #define HAVE_DECL_STRSEP 1
| #define HAVE_DECL_TIMINGSAFE_BCMP 0
| #define HAVE_DECL_PREADV 1
| #define HAVE_DECL_PWRITEV 1
| #define HAVE_DECL_STRCHRNUL 1
| #define HAVE_DECL_MEMSET_S 0
| #define HAVE_DECL_F_FULLFSYNC 0
| #define HAVE_EXPLICIT_BZERO 1
| #define HAVE_GETOPT 1
| /* end confdefs.h. */
| /* Define getpeereid to an innocuous variant, in case <limits.h> declares getpeereid.
| For example, HP-UX 11i <limits.h> declares gettimeofday. */
| #define getpeereid innocuous_getpeereid
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char getpeereid (); below.
| Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
| <limits.h> exists even on freestanding compilers. */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef getpeereid
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char getpeereid ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS. Some functions are actually named
| something starting with __ and the normal name is an alias. */
| #if defined __stub_getpeereid || defined __stub___getpeereid
| choke me
| #endif
|
| int
| main ()
| {
| return getpeereid ();
| ;
| return 0;
| }
configure:16218: result: no
configure:16231: checking for inet_aton
configure:16231: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:16231: $? = 0
configure:16231: result: yes
configure:16244: checking for mkdtemp
configure:16244: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:16244: $? = 0
configure:16244: result: yes
configure:16257: checking for strlcat
configure:16257: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-cc9328.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `strlcat'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:16257: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| #define HAVE_POSIX_FADVISE 1
| #define HAVE_DECL_POSIX_FADVISE 1
| #define HAVE_DECL_FDATASYNC 1
| #define HAVE_DECL_STRLCAT 0
| #define HAVE_DECL_STRLCPY 0
| #define HAVE_DECL_STRNLEN 1
| #define HAVE_DECL_STRSEP 1
| #define HAVE_DECL_TIMINGSAFE_BCMP 0
| #define HAVE_DECL_PREADV 1
| #define HAVE_DECL_PWRITEV 1
| #define HAVE_DECL_STRCHRNUL 1
| #define HAVE_DECL_MEMSET_S 0
| #define HAVE_DECL_F_FULLFSYNC 0
| #define HAVE_EXPLICIT_BZERO 1
| #define HAVE_GETOPT 1
| #define HAVE_INET_ATON 1
| #define HAVE_MKDTEMP 1
| /* end confdefs.h. */
| /* Define strlcat to an innocuous variant, in case <limits.h> declares strlcat.
| For example, HP-UX 11i <limits.h> declares gettimeofday. */
| #define strlcat innocuous_strlcat
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char strlcat (); below.
| Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
| <limits.h> exists even on freestanding compilers. */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef strlcat
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char strlcat ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS. Some functions are actually named
| something starting with __ and the normal name is an alias. */
| #if defined __stub_strlcat || defined __stub___strlcat
| choke me
| #endif
|
| int
| main ()
| {
| return strlcat ();
| ;
| return 0;
| }
configure:16257: result: no
configure:16270: checking for strlcpy
configure:16270: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-4b526e.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `strlcpy'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:16270: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| #define HAVE_POSIX_FADVISE 1
| #define HAVE_DECL_POSIX_FADVISE 1
| #define HAVE_DECL_FDATASYNC 1
| #define HAVE_DECL_STRLCAT 0
| #define HAVE_DECL_STRLCPY 0
| #define HAVE_DECL_STRNLEN 1
| #define HAVE_DECL_STRSEP 1
| #define HAVE_DECL_TIMINGSAFE_BCMP 0
| #define HAVE_DECL_PREADV 1
| #define HAVE_DECL_PWRITEV 1
| #define HAVE_DECL_STRCHRNUL 1
| #define HAVE_DECL_MEMSET_S 0
| #define HAVE_DECL_F_FULLFSYNC 0
| #define HAVE_EXPLICIT_BZERO 1
| #define HAVE_GETOPT 1
| #define HAVE_INET_ATON 1
| #define HAVE_MKDTEMP 1
| /* end confdefs.h. */
| /* Define strlcpy to an innocuous variant, in case <limits.h> declares strlcpy.
| For example, HP-UX 11i <limits.h> declares gettimeofday. */
| #define strlcpy innocuous_strlcpy
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char strlcpy (); below.
| Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
| <limits.h> exists even on freestanding compilers. */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef strlcpy
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char strlcpy ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS. Some functions are actually named
| something starting with __ and the normal name is an alias. */
| #if defined __stub_strlcpy || defined __stub___strlcpy
| choke me
| #endif
|
| int
| main ()
| {
| return strlcpy ();
| ;
| return 0;
| }
configure:16270: result: no
configure:16283: checking for strnlen
configure:16283: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:16283: $? = 0
configure:16283: result: yes
configure:16296: checking for strsep
configure:16296: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:16296: $? = 0
configure:16296: result: yes
configure:16309: checking for timingsafe_bcmp
configure:16309: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-b31644.o: in function `main':
conftest.c:(.text+0x12): undefined reference to `timingsafe_bcmp'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:16309: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| #define HAVE_POSIX_FADVISE 1
| #define HAVE_DECL_POSIX_FADVISE 1
| #define HAVE_DECL_FDATASYNC 1
| #define HAVE_DECL_STRLCAT 0
| #define HAVE_DECL_STRLCPY 0
| #define HAVE_DECL_STRNLEN 1
| #define HAVE_DECL_STRSEP 1
| #define HAVE_DECL_TIMINGSAFE_BCMP 0
| #define HAVE_DECL_PREADV 1
| #define HAVE_DECL_PWRITEV 1
| #define HAVE_DECL_STRCHRNUL 1
| #define HAVE_DECL_MEMSET_S 0
| #define HAVE_DECL_F_FULLFSYNC 0
| #define HAVE_EXPLICIT_BZERO 1
| #define HAVE_GETOPT 1
| #define HAVE_INET_ATON 1
| #define HAVE_MKDTEMP 1
| #define HAVE_STRNLEN 1
| #define HAVE_STRSEP 1
| /* end confdefs.h. */
| /* Define timingsafe_bcmp to an innocuous variant, in case <limits.h> declares timingsafe_bcmp.
| For example, HP-UX 11i <limits.h> declares gettimeofday. */
| #define timingsafe_bcmp innocuous_timingsafe_bcmp
|
| /* System header to define __stub macros and hopefully few prototypes,
| which can conflict with char timingsafe_bcmp (); below.
| Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
| <limits.h> exists even on freestanding compilers. */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef timingsafe_bcmp
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char timingsafe_bcmp ();
| /* The GNU C library defines this for functions which it implements
| to always fail with ENOSYS. Some functions are actually named
| something starting with __ and the normal name is an alias. */
| #if defined __stub_timingsafe_bcmp || defined __stub___timingsafe_bcmp
| choke me
| #endif
|
| int
| main ()
| {
| return timingsafe_bcmp ();
| ;
| return 0;
| }
configure:16309: result: no
configure:16324: checking for pthread_barrier_wait
configure:16324: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:16324: $? = 0
configure:16324: result: yes
configure:16356: checking for getopt_long
configure:16356: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:16356: $? = 0
configure:16356: result: yes
configure:16528: checking for syslog
configure:16528: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:16528: $? = 0
configure:16528: result: yes
configure:16530: checking syslog.h usability
configure:16530: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:16530: $? = 0
configure:16530: result: yes
configure:16530: checking syslog.h presence
configure:16530: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:16530: $? = 0
configure:16530: result: yes
configure:16530: checking for syslog.h
configure:16530: result: yes
configure:16541: checking for opterr
configure:16557: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
configure:16557: $? = 0
configure:16565: result: yes
configure:16573: checking for optreset
configure:16589: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lpthread -lrt -ldl -lm >&5
/usr/bin/ld: /usr/bin/ld: DWARF error: invalid or unhandled FORM value: 0x25
/tmp/conftest-06cada.o: in function `main':
conftest.c:(.text+0xe): undefined reference to `optreset'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:16589: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| #define HAVE_POSIX_FADVISE 1
| #define HAVE_DECL_POSIX_FADVISE 1
| #define HAVE_DECL_FDATASYNC 1
| #define HAVE_DECL_STRLCAT 0
| #define HAVE_DECL_STRLCPY 0
| #define HAVE_DECL_STRNLEN 1
| #define HAVE_DECL_STRSEP 1
| #define HAVE_DECL_TIMINGSAFE_BCMP 0
| #define HAVE_DECL_PREADV 1
| #define HAVE_DECL_PWRITEV 1
| #define HAVE_DECL_STRCHRNUL 1
| #define HAVE_DECL_MEMSET_S 0
| #define HAVE_DECL_F_FULLFSYNC 0
| #define HAVE_EXPLICIT_BZERO 1
| #define HAVE_GETOPT 1
| #define HAVE_INET_ATON 1
| #define HAVE_MKDTEMP 1
| #define HAVE_STRNLEN 1
| #define HAVE_STRSEP 1
| #define HAVE_PTHREAD_BARRIER_WAIT 1
| #define HAVE_GETOPT_LONG 1
| #define HAVE_SYSLOG 1
| #define HAVE_INT_OPTERR 1
| /* end confdefs.h. */
| #include <unistd.h>
| int
| main ()
| {
| extern int optreset; optreset = 1;
| ;
| return 0;
| }
configure:16597: result: no
configure:16610: checking unicode/ucol.h usability
configure:16610: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
configure:16610: $? = 0
configure:16610: result: yes
configure:16610: checking unicode/ucol.h presence
configure:16610: clang -E -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c
configure:16610: $? = 0
configure:16610: result: yes
configure:16610: checking for unicode/ucol.h
configure:16610: result: yes
configure:16658: checking for rl_completion_suppress_quote
configure:16682: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16682: $? = 0
configure:16690: result: yes
configure:16697: checking for rl_filename_quote_characters
configure:16721: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16721: $? = 0
configure:16729: result: yes
configure:16736: checking for rl_filename_quoting_function
configure:16760: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16760: $? = 0
configure:16768: result: yes
configure:16779: checking for append_history
configure:16779: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16779: $? = 0
configure:16779: result: yes
configure:16779: checking for history_truncate_file
configure:16779: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16779: $? = 0
configure:16779: result: yes
configure:16779: checking for rl_completion_matches
configure:16779: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16779: $? = 0
configure:16779: result: yes
configure:16779: checking for rl_filename_completion_function
configure:16779: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16779: $? = 0
configure:16779: result: yes
configure:16779: checking for rl_reset_screen_size
configure:16779: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16779: $? = 0
configure:16779: result: yes
configure:16779: checking for rl_variable_bind
configure:16779: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16779: $? = 0
configure:16779: result: yes
configure:16795: checking test program
configure:16805: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16805: $? = 0
configure:16805: ./conftest
configure:16805: $? = 0
configure:16806: result: ok
configure:16830: checking size of void *
configure:16835: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16835: $? = 0
configure:16835: ./conftest
configure:16835: $? = 0
configure:16849: result: 8
configure:16863: checking size of size_t
configure:16868: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16868: $? = 0
configure:16868: ./conftest
configure:16868: $? = 0
configure:16882: result: 8
configure:16896: checking size of long
configure:16901: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16901: $? = 0
configure:16901: ./conftest
configure:16901: $? = 0
configure:16915: result: 8
configure:16929: checking size of long long
configure:16934: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16934: $? = 0
configure:16934: ./conftest
configure:16934: $? = 0
configure:16948: result: 8
configure:16963: checking alignment of short
configure:16968: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:16968: $? = 0
configure:16968: ./conftest
configure:16968: $? = 0
configure:16986: result: 2
configure:16998: checking alignment of int
configure:17003: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:17003: $? = 0
configure:17003: ./conftest
configure:17003: $? = 0
configure:17021: result: 4
configure:17033: checking alignment of long
configure:17038: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:17038: $? = 0
configure:17038: ./conftest
configure:17038: $? = 0
configure:17056: result: 8
configure:17068: checking alignment of int64_t
configure:17073: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:17073: $? = 0
configure:17073: ./conftest
configure:17073: $? = 0
configure:17091: result: 8
configure:17103: checking alignment of double
configure:17108: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:17108: $? = 0
configure:17108: ./conftest
configure:17108: $? = 0
configure:17126: result: 8
configure:17168: checking for __int128
configure:17205: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
conftest.c:139:10: warning: no previous extern declaration for non-static variable 'a' [-Wmissing-variable-declarations]
  139 | __int128 a = 48828125;
      |          ^
conftest.c:139:1: note: declare 'static' if the variable is not intended to be used outside of this translation unit
  139 | __int128 a = 48828125;
      | ^
conftest.c:140:10: warning: no previous extern declaration for non-static variable 'b' [-Wmissing-variable-declarations]
  140 | __int128 b = 97656250;
      |          ^
conftest.c:140:1: note: declare 'static' if the variable is not intended to be used outside of this translation unit
  140 | __int128 b = 97656250;
      | ^
2 warnings generated.
configure:17205: $? = 0
configure:17213: result: yes
configure:17220: checking for __int128 alignment bug
configure:17260: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
conftest.c:143:6: warning: no previous prototype for function 'pass_by_val' [-Wmissing-prototypes]
  143 | void pass_by_val(void *buffer, int128a par) { holder = par; }
      |      ^
conftest.c:143:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
  143 | void pass_by_val(void *buffer, int128a par) { holder = par; }
      | ^
      | static
conftest.c:142:9: warning: no previous extern declaration for non-static variable 'holder' [-Wmissing-variable-declarations]
  142 | int128a holder;
      |         ^
conftest.c:142:1: note: declare 'static' if the variable is not intended to be used outside of this translation unit
  142 | int128a holder;
      | ^
2 warnings generated.
configure:17260: $? = 0
configure:17260: ./conftest
configure:17260: $? = 0
configure:17270: result: ok
configure:17278: checking alignment of PG_INT128_TYPE
configure:17283: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:17283: $? = 0
configure:17283: ./conftest
configure:17283: $? = 0
configure:17301: result: 16
configure:17316: checking for builtin __sync char locking functions
configure:17334: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:17334: $? = 0
configure:17342: result: yes
configure:17349: checking for builtin __sync int32 locking functions
configure:17367: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:17367: $? = 0
configure:17375: result: yes
configure:17382: checking for builtin __sync int32 atomic operations
configure:17399: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:17399: $? = 0
configure:17407: result: yes
configure:17414: checking for builtin __sync int64 atomic operations
configure:17431: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:17431: $? = 0
configure:17439: result: yes
configure:17446: checking for builtin __atomic int32 atomic operations
configure:17464: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:17464: $? = 0
configure:17472: result: yes
configure:17479: checking for builtin __atomic int64 atomic operations
configure:17497: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:17497: $? = 0
configure:17505: result: yes
configure:17515: checking for __get_cpuid
configure:17533: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:17533: $? = 0
configure:17541: result: yes
configure:17549: checking for __get_cpuid_count
configure:17567: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:17567: $? = 0
configure:17575: result: yes
configure:17583: checking for __cpuid
configure:17601: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
In file included from conftest.c:140:
/usr/local/lib/clang/18/include/intrin.h:12:15: fatal error: 'intrin.h' file not found
   12 | #include_next <intrin.h>
      |               ^~~~~~~~~~
1 error generated.
configure:17601: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| #define HAVE_POSIX_FADVISE 1
| #define HAVE_DECL_POSIX_FADVISE 1
| #define HAVE_DECL_FDATASYNC 1
| #define HAVE_DECL_STRLCAT 0
| #define HAVE_DECL_STRLCPY 0
| #define HAVE_DECL_STRNLEN 1
| #define HAVE_DECL_STRSEP 1
| #define HAVE_DECL_TIMINGSAFE_BCMP 0
| #define HAVE_DECL_PREADV 1
| #define HAVE_DECL_PWRITEV 1
| #define HAVE_DECL_STRCHRNUL 1
| #define HAVE_DECL_MEMSET_S 0
| #define HAVE_DECL_F_FULLFSYNC 0
| #define HAVE_EXPLICIT_BZERO 1
| #define HAVE_GETOPT 1
| #define HAVE_INET_ATON 1
| #define HAVE_MKDTEMP 1
| #define HAVE_STRNLEN 1
| #define HAVE_STRSEP 1
| #define HAVE_PTHREAD_BARRIER_WAIT 1
| #define HAVE_GETOPT_LONG 1
| #define HAVE_SYSLOG 1
| #define HAVE_INT_OPTERR 1
| #define HAVE_RL_COMPLETION_SUPPRESS_QUOTE 1
| #define HAVE_RL_FILENAME_QUOTE_CHARACTERS 1
| #define HAVE_RL_FILENAME_QUOTING_FUNCTION 1
| #define HAVE_APPEND_HISTORY 1
| #define HAVE_HISTORY_TRUNCATE_FILE 1
| #define HAVE_RL_COMPLETION_MATCHES 1
| #define HAVE_RL_FILENAME_COMPLETION_FUNCTION 1
| #define HAVE_RL_RESET_SCREEN_SIZE 1
| #define HAVE_RL_VARIABLE_BIND 1
| #define SIZEOF_VOID_P 8
| #define SIZEOF_SIZE_T 8
| #define SIZEOF_LONG 8
| #define SIZEOF_LONG_LONG 8
| #define ALIGNOF_SHORT 2
| #define ALIGNOF_INT 4
| #define ALIGNOF_LONG 8
| #define ALIGNOF_INT64_T 8
| #define ALIGNOF_DOUBLE 8
| #define MAXIMUM_ALIGNOF 8
| #define PG_INT128_TYPE __int128
| #define ALIGNOF_PG_INT128_TYPE 16
| #define HAVE_GCC__SYNC_CHAR_TAS 1
| #define HAVE_GCC__SYNC_INT32_TAS 1
| #define HAVE_GCC__SYNC_INT32_CAS 1
| #define HAVE_GCC__SYNC_INT64_CAS 1
| #define HAVE_GCC__ATOMIC_INT32_CAS 1
| #define HAVE_GCC__ATOMIC_INT64_CAS 1
| #define HAVE__GET_CPUID 1
| #define HAVE__GET_CPUID_COUNT 1
| /* end confdefs.h. */
| #include <intrin.h>
| int
| main ()
| {
| unsigned int exx[4] = {0, 0, 0, 0};
| __get_cpuid(exx[0], 1);
|
| ;
| return 0;
| }
configure:17609: result: no
configure:17617: checking for __cpuidex
configure:17635: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
In file included from conftest.c:140:
/usr/local/lib/clang/18/include/intrin.h:12:15: fatal error: 'intrin.h' file not found
   12 | #include_next <intrin.h>
      |               ^~~~~~~~~~
1 error generated.
configure:17635: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| #define HAVE_POSIX_FADVISE 1
| #define HAVE_DECL_POSIX_FADVISE 1
| #define HAVE_DECL_FDATASYNC 1
| #define HAVE_DECL_STRLCAT 0
| #define HAVE_DECL_STRLCPY 0
| #define HAVE_DECL_STRNLEN 1
| #define HAVE_DECL_STRSEP 1
| #define HAVE_DECL_TIMINGSAFE_BCMP 0
| #define HAVE_DECL_PREADV 1
| #define HAVE_DECL_PWRITEV 1
| #define HAVE_DECL_STRCHRNUL 1
| #define HAVE_DECL_MEMSET_S 0
| #define HAVE_DECL_F_FULLFSYNC 0
| #define HAVE_EXPLICIT_BZERO 1
| #define HAVE_GETOPT 1
| #define HAVE_INET_ATON 1
| #define HAVE_MKDTEMP 1
| #define HAVE_STRNLEN 1
| #define HAVE_STRSEP 1
| #define HAVE_PTHREAD_BARRIER_WAIT 1
| #define HAVE_GETOPT_LONG 1
| #define HAVE_SYSLOG 1
| #define HAVE_INT_OPTERR 1
| #define HAVE_RL_COMPLETION_SUPPRESS_QUOTE 1
| #define HAVE_RL_FILENAME_QUOTE_CHARACTERS 1
| #define HAVE_RL_FILENAME_QUOTING_FUNCTION 1
| #define HAVE_APPEND_HISTORY 1
| #define HAVE_HISTORY_TRUNCATE_FILE 1
| #define HAVE_RL_COMPLETION_MATCHES 1
| #define HAVE_RL_FILENAME_COMPLETION_FUNCTION 1
| #define HAVE_RL_RESET_SCREEN_SIZE 1
| #define HAVE_RL_VARIABLE_BIND 1
| #define SIZEOF_VOID_P 8
| #define SIZEOF_SIZE_T 8
| #define SIZEOF_LONG 8
| #define SIZEOF_LONG_LONG 8
| #define ALIGNOF_SHORT 2
| #define ALIGNOF_INT 4
| #define ALIGNOF_LONG 8
| #define ALIGNOF_INT64_T 8
| #define ALIGNOF_DOUBLE 8
| #define MAXIMUM_ALIGNOF 8
| #define PG_INT128_TYPE __int128
| #define ALIGNOF_PG_INT128_TYPE 16
| #define HAVE_GCC__SYNC_CHAR_TAS 1
| #define HAVE_GCC__SYNC_INT32_TAS 1
| #define HAVE_GCC__SYNC_INT32_CAS 1
| #define HAVE_GCC__SYNC_INT64_CAS 1
| #define HAVE_GCC__ATOMIC_INT32_CAS 1
| #define HAVE_GCC__ATOMIC_INT64_CAS 1
| #define HAVE__GET_CPUID 1
| #define HAVE__GET_CPUID_COUNT 1
| /* end confdefs.h. */
| #include <intrin.h>
| int
| main ()
| {
| unsigned int exx[4] = {0, 0, 0, 0};
| __get_cpuidex(exx[0], 7, 0);
|
| ;
| return 0;
| }
configure:17643: result: no
configure:17653: checking for _xgetbv
configure:17676: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:17676: $? = 0
configure:17684: result: yes
configure:17699: checking for _mm512_popcnt_epi64
configure:17731: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
conftest.c:143:10: warning: no previous extern declaration for non-static variable 'buf' [-Wmissing-variable-declarations]
143 | char buf[sizeof(__m512i)];
| ^
conftest.c:143:5: note: declare 'static' if the variable is not intended to be used outside of this translation unit
143 | char buf[sizeof(__m512i)];
| ^
1 warning generated.
configure:17731: $? = 0
configure:17739: result: yes
configure:17825: checking for _mm_crc32_u8 and _mm_crc32_u32
configure:17852: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
conftest.c:143:18: warning: no previous extern declaration for non-static variable 'crc' [-Wmissing-variable-declarations]
143 | unsigned int crc;
| ^
conftest.c:143:5: note: declare 'static' if the variable is not intended to be used outside of this translation unit
143 | unsigned int crc;
| ^
1 warning generated.
configure:17852: $? = 0
configure:17860: result: yes
configure:17884: clang -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c >&5
conftest.c:148:2: error: __SSE4_2__ not defined
148 | #error __SSE4_2__ not defined
| ^
1 error generated.
configure:17884: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| #define HAVE_POSIX_FADVISE 1
| #define HAVE_DECL_POSIX_FADVISE 1
| #define HAVE_DECL_FDATASYNC 1
| #define HAVE_DECL_STRLCAT 0
| #define HAVE_DECL_STRLCPY 0
| #define HAVE_DECL_STRNLEN 1
| #define HAVE_DECL_STRSEP 1
| #define HAVE_DECL_TIMINGSAFE_BCMP 0
| #define HAVE_DECL_PREADV 1
| #define HAVE_DECL_PWRITEV 1
| #define HAVE_DECL_STRCHRNUL 1
| #define HAVE_DECL_MEMSET_S 0
| #define HAVE_DECL_F_FULLFSYNC 0
| #define HAVE_EXPLICIT_BZERO 1
| #define HAVE_GETOPT 1
| #define HAVE_INET_ATON 1
| #define HAVE_MKDTEMP 1
| #define HAVE_STRNLEN 1
| #define HAVE_STRSEP 1
| #define HAVE_PTHREAD_BARRIER_WAIT 1
| #define HAVE_GETOPT_LONG 1
| #define HAVE_SYSLOG 1
| #define HAVE_INT_OPTERR 1
| #define HAVE_RL_COMPLETION_SUPPRESS_QUOTE 1
| #define HAVE_RL_FILENAME_QUOTE_CHARACTERS 1
| #define HAVE_RL_FILENAME_QUOTING_FUNCTION 1
| #define HAVE_APPEND_HISTORY 1
| #define HAVE_HISTORY_TRUNCATE_FILE 1
| #define HAVE_RL_COMPLETION_MATCHES 1
| #define HAVE_RL_FILENAME_COMPLETION_FUNCTION 1
| #define HAVE_RL_RESET_SCREEN_SIZE 1
| #define HAVE_RL_VARIABLE_BIND 1
| #define SIZEOF_VOID_P 8
| #define SIZEOF_SIZE_T 8
| #define SIZEOF_LONG 8
| #define SIZEOF_LONG_LONG 8
| #define ALIGNOF_SHORT 2
| #define ALIGNOF_INT 4
| #define ALIGNOF_LONG 8
| #define ALIGNOF_INT64_T 8
| #define ALIGNOF_DOUBLE 8
| #define MAXIMUM_ALIGNOF 8
| #define PG_INT128_TYPE __int128
| #define ALIGNOF_PG_INT128_TYPE 16
| #define HAVE_GCC__SYNC_CHAR_TAS 1
| #define HAVE_GCC__SYNC_INT32_TAS 1
| #define HAVE_GCC__SYNC_INT32_CAS 1
| #define HAVE_GCC__SYNC_INT64_CAS 1
| #define HAVE_GCC__ATOMIC_INT32_CAS 1
| #define HAVE_GCC__ATOMIC_INT64_CAS 1
| #define HAVE__GET_CPUID 1
| #define HAVE__GET_CPUID_COUNT 1
| #define HAVE_XSAVE_INTRINSICS 1
| #define USE_AVX512_POPCNT_WITH_RUNTIME_CHECK 1
| /* end confdefs.h. */
|
| int
| main ()
| {
|
| #ifndef __SSE4_2__
| #error __SSE4_2__ not defined
| #endif
|
| ;
| return 0;
| }
configure:17895: checking for __crc32cb, __crc32ch, __crc32cw, and __crc32cd with CFLAGS=
configure:17919: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
In file included from conftest.c:142:
/usr/local/lib/clang/18/include/arm_acle.h:21:2: error: "ACLE intrinsics support not enabled."
21 | #error "ACLE intrinsics support not enabled."
| ^
conftest.c:147:7: error: call to undeclared function '__crc32cb'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
147 | crc = __crc32cb(crc, 0);
| ^
conftest.c:148:10: error: call to undeclared function '__crc32ch'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
148 | crc = __crc32ch(crc, 0);
| ^
conftest.c:149:10: error: call to undeclared function '__crc32cw'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
149 | crc = __crc32cw(crc, 0);
| ^
conftest.c:150:10: error: call to undeclared function '__crc32cd'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
150 | crc = __crc32cd(crc, 0);
| ^
conftest.c:143:14: warning: no previous extern declaration for non-static variable 'crc' [-Wmissing-variable-declarations]
143 | unsigned int crc;
| ^
conftest.c:143:1: note: declare 'static' if the variable is not intended to be used outside of this translation unit
143 | unsigned int crc;
| ^
1 warning and 5 errors generated.
configure:17919: $? = 1
configure: failed program was:
| #include <arm_acle.h>
| unsigned int crc;
| int
| main ()
| {
| crc = __crc32cb(crc, 0);
| crc = __crc32ch(crc, 0);
| crc = __crc32cw(crc, 0);
| crc = __crc32cd(crc, 0);
| /* return computed value, to prevent the above being optimized away */
| return crc == 0;
| ;
| return 0;
| }
configure:17928: result: no
configure:17936: checking for __crc32cb, __crc32ch, __crc32cw, and __crc32cd with CFLAGS=-march=armv8-a+crc+simd
configure:17960: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -march=armv8-a+crc+simd -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
error: unknown target CPU 'armv8-a+crc+simd'
note: valid target CPU values are: nocona, core2, penryn, bonnell, atom, silvermont, slm, goldmont, goldmont-plus, tremont, nehalem, corei7, westmere, sandybridge, corei7-avx, ivybridge, core-avx-i, haswell, core-avx2, broadwell, skylake, skylake-avx512, skx, cascadelake, cooperlake, cannonlake, icelake-client, rocketlake, icelake-server, tigerlake, sapphirerapids, alderlake, raptorlake, meteorlake, arrowlake, arrowlake-s, lunarlake, gracemont, pantherlake, sierraforest, grandridge, graniterapids, graniterapids-d, emeraldrapids, clearwaterforest, knl, knm, k8, athlon64, athlon-fx, opteron, k8-sse3, athlon64-sse3, opteron-sse3, amdfam10, barcelona, btver1, btver2, bdver1, bdver2, bdver3, bdver4, znver1, znver2, znver3, znver4, x86-64, x86-64-v2, x86-64-v3, x86-64-v4
configure:17960: $? = 1
configure: failed program was:
| #include <arm_acle.h>
| unsigned int crc;
| int
| main ()
| {
| crc = __crc32cb(crc, 0);
| crc = __crc32ch(crc, 0);
| crc = __crc32cw(crc, 0);
| crc = __crc32cd(crc, 0);
| /* return computed value, to prevent the above being optimized away */
| return crc == 0;
| ;
| return 0;
| }
configure:17969: result: no
configure:17977: checking for __crc32cb, __crc32ch, __crc32cw, and __crc32cd with CFLAGS=-march=armv8-a+crc
configure:18001: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -march=armv8-a+crc -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
error: unknown target CPU 'armv8-a+crc'
note: valid target CPU values are: nocona, core2, penryn, bonnell, atom, silvermont, slm, goldmont, goldmont-plus, tremont, nehalem, corei7, westmere, sandybridge, corei7-avx, ivybridge, core-avx-i, haswell, core-avx2, broadwell, skylake, skylake-avx512, skx, cascadelake, cooperlake, cannonlake, icelake-client, rocketlake, icelake-server, tigerlake, sapphirerapids, alderlake, raptorlake, meteorlake, arrowlake, arrowlake-s, lunarlake, gracemont, pantherlake, sierraforest, grandridge, graniterapids, graniterapids-d, emeraldrapids, clearwaterforest, knl, knm, k8, athlon64, athlon-fx, opteron, k8-sse3, athlon64-sse3, opteron-sse3, amdfam10, barcelona, btver1, btver2, bdver1, bdver2, bdver3, bdver4, znver1, znver2, znver3, znver4, x86-64, x86-64-v2, x86-64-v3, x86-64-v4
configure:18001: $? = 1
configure: failed program was:
| #include <arm_acle.h>
| unsigned int crc;
| int
| main ()
| {
| crc = __crc32cb(crc, 0);
| crc = __crc32ch(crc, 0);
| crc = __crc32cw(crc, 0);
| crc = __crc32cd(crc, 0);
| /* return computed value, to prevent the above being optimized away */
| return crc == 0;
| ;
| return 0;
| }
configure:18010: result: no
configure:18024: checking for __builtin_loongarch_crcc_w_b_w, __builtin_loongarch_crcc_w_h_w, __builtin_loongarch_crcc_w_w_w and __builtin_loongarch_crcc_w_d_w
configure:18045: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
conftest.c:146:7: error: use of unknown builtin '__builtin_loongarch_crcc_w_b_w' [-Wimplicit-function-declaration]
146 | crc = __builtin_loongarch_crcc_w_b_w(0, crc);
| ^
conftest.c:147:10: error: use of unknown builtin '__builtin_loongarch_crcc_w_h_w' [-Wimplicit-function-declaration]
147 | crc = __builtin_loongarch_crcc_w_h_w(0, crc);
| ^
conftest.c:148:10: error: use of unknown builtin '__builtin_loongarch_crcc_w_w_w' [-Wimplicit-function-declaration]
148 | crc = __builtin_loongarch_crcc_w_w_w(0, crc);
| ^
conftest.c:149:10: error: use of unknown builtin '__builtin_loongarch_crcc_w_d_w' [-Wimplicit-function-declaration]
149 | crc = __builtin_loongarch_crcc_w_d_w(0, crc);
| ^
conftest.c:142:14: warning: no previous extern declaration for non-static variable 'crc' [-Wmissing-variable-declarations]
142 | unsigned int crc;
| ^
conftest.c:142:1: note: declare 'static' if the variable is not intended to be used outside of this translation unit
142 | unsigned int crc;
| ^
1 warning and 4 errors generated.
configure:18045: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "18beta1"
| #define PACKAGE_STRING "PostgreSQL 18beta1"
| #define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
| #define PACKAGE_URL "https://www.postgresql.org/"
| #define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
| #define PG_MAJORVERSION "18"
| #define PG_MAJORVERSION_NUM 18
| #define PG_MINORVERSION_NUM 0
| #define PG_VERSION "18beta1"
| #define DEF_PGPORT 7432
| #define DEF_PGPORT_STR "7432"
| #define BLCKSZ 8192
| #define RELSEG_SIZE 131072
| #define XLOG_BLCKSZ 8192
| #define HAVE_VISIBILITY_ATTRIBUTE 1
| #define DLSUFFIX ".so"
| #define USE_ASSERT_CHECKING 1
| #define USE_ICU 1
| #define PG_KRB_SRVNAM "postgres"
| #define USE_LIBXML 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_PTHREAD_PRIO_INHERIT 1
| #define HAVE_PTHREAD 1
| #define HAVE_STRERROR_R 1
| #define HAVE_LIBM 1
| #define HAVE_LIBREADLINE 1
| #define HAVE_LIBZ 1
| #define HAVE_LIBXML2 1
| #define HAVE_EXECINFO_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IFADDRS_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_PERSONALITY_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_SIGNALFD_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_READLINE_READLINE_H 1
| #define HAVE_READLINE_HISTORY_H 1
| #define PG_PRINTF_ATTRIBUTE printf
| #define HAVE__STATIC_ASSERT 1
| #define HAVE_TYPEOF 1
| #define typeof __typeof__
| #define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
| #define HAVE__BUILTIN_CONSTANT_P 1
| #define HAVE__BUILTIN_OP_OVERFLOW 1
| #define HAVE__BUILTIN_UNREACHABLE 1
| #define HAVE_COMPUTED_GOTO 1
| #define HAVE_STRUCT_TM_TM_ZONE 1
| #define HAVE_SOCKLEN_T 1
| #define restrict __restrict
| #define pg_restrict __restrict
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_X86_64_POPCNTQ 1
| #define SIZEOF_OFF_T 8
| #define HAVE_INT_TIMEZONE 1
| #define HAVE_BACKTRACE_SYMBOLS 1
| #define HAVE_COPY_FILE_RANGE 1
| #define HAVE_GETAUXVAL 1
| #define HAVE_GETIFADDRS 1
| #define HAVE_INET_PTON 1
| #define HAVE_POSIX_FALLOCATE 1
| #define HAVE_PPOLL 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_SYNCFS 1
| #define HAVE_SYNC_FILE_RANGE 1
| #define HAVE_USELOCALE 1
| #define HAVE__BUILTIN_BSWAP16 1
| #define HAVE__BUILTIN_BSWAP32 1
| #define HAVE__BUILTIN_BSWAP64 1
| #define HAVE__BUILTIN_CLZ 1
| #define HAVE__BUILTIN_CTZ 1
| #define HAVE__BUILTIN_POPCOUNT 1
| #define HAVE__BUILTIN_FRAME_ADDRESS 1
| #define HAVE_FSEEKO 1
| #define HAVE_POSIX_FADVISE 1
| #define HAVE_DECL_POSIX_FADVISE 1
| #define HAVE_DECL_FDATASYNC 1
| #define HAVE_DECL_STRLCAT 0
| #define HAVE_DECL_STRLCPY 0
| #define HAVE_DECL_STRNLEN 1
| #define HAVE_DECL_STRSEP 1
| #define HAVE_DECL_TIMINGSAFE_BCMP 0
| #define HAVE_DECL_PREADV 1
| #define HAVE_DECL_PWRITEV 1
| #define HAVE_DECL_STRCHRNUL 1
| #define HAVE_DECL_MEMSET_S 0
| #define HAVE_DECL_F_FULLFSYNC 0
| #define HAVE_EXPLICIT_BZERO 1
| #define HAVE_GETOPT 1
| #define HAVE_INET_ATON 1
| #define HAVE_MKDTEMP 1
| #define HAVE_STRNLEN 1
| #define HAVE_STRSEP 1
| #define HAVE_PTHREAD_BARRIER_WAIT 1
| #define HAVE_GETOPT_LONG 1
| #define HAVE_SYSLOG 1
| #define HAVE_INT_OPTERR 1
| #define HAVE_RL_COMPLETION_SUPPRESS_QUOTE 1
| #define HAVE_RL_FILENAME_QUOTE_CHARACTERS 1
| #define HAVE_RL_FILENAME_QUOTING_FUNCTION 1
| #define HAVE_APPEND_HISTORY 1
| #define HAVE_HISTORY_TRUNCATE_FILE 1
| #define HAVE_RL_COMPLETION_MATCHES 1
| #define HAVE_RL_FILENAME_COMPLETION_FUNCTION 1
| #define HAVE_RL_RESET_SCREEN_SIZE 1
| #define HAVE_RL_VARIABLE_BIND 1
| #define SIZEOF_VOID_P 8
| #define SIZEOF_SIZE_T 8
| #define SIZEOF_LONG 8
| #define SIZEOF_LONG_LONG 8
| #define ALIGNOF_SHORT 2
| #define ALIGNOF_INT 4
| #define ALIGNOF_LONG 8
| #define ALIGNOF_INT64_T 8
| #define ALIGNOF_DOUBLE 8
| #define MAXIMUM_ALIGNOF 8
| #define PG_INT128_TYPE __int128
| #define ALIGNOF_PG_INT128_TYPE 16
| #define HAVE_GCC__SYNC_CHAR_TAS 1
| #define HAVE_GCC__SYNC_INT32_TAS 1
| #define HAVE_GCC__SYNC_INT32_CAS 1
| #define HAVE_GCC__SYNC_INT64_CAS 1
| #define HAVE_GCC__ATOMIC_INT32_CAS 1
| #define HAVE_GCC__ATOMIC_INT64_CAS 1
| #define HAVE__GET_CPUID 1
| #define HAVE__GET_CPUID_COUNT 1
| #define HAVE_XSAVE_INTRINSICS 1
| #define USE_AVX512_POPCNT_WITH_RUNTIME_CHECK 1
| /* end confdefs.h. */
| unsigned int crc;
| int
| main ()
| {
| crc = __builtin_loongarch_crcc_w_b_w(0, crc);
| crc = __builtin_loongarch_crcc_w_h_w(0, crc);
| crc = __builtin_loongarch_crcc_w_w_w(0, crc);
| crc = __builtin_loongarch_crcc_w_d_w(0, crc);
| /* return computed value, to prevent the above being optimized away */
| return crc == 0;
| ;
| return 0;
| }
configure:18053: result: no
configure:18123: checking which CRC-32C implementation to use
configure:18138: result: SSE 4.2 with runtime check
configure:18181: checking for _mm512_clmulepi64_epi128
configure:18215: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
conftest.c:144:13: warning: no previous extern declaration for non-static variable 'x' [-Wmissing-variable-declarations]
144 | __m512i x;
| ^
conftest.c:144:5: note: declare 'static' if the variable is not intended to be used outside of this translation unit
144 | __m512i x;
| ^
conftest.c:145:13: warning: no previous extern declaration for non-static variable 'y' [-Wmissing-variable-declarations]
145 | __m512i y;
| ^
conftest.c:145:5: note: declare 'static' if the variable is not intended to be used outside of this translation unit
145 | __m512i y;
| ^
2 warnings generated.
configure:18215: $? = 0
configure:18223: result: yes
configure:18231: checking for vectorized CRC-32C
configure:18237: result: AVX-512 with runtime check
configure:18307: checking for library containing sem_init
configure:18338: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
configure:18338: $? = 0
configure:18355: result: none required
configure:18364: checking which semaphore API to use
configure:18387: result: unnamed POSIX
configure:18413: checking which random number source to use
configure:18425: result: /dev/urandom
configure:18427: checking for /dev/urandom
configure:18440: result: yes
configure:18913: checking for xmllint
configure:18931: found /usr/bin/xmllint
configure:18943: result: /usr/bin/xmllint
configure:18967: checking for xsltproc
configure:19000: result: no
configure:19021: checking for fop
configure:19054: result: no
configure:19075: checking for dbtoepub
configure:19108: result: no
configure:19135: checking for prove
configure:19153: found /usr/bin/prove
configure:19165: result: /usr/bin/prove
configure:19190: checking for Perl modules required for TAP tests
# IPC::Run::VERSION: 20180523.0
# Test::More::VERSION: 1.302162
# Time::HiRes::VERSION: 1.976
configure:19197: result: yes
configure:19300: checking whether clang supports -Wl,--as-needed, for LDFLAGS
configure:19321: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 -Wl,--as-needed conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
conftest.c:147:33: warning: no previous extern declaration for non-static variable 'fptr' [-Wmissing-variable-declarations]
147 | extern void readline (); void (*fptr) () = readline;
| ^
conftest.c:147:26: note: declare 'static' if the variable is not intended to be used outside of this translation unit
147 | extern void readline (); void (*fptr) () = readline;
| ^
1 warning generated.
configure:19321: $? = 0
configure:19321: ./conftest
configure:19321: $? = 0
configure:19332: result: yes
configure:19347: checking whether clang supports -Wl,--export-dynamic, for LDFLAGS_EX_BE
configure:19368: clang -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds -D_GNU_SOURCE -I/usr/include/libxml2 -Wl,--as-needed -Wl,--export-dynamic conftest.c -lxml2 -lz -lreadline -lpthread -lrt -ldl -lm >&5
conftest.c:147:33: warning: no previous extern declaration for non-static variable 'fptr' [-Wmissing-variable-declarations]
147 | extern void readline (); void (*fptr) () = readline;
| ^
conftest.c:147:26: note: declare 'static' if the variable is not intended to be used outside of this translation unit
147 | extern void readline (); void (*fptr) () = readline;
| ^
1 warning generated.
configure:19368: $? = 0
configure:19368: ./conftest
configure:19368: $? = 0
configure:19379: result: yes
configure:19468: using compiler=clang version 18.1.6 (https://gitee.com/mirrors/llvm-project.git 1118c2e05e67a36ed8ca250524525cdb66a55256)
configure:19470: using CFLAGS=-Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds
configure:19472: using CPPFLAGS= -D_GNU_SOURCE -I/usr/include/libxml2
configure:19474: using LDFLAGS= -Wl,--as-needed
configure:19634: creating ./config.status
## ---------------------- ##
## Running config.status. ##
## ---------------------- ##
This file was extended by PostgreSQL config.status 18beta1, which was
generated by GNU Autoconf 2.69. Invocation command line was
CONFIG_FILES =
CONFIG_HEADERS =
CONFIG_LINKS =
CONFIG_COMMANDS =
$ ./config.status
on lovely-coding
config.status:1111: creating GNUmakefile
config.status:1111: creating src/Makefile.global
config.status:1111: creating src/include/pg_config.h
config.status:1111: creating src/interfaces/ecpg/include/ecpg_config.h
config.status:1292: src/interfaces/ecpg/include/ecpg_config.h is unchanged
config.status:1318: linking src/backend/port/tas/dummy.s to src/backend/port/tas.s
config.status:1318: linking src/backend/port/posix_sema.c to src/backend/port/pg_sema.c
config.status:1318: linking src/backend/port/sysv_shmem.c to src/backend/port/pg_shmem.c
config.status:1318: linking src/include/port/linux.h to src/include/pg_config_os.h
config.status:1318: linking src/makefiles/Makefile.linux to src/Makefile.port
## ---------------- ##
## Cache variables. ##
## ---------------- ##
ac_cv_alignof_PG_INT128_TYPE=16
ac_cv_alignof_double=8
ac_cv_alignof_int64_t=8
ac_cv_alignof_int=4
ac_cv_alignof_long=8
ac_cv_alignof_short=2
ac_cv_build=x86_64-pc-linux-gnu
ac_cv_c_bigendian=no
ac_cv_c_compiler_gnu=yes
ac_cv_c_decl_report=error
ac_cv_c_inline=inline
ac_cv_c_restrict=__restrict
ac_cv_cxx_compiler_gnu=yes
ac_cv_env_CCC_set=
ac_cv_env_CCC_value=
ac_cv_env_CC_set=set
ac_cv_env_CC_value=clang
ac_cv_env_CFLAGS_set=set
ac_cv_env_CFLAGS_value='-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds '
ac_cv_env_CLANG_set=
ac_cv_env_CLANG_value=
ac_cv_env_CPPFLAGS_set=
ac_cv_env_CPPFLAGS_value=
ac_cv_env_CPP_set=
ac_cv_env_CPP_value=
ac_cv_env_CXXFLAGS_set=
ac_cv_env_CXXFLAGS_value=
ac_cv_env_CXX_set=set
ac_cv_env_CXX_value=clang++
ac_cv_env_ICU_CFLAGS_set=
ac_cv_env_ICU_CFLAGS_value=
ac_cv_env_ICU_LIBS_set=
ac_cv_env_ICU_LIBS_value=
ac_cv_env_LDFLAGS_EX_set=
ac_cv_env_LDFLAGS_EX_value=
ac_cv_env_LDFLAGS_SL_set=
ac_cv_env_LDFLAGS_SL_value=
ac_cv_env_LDFLAGS_set=
ac_cv_env_LDFLAGS_value=
ac_cv_env_LIBCURL_CFLAGS_set=
ac_cv_env_LIBCURL_CFLAGS_value=
ac_cv_env_LIBCURL_LIBS_set=
ac_cv_env_LIBCURL_LIBS_value=
ac_cv_env_LIBNUMA_CFLAGS_set=
ac_cv_env_LIBNUMA_CFLAGS_value=
ac_cv_env_LIBNUMA_LIBS_set=
ac_cv_env_LIBNUMA_LIBS_value=
ac_cv_env_LIBS_set=
ac_cv_env_LIBS_value=
ac_cv_env_LIBURING_CFLAGS_set=
ac_cv_env_LIBURING_CFLAGS_value=
ac_cv_env_LIBURING_LIBS_set=
ac_cv_env_LIBURING_LIBS_value=
ac_cv_env_LLVM_CONFIG_set=
ac_cv_env_LLVM_CONFIG_value=
ac_cv_env_LZ4_CFLAGS_set=
ac_cv_env_LZ4_CFLAGS_value=
ac_cv_env_LZ4_LIBS_set=
ac_cv_env_LZ4_LIBS_value=
ac_cv_env_MSGFMT_set=
ac_cv_env_MSGFMT_value=
ac_cv_env_PERL_set=
ac_cv_env_PERL_value=
ac_cv_env_PG_TEST_EXTRA_set=
ac_cv_env_PG_TEST_EXTRA_value=
ac_cv_env_PKG_CONFIG_LIBDIR_set=
ac_cv_env_PKG_CONFIG_LIBDIR_value=
ac_cv_env_PKG_CONFIG_PATH_set=
ac_cv_env_PKG_CONFIG_PATH_value=
ac_cv_env_PKG_CONFIG_set=
ac_cv_env_PKG_CONFIG_value=
ac_cv_env_PYTHON_set=
ac_cv_env_PYTHON_value=
ac_cv_env_TCLSH_set=
ac_cv_env_TCLSH_value=
ac_cv_env_XML2_CFLAGS_set=
ac_cv_env_XML2_CFLAGS_value=
ac_cv_env_XML2_CONFIG_set=
ac_cv_env_XML2_CONFIG_value=
ac_cv_env_XML2_LIBS_set=
ac_cv_env_XML2_LIBS_value=
ac_cv_env_ZSTD_CFLAGS_set=
ac_cv_env_ZSTD_CFLAGS_value=
ac_cv_env_ZSTD_LIBS_set=
ac_cv_env_ZSTD_LIBS_value=
ac_cv_env_build_alias_set=
ac_cv_env_build_alias_value=
ac_cv_env_host_alias_set=
ac_cv_env_host_alias_value=
ac_cv_env_target_alias_set=
ac_cv_env_target_alias_value=
ac_cv_file__dev_urandom=yes
ac_cv_func_append_history=yes
ac_cv_func_backtrace_symbols=yes
ac_cv_func_copy_file_range=yes
ac_cv_func_copyfile=no
ac_cv_func_elf_aux_info=no
ac_cv_func_explicit_bzero=yes
ac_cv_func_getauxval=yes
ac_cv_func_getifaddrs=yes
ac_cv_func_getopt=yes
ac_cv_func_getopt_long=yes
ac_cv_func_getpeereid=no
ac_cv_func_getpeerucred=no
ac_cv_func_history_truncate_file=yes
ac_cv_func_inet_aton=yes
ac_cv_func_inet_pton=yes
ac_cv_func_kqueue=no
ac_cv_func_localeconv_l=no
ac_cv_func_mbstowcs_l=no
ac_cv_func_mkdtemp=yes
ac_cv_func_posix_fadvise=yes
ac_cv_func_posix_fallocate=yes
ac_cv_func_ppoll=yes
ac_cv_func_pthread_barrier_wait=yes
ac_cv_func_pthread_is_threaded_np=no
ac_cv_func_rl_completion_matches=yes
ac_cv_func_rl_filename_completion_function=yes
ac_cv_func_rl_reset_screen_size=yes
ac_cv_func_rl_variable_bind=yes
ac_cv_func_setproctitle=no
ac_cv_func_setproctitle_fast=no
ac_cv_func_strerror_r=yes
ac_cv_func_strlcat=no
ac_cv_func_strlcpy=no
ac_cv_func_strnlen=yes
ac_cv_func_strsep=yes
ac_cv_func_strsignal=yes
ac_cv_func_sync_file_range=yes
ac_cv_func_syncfs=yes
ac_cv_func_syslog=yes
ac_cv_func_timingsafe_bcmp=no
ac_cv_func_uselocale=yes
ac_cv_func_wcstombs_l=no
ac_cv_have_decl_F_FULLFSYNC=no
ac_cv_have_decl_fdatasync=yes
ac_cv_have_decl_memset_s=no
ac_cv_have_decl_posix_fadvise=yes
ac_cv_have_decl_preadv=yes
ac_cv_have_decl_pwritev=yes
ac_cv_have_decl_strchrnul=yes
ac_cv_have_decl_strlcat=no
ac_cv_have_decl_strlcpy=no
ac_cv_have_decl_strnlen=yes
ac_cv_have_decl_strsep=yes
ac_cv_have_decl_timingsafe_bcmp=no
ac_cv_header_atomic_h=no
ac_cv_header_copyfile_h=no
ac_cv_header_execinfo_h=yes
ac_cv_header_getopt_h=yes
ac_cv_header_ifaddrs_h=yes
ac_cv_header_inttypes_h=yes
ac_cv_header_libxml_parser_h=yes
ac_cv_header_mbarrier_h=no
ac_cv_header_memory_h=yes
ac_cv_header_pthread_h=yes
ac_cv_header_readline_history_h=yes
ac_cv_header_readline_readline_h=yes
ac_cv_header_stdc=yes
ac_cv_header_stdint_h=yes
ac_cv_header_stdlib_h=yes
ac_cv_header_string_h=yes
ac_cv_header_strings_h=yes
ac_cv_header_sys_epoll_h=yes
ac_cv_header_sys_event_h=no
ac_cv_header_sys_personality_h=yes
ac_cv_header_sys_prctl_h=yes
ac_cv_header_sys_procctl_h=no
ac_cv_header_sys_signalfd_h=yes
ac_cv_header_sys_stat_h=yes
ac_cv_header_sys_types_h=yes
ac_cv_header_sys_ucred_h=no
ac_cv_header_syslog_h=yes
ac_cv_header_termios_h=yes
ac_cv_header_ucred_h=no
ac_cv_header_unicode_ucol_h=yes
ac_cv_header_unistd_h=yes
ac_cv_header_xlocale_h=no
ac_cv_header_zlib_h=yes
ac_cv_host=x86_64-pc-linux-gnu
ac_cv_lib_m_main=yes
ac_cv_lib_xml2_xmlSaveToBuffer=yes
ac_cv_lib_z_inflate=yes
ac_cv_member_struct_sockaddr_sa_len=no
ac_cv_member_struct_tm_tm_zone=yes
ac_cv_objext=o
ac_cv_path_BISON=/usr/bin/bison
ac_cv_path_EGREP='/usr/bin/grep -E'
ac_cv_path_FLEX=/usr/bin/flex
ac_cv_path_GREP=/usr/bin/grep
ac_cv_path_OPENSSL=/usr/bin/openssl
ac_cv_path_PERL=/usr/bin/perl
ac_cv_path_PROVE=/usr/bin/prove
ac_cv_path_SED=/usr/bin/sed
ac_cv_path_TAR=/usr/bin/tar
ac_cv_path_XMLLINT=/usr/bin/xmllint
ac_cv_path_ZSTD=/usr/bin/zstd
ac_cv_path_ac_pt_PKG_CONFIG=/usr/bin/pkg-config
ac_cv_path_install='/usr/bin/install -c'
ac_cv_path_mkdir=/usr/bin/mkdir
ac_cv_prog_AWK=mawk
ac_cv_prog_CPP='clang -E'
ac_cv_prog_ac_ct_AR=ar
ac_cv_prog_ac_ct_STRIP=strip
ac_cv_prog_cc_c89=
ac_cv_prog_cc_c99=
ac_cv_prog_cc_g=yes
ac_cv_prog_cxx_g=yes
ac_cv_search_backtrace_symbols='none required'
ac_cv_search_clock_gettime='none required'
ac_cv_search_dlsym=-ldl
ac_cv_search_getopt_long='none required'
ac_cv_search_pthread_barrier_wait=-lpthread
ac_cv_search_sem_init='none required'
ac_cv_search_setproctitle=no
ac_cv_search_shm_open=-lrt
ac_cv_search_shm_unlink='none required'
ac_cv_search_shmget='none required'
ac_cv_search_socket='none required'
ac_cv_sizeof_long=8
ac_cv_sizeof_long_long=8
ac_cv_sizeof_off_t=8
ac_cv_sizeof_size_t=8
ac_cv_sizeof_void_p=8
ac_cv_sys_file_offset_bits=no
ac_cv_sys_largefile_CC=no
ac_cv_sys_largefile_source=no
ac_cv_type_socklen_t=yes
ac_cv_type_struct_option=yes
ac_cv_type_union_semun=no
ax_cv_PTHREAD_CLANG=yes
ax_cv_PTHREAD_CLANG_NO_WARN_FLAG=no
ax_cv_PTHREAD_JOINABLE_ATTR=PTHREAD_CREATE_JOINABLE
ax_cv_PTHREAD_PRIO_INHERIT=yes
ax_cv_PTHREAD_SPECIAL_FLAGS=no
pgac_cv__128bit_int=yes
pgac_cv__128bit_int_bug=ok
pgac_cv__builtin_bswap16=yes
pgac_cv__builtin_bswap32=yes
pgac_cv__builtin_bswap64=yes
pgac_cv__builtin_clz=yes
pgac_cv__builtin_constant_p=yes
pgac_cv__builtin_ctz=yes
pgac_cv__builtin_frame_address=yes
pgac_cv__builtin_op_overflow=yes
pgac_cv__builtin_popcount=yes
pgac_cv__builtin_unreachable=yes
pgac_cv__cpuid=no
pgac_cv__cpuidex=no
pgac_cv__get_cpuid=yes
pgac_cv__get_cpuid_count=yes
pgac_cv__static_assert=yes
pgac_cv__types_compatible=yes
pgac_cv_armv8_crc32c_intrinsics_=no
pgac_cv_armv8_crc32c_intrinsics__march_armv8_apcrc=no
pgac_cv_armv8_crc32c_intrinsics__march_armv8_apcrcpsimd=no
pgac_cv_avx512_pclmul_intrinsics=yes
pgac_cv_avx512_popcnt_intrinsics=yes
pgac_cv_c_typeof=__typeof__
pgac_cv_check_readline=-lreadline
pgac_cv_computed_goto=yes
pgac_cv_func_strerror_r_int=no
pgac_cv_gcc_atomic_int32_cas=yes
pgac_cv_gcc_atomic_int64_cas=yes
pgac_cv_gcc_sync_char_tas=yes
pgac_cv_gcc_sync_int32_cas=yes
pgac_cv_gcc_sync_int32_tas=yes
pgac_cv_gcc_sync_int64_cas=yes
pgac_cv_have_x86_64_popcntq=yes
pgac_cv_loongarch_crc32c_intrinsics=no
pgac_cv_printf_archetype=printf
pgac_cv_prog_CC_cflags__Wcast_function_type=yes
pgac_cv_prog_CC_cflags__Wcast_function_type_strict=yes
pgac_cv_prog_CC_cflags__Wcompound_token_split_by_macro=yes
pgac_cv_prog_CC_cflags__Wdeclaration_after_statement=yes
pgac_cv_prog_CC_cflags__Wendif_labels=yes
pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new=yes
pgac_cv_prog_CC_cflags__Werror_vla=yes
pgac_cv_prog_CC_cflags__Wformat_security=yes
pgac_cv_prog_CC_cflags__Wformat_truncation=yes
pgac_cv_prog_CC_cflags__Wimplicit_fallthrough_3=no
pgac_cv_prog_CC_cflags__Wmissing_format_attribute=yes
pgac_cv_prog_CC_cflags__Wmissing_variable_declarations=yes
pgac_cv_prog_CC_cflags__Wshadow_compatible_local=no
pgac_cv_prog_CC_cflags__Wstringop_truncation=no
pgac_cv_prog_CC_cflags__Wunused_command_line_argument=yes
pgac_cv_prog_CC_cflags__fexcess_precision_standard=yes
pgac_cv_prog_CC_cflags__fno_strict_aliasing=yes
pgac_cv_prog_CC_cflags__ftree_vectorize=yes
pgac_cv_prog_CC_cflags__funroll_loops=yes
pgac_cv_prog_CC_cflags__fvisibility_hidden=yes
pgac_cv_prog_CC_cflags__fwrapv=yes
pgac_cv_prog_CXX_cxxflags__Wcast_function_type=yes
pgac_cv_prog_CXX_cxxflags__Wendif_labels=yes
pgac_cv_prog_CXX_cxxflags__Werror_unguarded_availability_new=yes
pgac_cv_prog_CXX_cxxflags__Wformat_security=yes
pgac_cv_prog_CXX_cxxflags__Wimplicit_fallthrough_3=no
pgac_cv_prog_CXX_cxxflags__Wmissing_format_attribute=yes
pgac_cv_prog_CXX_cxxflags__Wshadow_compatible_local=no
pgac_cv_prog_CXX_cxxflags__fexcess_precision_standard=yes
pgac_cv_prog_CXX_cxxflags__fno_strict_aliasing=yes
pgac_cv_prog_CXX_cxxflags__fvisibility_hidden=yes
pgac_cv_prog_CXX_cxxflags__fvisibility_inlines_hidden=yes
pgac_cv_prog_CXX_cxxflags__fwrapv=yes
pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl___export_dynamic=yes
pgac_cv_prog_cc_LDFLAGS__Wl___as_needed=yes
pgac_cv_sse42_crc32_intrinsics=yes
pgac_cv_var_int_opterr=yes
pgac_cv_var_int_optreset=no
pgac_cv_var_int_timezone=yes
pgac_cv_var_rl_completion_suppress_quote=yes
pgac_cv_var_rl_filename_quote_characters=yes
pgac_cv_var_rl_filename_quoting_function=yes
pgac_cv_xsave_intrinsics=yes
pkg_cv_ICU_CFLAGS=
pkg_cv_ICU_LIBS='-licui18n -licuuc -licudata'
pkg_cv_XML2_CFLAGS=-I/usr/include/libxml2
pkg_cv_XML2_LIBS=-lxml2
## ----------------- ##
## Output variables. ##
## ----------------- ##
AR='ar'
AWK='mawk'
BISON='/usr/bin/bison'
BISONFLAGS=' -Wno-deprecated'
BITCODE_CFLAGS=' -O2 '
BITCODE_CXXFLAGS=' -O2 '
CC='clang'
CFLAGS='-Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation -Wno-cast-function-type-strict -g -O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds '
CFLAGS_CRC=''
CFLAGS_SL='-fPIC'
CFLAGS_SL_MODULE=' -fvisibility=hidden'
CFLAGS_UNROLL_LOOPS=' -funroll-loops'
CFLAGS_VECTORIZE=' -ftree-vectorize'
CLANG=''
CPP='clang -E'
CPPFLAGS=' -D_GNU_SOURCE -I/usr/include/libxml2 '
CXX='clang++'
CXXFLAGS='-Wall -Wpointer-arith -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2'
CXXFLAGS_SL_MODULE=' -fvisibility=hidden -fvisibility-inlines-hidden'
DBTOEPUB=''
DEFS='-DHAVE_CONFIG_H'
DLSUFFIX='.so'
DTRACE=''
DTRACEFLAGS=''
ECHO_C=''
ECHO_N='-n'
ECHO_T=''
EGREP='/usr/bin/grep -E'
EXEEXT=''
FLEX='/usr/bin/flex'
FLEXFLAGS=''
FOP=''
GCC='yes'
GCOV=''
GENHTML=''
GREP='/usr/bin/grep'
ICU_CFLAGS=''
ICU_LIBS='-licui18n -licuuc -licudata'
INSTALL_DATA='${INSTALL} -m 644'
INSTALL_PROGRAM='${INSTALL}'
INSTALL_SCRIPT='${INSTALL}'
LCOV=''
LDAP_LIBS_BE=''
LDAP_LIBS_FE=''
LDFLAGS=' -Wl,--as-needed'
LDFLAGS_EX=''
LDFLAGS_EX_BE=' -Wl,--export-dynamic'
LDFLAGS_SL=''
LIBCURL_CFLAGS=''
LIBCURL_CPPFLAGS=''
LIBCURL_LDFLAGS=''
LIBCURL_LDLIBS=''
LIBCURL_LIBS=''
LIBNUMA_CFLAGS=''
LIBNUMA_LIBS=''
LIBOBJS=' ${LIBOBJDIR}getpeereid$U.o ${LIBOBJDIR}strlcat$U.o ${LIBOBJDIR}strlcpy$U.o ${LIBOBJDIR}timingsafe_bcmp$U.o'
LIBS='-lxml2 -lz -lreadline -lpthread -lrt -ldl -lm '
LIBURING_CFLAGS=''
LIBURING_LIBS=''
LLVM_BINPATH=''
LLVM_CFLAGS=''
LLVM_CONFIG=''
LLVM_CPPFLAGS=''
LLVM_CXXFLAGS=''
LLVM_LIBS=''
LN_S='ln -s'
LTLIBOBJS=' ${LIBOBJDIR}getpeereid$U.lo ${LIBOBJDIR}strlcat$U.lo ${LIBOBJDIR}strlcpy$U.lo ${LIBOBJDIR}timingsafe_bcmp$U.lo'
LZ4=''
LZ4_CFLAGS=''
LZ4_LIBS=''
MKDIR_P='/usr/bin/mkdir -p'
MSGFMT=''
MSGFMT_FLAGS=''
MSGMERGE=''
OBJEXT='o'
OPENSSL='/usr/bin/openssl'
PACKAGE_BUGREPORT='pgsql-bugs@lists.postgresql.org'
PACKAGE_NAME='PostgreSQL'
PACKAGE_STRING='PostgreSQL 18beta1'
PACKAGE_TARNAME='postgresql'
PACKAGE_URL='https://www.postgresql.org/'
PACKAGE_VERSION='18beta1'
PATH_SEPARATOR=':'
PERL='/usr/bin/perl'
PERMIT_DECLARATION_AFTER_STATEMENT='-Wno-declaration-after-statement'
PERMIT_MISSING_VARIABLE_DECLARATIONS='-Wno-missing-variable-declarations'
PG_CRC32C_OBJS='pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_sse42_choose.o'
PG_MAJORVERSION='18'
PG_SYSROOT=''
PG_TEST_EXTRA=''
PG_VERSION_NUM='180000'
PKG_CONFIG='/usr/bin/pkg-config'
PKG_CONFIG_LIBDIR=''
PKG_CONFIG_PATH=''
PORTNAME='linux'
PROVE='/usr/bin/prove'
PTHREAD_CC='clang'
PTHREAD_CFLAGS='-pthread -D_REENTRANT -D_THREAD_SAFE'
PTHREAD_LIBS=''
PYTHON=''
SED='/usr/bin/sed'
SHELL='/bin/bash'
STRIP='strip'
STRIP_SHARED_LIB='strip --strip-unneeded'
STRIP_STATIC_LIB='strip --strip-unneeded'
SUN_STUDIO_CC='no'
TAR='/usr/bin/tar'
TAS=''
TCLSH=''
TCL_CONFIG_SH=''
TCL_INCLUDE_SPEC=''
TCL_LIBS=''
TCL_LIB_SPEC=''
TCL_SHARED_BUILD=''
UUID_LIBS=''
WANTED_LANGUAGES=''
WINDRES=''
XGETTEXT=''
XML2_CFLAGS='-I/usr/include/libxml2'
XML2_CONFIG=''
XML2_LIBS='-lxml2'
XMLLINT='/usr/bin/xmllint'
XSLTPROC=''
ZIC=''
ZSTD='/usr/bin/zstd'
ZSTD_CFLAGS=''
ZSTD_LIBS=''
ac_ct_CC=''
ac_ct_CXX=''
autodepend=''
ax_pthread_config=''
bindir='${exec_prefix}/bin'
build='x86_64-pc-linux-gnu'
build_alias=''
build_cpu='x86_64'
build_os='linux-gnu'
build_vendor='pc'
datadir='${datarootdir}'
datarootdir='${prefix}/share'
default_port='7432'
docdir='${datarootdir}/doc/${PACKAGE_TARNAME}'
dvidir='${docdir}'
enable_coverage='no'
enable_debug='yes'
enable_dtrace='no'
enable_injection_points='no'
enable_nls='no'
enable_rpath='yes'
enable_tap_tests='yes'
exec_prefix='${prefix}'
host='x86_64-pc-linux-gnu'
host_alias=''
host_cpu='x86_64'
host_os='linux-gnu'
host_vendor='pc'
htmldir='${docdir}'
includedir='${prefix}/include'
infodir='${datarootdir}/info'
install_bin='/usr/bin/install -c'
krb_srvtab=''
libdir='${exec_prefix}/lib'
libexecdir='${exec_prefix}/libexec'
localedir='${datarootdir}/locale'
localstatedir='${prefix}/var'
mandir='${datarootdir}/man'
oldincludedir='/usr/include'
pdfdir='${docdir}'
perl_archlibexp=''
perl_embed_ccflags=''
perl_embed_ldflags=''
perl_includespec=''
perl_privlibexp=''
perl_useshrplib=''
prefix='/u01/yizhi/bin/postgres'
program_transform_name='s,x,x,'
psdir='${docdir}'
python_additional_libs=''
python_includespec=''
python_libdir=''
python_libspec=''
python_majorversion=''
python_version=''
sbindir='${exec_prefix}/sbin'
sharedstatedir='${prefix}/com'
sysconfdir='${prefix}/etc'
target_alias=''
vpath_build='no'
with_gssapi='no'
with_icu='yes'
with_krb_srvnam='postgres'
with_ldap='no'
with_libcurl='no'
with_libnuma='no'
with_liburing='no'
with_libxml='yes'
with_libxslt='no'
with_llvm='no'
with_lz4='no'
with_perl='no'
with_python='no'
with_readline='yes'
with_selinux='no'
with_ssl='no'
with_system_tzdata=''
with_systemd='no'
with_tcl='no'
with_uuid='no'
with_zlib='yes'
with_zstd='no'
## ----------- ##
## confdefs.h. ##
## ----------- ##
/* confdefs.h */
#define PACKAGE_NAME "PostgreSQL"
#define PACKAGE_TARNAME "postgresql"
#define PACKAGE_VERSION "18beta1"
#define PACKAGE_STRING "PostgreSQL 18beta1"
#define PACKAGE_BUGREPORT "pgsql-bugs@lists.postgresql.org"
#define PACKAGE_URL "https://www.postgresql.org/"
#define CONFIGURE_ARGS " '--with-pgport=7432' '--prefix=/u01/yizhi/bin/postgres/' '--enable-debug' '--with-libxml' '--enable-tap-tests' '--enable-cassert' 'CC=clang' 'CFLAGS=-O0 -Wall -fno-omit-frame-pointer -std=c11 -Wno-unused-parameter -Wno-sign-compare -Wno-missing-field-initializers -Wno-array-bounds ' 'CXX=clang++'"
#define PG_MAJORVERSION "18"
#define PG_MAJORVERSION_NUM 18
#define PG_MINORVERSION_NUM 0
#define PG_VERSION "18beta1"
#define DEF_PGPORT 7432
#define DEF_PGPORT_STR "7432"
#define BLCKSZ 8192
#define RELSEG_SIZE 131072
#define XLOG_BLCKSZ 8192
#define HAVE_VISIBILITY_ATTRIBUTE 1
#define DLSUFFIX ".so"
#define USE_ASSERT_CHECKING 1
#define USE_ICU 1
#define PG_KRB_SRVNAM "postgres"
#define USE_LIBXML 1
#define STDC_HEADERS 1
#define HAVE_SYS_TYPES_H 1
#define HAVE_SYS_STAT_H 1
#define HAVE_STDLIB_H 1
#define HAVE_STRING_H 1
#define HAVE_MEMORY_H 1
#define HAVE_STRINGS_H 1
#define HAVE_INTTYPES_H 1
#define HAVE_STDINT_H 1
#define HAVE_UNISTD_H 1
#define HAVE_PTHREAD_PRIO_INHERIT 1
#define HAVE_PTHREAD 1
#define HAVE_STRERROR_R 1
#define HAVE_LIBM 1
#define HAVE_LIBREADLINE 1
#define HAVE_LIBZ 1
#define HAVE_LIBXML2 1
#define HAVE_EXECINFO_H 1
#define HAVE_GETOPT_H 1
#define HAVE_IFADDRS_H 1
#define HAVE_SYS_EPOLL_H 1
#define HAVE_SYS_PERSONALITY_H 1
#define HAVE_SYS_PRCTL_H 1
#define HAVE_SYS_SIGNALFD_H 1
#define HAVE_TERMIOS_H 1
#define HAVE_READLINE_READLINE_H 1
#define HAVE_READLINE_HISTORY_H 1
#define PG_PRINTF_ATTRIBUTE printf
#define HAVE__STATIC_ASSERT 1
#define HAVE_TYPEOF 1
#define typeof __typeof__
#define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1
#define HAVE__BUILTIN_CONSTANT_P 1
#define HAVE__BUILTIN_OP_OVERFLOW 1
#define HAVE__BUILTIN_UNREACHABLE 1
#define HAVE_COMPUTED_GOTO 1
#define HAVE_STRUCT_TM_TM_ZONE 1
#define HAVE_SOCKLEN_T 1
#define restrict __restrict
#define pg_restrict __restrict
#define HAVE_STRUCT_OPTION 1
#define HAVE_X86_64_POPCNTQ 1
#define SIZEOF_OFF_T 8
#define HAVE_INT_TIMEZONE 1
#define HAVE_BACKTRACE_SYMBOLS 1
#define HAVE_COPY_FILE_RANGE 1
#define HAVE_GETAUXVAL 1
#define HAVE_GETIFADDRS 1
#define HAVE_INET_PTON 1
#define HAVE_POSIX_FALLOCATE 1
#define HAVE_PPOLL 1
#define HAVE_STRSIGNAL 1
#define HAVE_SYNCFS 1
#define HAVE_SYNC_FILE_RANGE 1
#define HAVE_USELOCALE 1
#define HAVE__BUILTIN_BSWAP16 1
#define HAVE__BUILTIN_BSWAP32 1
#define HAVE__BUILTIN_BSWAP64 1
#define HAVE__BUILTIN_CLZ 1
#define HAVE__BUILTIN_CTZ 1
#define HAVE__BUILTIN_POPCOUNT 1
#define HAVE__BUILTIN_FRAME_ADDRESS 1
#define HAVE_FSEEKO 1
#define HAVE_POSIX_FADVISE 1
#define HAVE_DECL_POSIX_FADVISE 1
#define HAVE_DECL_FDATASYNC 1
#define HAVE_DECL_STRLCAT 0
#define HAVE_DECL_STRLCPY 0
#define HAVE_DECL_STRNLEN 1
#define HAVE_DECL_STRSEP 1
#define HAVE_DECL_TIMINGSAFE_BCMP 0
#define HAVE_DECL_PREADV 1
#define HAVE_DECL_PWRITEV 1
#define HAVE_DECL_STRCHRNUL 1
#define HAVE_DECL_MEMSET_S 0
#define HAVE_DECL_F_FULLFSYNC 0
#define HAVE_EXPLICIT_BZERO 1
#define HAVE_GETOPT 1
#define HAVE_INET_ATON 1
#define HAVE_MKDTEMP 1
#define HAVE_STRNLEN 1
#define HAVE_STRSEP 1
#define HAVE_PTHREAD_BARRIER_WAIT 1
#define HAVE_GETOPT_LONG 1
#define HAVE_SYSLOG 1
#define HAVE_INT_OPTERR 1
#define HAVE_RL_COMPLETION_SUPPRESS_QUOTE 1
#define HAVE_RL_FILENAME_QUOTE_CHARACTERS 1
#define HAVE_RL_FILENAME_QUOTING_FUNCTION 1
#define HAVE_APPEND_HISTORY 1
#define HAVE_HISTORY_TRUNCATE_FILE 1
#define HAVE_RL_COMPLETION_MATCHES 1
#define HAVE_RL_FILENAME_COMPLETION_FUNCTION 1
#define HAVE_RL_RESET_SCREEN_SIZE 1
#define HAVE_RL_VARIABLE_BIND 1
#define SIZEOF_VOID_P 8
#define SIZEOF_SIZE_T 8
#define SIZEOF_LONG 8
#define SIZEOF_LONG_LONG 8
#define ALIGNOF_SHORT 2
#define ALIGNOF_INT 4
#define ALIGNOF_LONG 8
#define ALIGNOF_INT64_T 8
#define ALIGNOF_DOUBLE 8
#define MAXIMUM_ALIGNOF 8
#define PG_INT128_TYPE __int128
#define ALIGNOF_PG_INT128_TYPE 16
#define HAVE_GCC__SYNC_CHAR_TAS 1
#define HAVE_GCC__SYNC_INT32_TAS 1
#define HAVE_GCC__SYNC_INT32_CAS 1
#define HAVE_GCC__SYNC_INT64_CAS 1
#define HAVE_GCC__ATOMIC_INT32_CAS 1
#define HAVE_GCC__ATOMIC_INT64_CAS 1
#define HAVE__GET_CPUID 1
#define HAVE__GET_CPUID_COUNT 1
#define HAVE_XSAVE_INTRINSICS 1
#define USE_AVX512_POPCNT_WITH_RUNTIME_CHECK 1
#define USE_SSE42_CRC32C_WITH_RUNTIME_CHECK 1
#define USE_AVX512_CRC32C_WITH_RUNTIME_CHECK 1
#define USE_UNNAMED_POSIX_SEMAPHORES 1
#define USE_SYSV_SHARED_MEMORY 1
#define MEMSET_LOOP_LIMIT 1024
#define PG_VERSION_STR "PostgreSQL 18beta1 on x86_64-pc-linux-gnu, compiled by clang version 18.1.6 (https://gitee.com/mirrors/llvm-project.git 1118c2e05e67a36ed8ca250524525cdb66a55256), 64-bit"
#define PG_VERSION_NUM 180000
configure: exit 0
Hi,
I suggest you try with a newer gcc, perhaps 13.4. There's been a bunch
of fixes related to AVX512 since 13.0, chances are this was already
fixed. I don't see this failure on 14.3.1.
T.
On 6/14/25 12:24, Andy Fan wrote:
Hi,
Recently I always get the error below during initdb.
"""
UTC [1358059] FATAL: incorrect checksum in control file
"""
The command is "initdb -D tmp". git bisect shows me that the related commit is
3c6e8c123896584f1be1fe69aaf68dcb5eb094d5. After reverting this commit on
the current master, everything is fine. Does anyone know the reason?
The attached is my config.log.
--
Tomas Vondra
On Sat, Jun 14, 2025 at 03:47:33PM +0200, Tomas Vondra wrote:
I suggest you try with a newer gcc, perhaps 13.4. There's been a bunch
of fixes related to AVX512 since 13.0, chances are this was already
fixed. I don't see this failure on 14.3.1.
From the config.log, it looks like Andy is using clang:
configure:3998: clang --version >&5
clang version 18.1.6 (https://gitee.com/mirrors/llvm-project.git 1118c2e05e67a36ed8ca250524525cdb66a55256)
And I see -O0 used, too, which would match the existing report [0],
although that report is for clang 19.1.7.
I'm also genuinely curious why folks are using -O0...
[0]: /messages/by-id/CAE-ML+-OV6p9uvCFBcSQjZUEh__y0h-KjN+BseyGJHt7u8EP+w@mail.gmail.com
--
nathan
On 6/14/25 15:56, Nathan Bossart wrote:
On Sat, Jun 14, 2025 at 03:47:33PM +0200, Tomas Vondra wrote:
I suggest you try with a newer gcc, perhaps 13.4. There's been a bunch
of fixes related to AVX512 since 13.0, chances are this was already
fixed. I don't see this failure on 14.3.1.
From the config.log, it looks like Andy is using clang:
configure:3998: clang --version >&5
clang version 18.1.6 (https://gitee.com/mirrors/llvm-project.git 1118c2e05e67a36ed8ca250524525cdb66a55256)
And I see -O0 used, too, which would match the existing report [0],
although that report is for clang 19.1.7.
Ah, I got confused by this:
-----------------
Found candidate GCC installation:
/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.3.0
Selected GCC installation:
/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.3.0
-----------------
I'm also genuinely curious why folks are using -O0...
[0] /messages/by-id/CAE-ML+-OV6p9uvCFBcSQjZUEh__y0h-KjN+BseyGJHt7u8EP+w@mail.gmail.com
I personally use -O0 to get better backtraces (without values optimized
out), better valgrind reports, etc.
--
Tomas Vondra
Hi Nathan,
On 6/14/25 9:56 AM, Nathan Bossart wrote:
I'm also genuinely curious why folks are using -O0...
Personally, I use
-O0 -fno-omit-frame-pointer
for FlameGraph [1] investigations.
[1]: https://github.com/brendangregg/FlameGraph
Best regards,
Jesper
Jesper Pedersen <jesperpedersen.db@gmail.com> writes:
Hi,
Thank you Nathan, Tomas and Jesper for the answers. The patch at [0]
works for me and I could work with master smoothly now.
On 6/14/25 9:56 AM, Nathan Bossart wrote:
I'm also genuinely curious why folks are using -O0...
Personally, I use
-O0 -fno-omit-frame-pointer
for FlameGraph [1] investigations.
Same here. I use clang as my compiler because I use clangd for code
indexing; gcc sometimes uses different compiler options, which may break
it. I use '-O0' in my daily coding and only use '-O2' when doing some
performance testing.
[0]: /messages/by-id/CANWCAZbsuavqUK4tg6UF-0-GVRMaq7BafUx4+Dyd12y=-AuFAA@mail.gmail.com
--
Best Regards
Andy Fan
On Sun, Jun 15, 2025 at 8:32 AM Andy Fan <zhihuifan1213@163.com> wrote:
Jesper Pedersen <jesperpedersen.db@gmail.com> writes:
Hi,
Thank you Nathan, Tomas and Jesper for the answers. The patch at [0]
works for me and I could work with master smoothly now.
Pushed, thanks for testing! I'll do some more testing to see what
versions/levels are affected and file a bug report, but it'll be a few
days before I get to it.
--
John Naylor
Amazon Web Services
Attached is a simple reproducer. It passes with clang v16 -O0, but fails with 17 and 18 only when built with -O0.
Build command: clang main.c -O0
Hope this helps.
Raghuveer
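A reproducer of this kind needs a trusted value to compare the optimized result against. The following is a minimal sketch of such a reference, not taken from the attached file: a bit-at-a-time CRC-32C using the reflected Castagnoli polynomial. The name ref_crc32c is a hypothetical helper, not a PostgreSQL function.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bit-at-a-time CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
 * ref_crc32c is a hypothetical helper name used only for illustration.
 * Pass crc = 0 to start; the initial and final inversions happen here,
 * so the return value of one call can be fed into the next.
 */
uint32_t
ref_crc32c(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    crc = ~crc;
    while (len-- > 0)
    {
        crc ^= *p++;
        /* Process one bit per iteration, least significant bit first. */
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (UINT32_C(0x82F63B78) & (0u - (crc & 1u)));
    }
    return ~crc;
}
```

The standard check input "123456789" should yield 0xE3069283. Comparing an optimized routine against this reference on buffers both shorter and longer than 64 bytes would likely exercise the vectorized path as well as the tail handling.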
Attachments:
On Mon, Jun 16, 2025 at 06:31:11PM +0000, Devulapalli, Raghuveer wrote:
Attached is a simple reproducer. It passes with clang v16 -O0, but fails
with 17 and 18 only when built with -O0.
I've just started looking into this, but the difference in code generated
for _mm512_castsi128_si512() between gcc, clang 16, and clang 17 looks
interesting.
--
nathan
Great catch! From the intrinsic manual:
Cast vector of type __m128i to type __m512i; the upper 384 bits of the result are undefined.
Replacing that with _mm512_zextsi128_si512 fixes the problem.
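To make the difference concrete without requiring AVX-512 hardware, here is a scalar sketch that models a 512-bit register as eight 64-bit lanes. The type and function names (vec512_model, model_cast128, model_zext128, model_xor) are illustrative assumptions, and the "stale" argument merely stands in for whatever the register happened to hold; real compilers are free to leave anything there after a cast.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a 512-bit register: eight 64-bit lanes. */
typedef struct
{
    uint64_t lane[8];
} vec512_model;

/*
 * Models _mm512_castsi128_si512: only the low 128 bits (lanes 0 and 1)
 * are specified. The upper lanes keep the stale contents passed in.
 */
vec512_model
model_cast128(uint64_t lo, uint64_t hi, vec512_model stale)
{
    stale.lane[0] = lo;
    stale.lane[1] = hi;
    return stale;
}

/* Models _mm512_zextsi128_si512: the upper 384 bits are explicitly zero. */
vec512_model
model_zext128(uint64_t lo, uint64_t hi)
{
    vec512_model r;

    memset(&r, 0, sizeof(r));
    r.lane[0] = lo;
    r.lane[1] = hi;
    return r;
}

/* Lane-wise XOR, standing in for _mm512_xor_si512. */
vec512_model
model_xor(vec512_model a, vec512_model b)
{
    for (int i = 0; i < 8; i++)
        a.lane[i] ^= b.lane[i];
    return a;
}
```

In this model, XORing the zero-extended CRC into the first 64 bytes of data changes only the low lanes, while the cast version also flips bits in lanes 2 to 7 whenever the stale contents are nonzero, which is consistent with the checksum failure reported above.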
"Devulapalli, Raghuveer" <raghuveer.devulapalli@intel.com> writes:
Great catch! From the intrinsic manual:
Cast vector of type __m128i to type __m512i; the upper 384 bits of the
result are undefined.
Just be curious, what kind of optimization (like what -O2 does) could
mask this issue?
Replacing that with _mm512_zextsi128_si512 fixes the problem.
congratulations!
--
Best Regards
Andy Fan
Just be curious, what kind of optimization (like what -O2 does) could mask this
issue?
>= -O1. Only -O0 shows this problem.
Raghuveer
Let me know if this fixes the failure. This is technically not a compiler bug.
Attachments:
v1-0001-Fix-incorrect-checksum-calculation-when-build-wit.patch (application/octet-stream)
From aed317b83f012dea78d5e67f3e6e27af3e4141b2 Mon Sep 17 00:00:00 2001
From: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Date: Mon, 16 Jun 2025 21:16:43 -0700
Subject: [PATCH v1] Fix incorrect checksum calculation when build with -O0 opt
flag
bug: fix checksum errors caused by undefined AVX-512 bits
Problem:
- _mm512_castsi128_si512 leaves upper 384 bits in undefined state
- Results in checksum failures when building PostgreSQL with clang -O0
Solution:
- Use _mm512_zextsi128_si512 to explicitly zero-extend __m128i to __m512i
---
src/port/pg_crc32c_sse42.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 9af3474a6ca..1a717255355 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -123,7 +123,7 @@ pg_comp_crc32c_avx512(pg_crc32c crc, const void *data, size_t len)
__m512i k;
k = _mm512_broadcast_i32x4(_mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0));
- x0 = _mm512_xor_si512(_mm512_castsi128_si512(_mm_cvtsi32_si128(crc0)), x0);
+ x0 = _mm512_xor_si512(_mm512_zextsi128_si512(_mm_cvtsi32_si128(crc0)), x0);
buf += 64;
/* Main loop. */
--
2.43.0
On Tue, Jun 17, 2025 at 6:40 AM Andy Fan <zhihuifan1213@163.com> wrote:
"Devulapalli, Raghuveer" <raghuveer.devulapalli@intel.com> writes:
Great catch! From the intrinsic manual:
Cast vector of type __m128i to type __m512i; the upper 384 bits of the
result are undefined.
Thanks Raghuveer and Nathan, for the diagnosis!
Just be curious, what kind of optimization (like what -O2 does) could
mask this issue?
In case Andy is asking about "how" rather than "under what
circumstances", my guess is: -O1+ may have just chosen instructions
that also happen to zero-extend, which are common. -O0 doesn't
represent the naive straightforward structure of what the programmer
wrote, it's more like an "exploded" representation suitable for later
optimization passes. That's why it always looks goofy.
Replacing that with _mm512_zextsi128_si512 fixes the problem.
Here's a patch for testing, which also reverts the previous
workaround. Help welcome, but I still promise to test it in the near
future regardless.
--
John Naylor
Amazon Web Services
Attachments:
v1-zero-extend-instead-of-cast.patch (text/x-patch; charset=US-ASCII)
diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.c
index 9af3474a6ca..1a717255355 100644
--- a/src/port/pg_crc32c_sse42.c
+++ b/src/port/pg_crc32c_sse42.c
@@ -123,7 +123,7 @@ pg_comp_crc32c_avx512(pg_crc32c crc, const void *data, size_t len)
__m512i k;
k = _mm512_broadcast_i32x4(_mm_setr_epi32(0x740eef02, 0, 0x9e4addf8, 0));
- x0 = _mm512_xor_si512(_mm512_castsi128_si512(_mm_cvtsi32_si128(crc0)), x0);
+ x0 = _mm512_xor_si512(_mm512_zextsi128_si512(_mm_cvtsi32_si128(crc0)), x0);
buf += 64;
/* Main loop. */
diff --git a/src/port/pg_crc32c_sse42_choose.c b/src/port/pg_crc32c_sse42_choose.c
index 802e47788c1..74d2421ba2b 100644
--- a/src/port/pg_crc32c_sse42_choose.c
+++ b/src/port/pg_crc32c_sse42_choose.c
@@ -95,9 +95,7 @@ pg_comp_crc32c_choose(pg_crc32c crc, const void *data, size_t len)
__cpuidex(exx, 7, 0);
#endif
-#if defined(__clang__) && !defined(__OPTIMIZE__)
- /* Some versions of clang are broken at -O0 */
-#elif defined(USE_AVX512_CRC32C_WITH_RUNTIME_CHECK)
+#ifdef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
if (exx[2] & (1 << 10) && /* VPCLMULQDQ */
exx[1] & (1 << 31)) /* AVX512-VL */
pg_comp_crc32c = pg_comp_crc32c_avx512;
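The feature test in the hunk above can be read in isolation as a pure function of the CPUID leaf 7 (subleaf 0) registers. A minimal sketch, assuming the same register order as the patch context (exx[1] is EBX, exx[2] is ECX); the function name cpu_has_avx512_clmul is hypothetical, and a complete check would also have to confirm that the OS saves ZMM register state, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide from CPUID.(EAX=7,ECX=0) output whether the
 * CPU advertises both VPCLMULQDQ (ECX bit 10) and AVX512-VL (EBX bit 31).
 * exx[] holds EAX, EBX, ECX, EDX in that order.
 */
bool
cpu_has_avx512_clmul(const uint32_t exx[4])
{
    const uint32_t ecx_vpclmulqdq = UINT32_C(1) << 10;
    const uint32_t ebx_avx512vl = UINT32_C(1) << 31;

    return (exx[2] & ecx_vpclmulqdq) != 0 &&
           (exx[1] & ebx_avx512vl) != 0;
}
```

The sketch uses unsigned constants so that the bit 31 mask is well defined in standard C; both bits must be set before the AVX-512 routine is chosen.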
On Tue, Jun 17, 2025 at 03:55:06PM +0700, John Naylor wrote:
"Devulapalli, Raghuveer" <raghuveer.devulapalli@intel.com> writes:
Replacing that with _mm512_zextsi128_si512 fixes the problem.
Here's a patch for testing, which also reverts the previous
workaround. Help welcome, but I still promise to test it in the near
future regardless.
LGTM
--
nathan
In case Andy is asking about "how" rather than "under what circumstances", my
guess is: -O1+ may have just chosen instructions that also happen to zero-extend,
which are common. -O0 doesn't represent the naive straightforward structure of
what the programmer wrote, it's more like an "exploded" representation suitable
for later optimization passes. That's why it always looks goofy.
Hah yeah. I missed the "how" part of the question but your explanation makes sense.
Replacing that with _mm512_zextsi128_si512 fixes the problem.
Here's a patch for testing, which also reverts the previous workaround. Help
welcome, but I still promise to test it in the near future regardless.
LGTM.
Raghuveer
John Naylor <johncnaylorls@gmail.com> writes:
Hi,
Just be curious, what kind of optimization (like what -O2 does) could
mask this issue?
In case Andy is asking about "how" rather than "under what
circumstances", my guess is: -O1+ may have just chosen instructions
that also happen to zero-extend, which are common. -O0 doesn't
represent the naive straightforward structure of what the programmer
wrote, it's more like an "exploded" representation suitable for later
optimization passes. That's why it always looks goofy.
Thanks for the explanation!
Replacing that with _mm512_zextsi128_si512 fixes the problem.
Here's a patch for testing, which also reverts the previous
workaround. Help welcome, but I still promise to test it in the near
future regardless.
I verified your patch; it works for me.
--
Best Regards
Andy Fan
Verified that the patch works w/ clang-19 -O0 and that avx-512 was selected
for the CRC at runtime. Thanks for fixing this.
Regards,
Deep (VMware)
On Tue, Jun 17, 2025 at 3:55 PM John Naylor <johncnaylorls@gmail.com> wrote:
Replacing that with _mm512_zextsi128_si512 fixes the problem.
Here's a patch for testing, which also reverts the previous
workaround.
Pushed, thanks everyone!
--
John Naylor
Amazon Web Services
John Naylor <johncnaylorls@gmail.com> writes:
Pushed, thanks everyone!
This has broken the build completely on my RHEL8 x86_64 box,
with gcc 8.5.0:
$ ./configure ...
$ make -s
pg_crc32c_sse42.c: In function 'pg_comp_crc32c_avx512':
pg_crc32c_sse42.c:126:25: warning: implicit declaration of function '_mm512_zextsi128_si512'; did you mean '_mm512_castsi128_si512'? [-Wimplicit-function-declaration]
x0 = _mm512_xor_si512(_mm512_zextsi128_si512(_mm_cvtsi32_si128(crc0)), x0);
^~~~~~~~~~~~~~~~~~~~~~
_mm512_castsi128_si512
pg_crc32c_sse42.c:126:25: error: incompatible type for argument 1 of '_mm512_xor_si512'
x0 = _mm512_xor_si512(_mm512_zextsi128_si512(_mm_cvtsi32_si128(crc0)), x0);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/lib/gcc/x86_64-redhat-linux/8/include/immintrin.h:45,
from pg_crc32c_sse42.c:19:
/usr/lib/gcc/x86_64-redhat-linux/8/include/avx512fintrin.h:7267:27: note: expected '__m512i' {aka '__vector(8) long long int'} but argument is of type 'int'
_mm512_xor_si512 (__m512i __A, __m512i __B)
~~~~~~~~^~~
make[2]: *** [<builtin>: pg_crc32c_sse42.o] Error 1
I see similar symptoms on buildfarm animal conchuela, which
is DragonFly BSD of some vintage or other. Not sure why
more animals aren't complaining. Anyway, it seems that the
configure probe to see if this facility is available had
better be adjusted to match the new code.
regards, tom lane
On Mon, Jun 23, 2025 at 10:51:21AM -0400, Tom Lane wrote:
This has broken the build completely on my RHEL8 x86_64 box,
with gcc 8.5.0:
$ ./configure ...
$ make -s
pg_crc32c_sse42.c: In function 'pg_comp_crc32c_avx512':
pg_crc32c_sse42.c:126:25: warning: implicit declaration of function '_mm512_zextsi128_si512'; did you mean '_mm512_castsi128_si512'? [-Wimplicit-function-declaration]
x0 = _mm512_xor_si512(_mm512_zextsi128_si512(_mm_cvtsi32_si128(crc0)), x0);
^~~~~~~~~~~~~~~~~~~~~~
_mm512_castsi128_si512
It looks like these weren't added until GCC 10 [0].
I see similar symptoms on buildfarm animal conchuela, which
is DragonFly BSD of some vintage or other. Not sure why
more animals aren't complaining. Anyway, it seems that the
configure probe to see if this facility is available had
better be adjusted to match the new code.
Unfortunately, this will probably require more than replacing
_mm512_castsi512_si128 with _mm512_zextsi512_si128 because the latter
doesn't exist.
[0]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83250
--
nathan
Nathan Bossart <nathandbossart@gmail.com> writes:
On Mon, Jun 23, 2025 at 10:51:21AM -0400, Tom Lane wrote:
This has broken the build completely on my RHEL8 x86_64 box,
with gcc 8.5.0:
Unfortunately, this will probably require more than replacing
_mm512_castsi512_si128 with _mm512_zextsi512_si128 because the latter
doesn't exist.
I was imagining just including _mm512_zextsi128_si512() in the
code being tested during configure, so that we fall back to
the non-AVX-512 code if the compiler is too old to have it.
I don't really feel a need to work harder than that.
regards, tom lane
On Mon, Jun 23, 2025 at 11:10:45AM -0400, Tom Lane wrote:
Nathan Bossart <nathandbossart@gmail.com> writes:
On Mon, Jun 23, 2025 at 10:51:21AM -0400, Tom Lane wrote:
This has broken the build completely on my RHEL8 x86_64 box,
with gcc 8.5.0:
Unfortunately, this will probably require more than replacing
_mm512_castsi512_si128 with _mm512_zextsi512_si128 because the latter
doesn't exist.
I was imagining just including _mm512_zextsi128_si512() in the
code being tested during configure, so that we fall back to
the non-AVX-512 code if the compiler is too old to have it.
I don't really feel a need to work harder than that.
Sorry, my note wasn't clear. Right now, the configure test uses
_mm512_castsi512_si128(), so we can't just do a simple s/cast/zext. We'll
need to make a slightly bigger modification to the test to make sure the
zext intrinsics are understood. I agree that we needn't work any harder
than that.
--
nathan
Nathan Bossart <nathandbossart@gmail.com> writes:
On Mon, Jun 23, 2025 at 11:10:45AM -0400, Tom Lane wrote:
I was imagining just including _mm512_zextsi128_si512() in the
code being tested during configure, so that we fall back to
the non-AVX-512 code if the compiler is too old to have it.
I don't really feel a need to work harder than that.
Sorry, my note wasn't clear. Right now, the configure test uses
_mm512_castsi512_si128(), so we can't just do a simple s/cast/zext. We'll
need to make a slightly bigger modification to the test to make sure the
zext intrinsics are understood. I agree that we needn't work any harder
than that.
The code still uses _mm512_castsi512_si128, so I think removing it
from the configure snippet might not be bright. I adapted what's
there now to get the attached, which builds successfully on my old
compiler. I still need to check it on a newer compiler.
regards, tom lane
Attachments:
test-for-new-AVX-intrinsic-too.patch (text/x-diff; charset=us-ascii)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 5f3e1d1faf9..da40bd6a647 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -602,6 +602,7 @@ AC_CACHE_CHECK([for _mm512_clmulepi64_epi128], [Ac_cachevar],
{
__m128i z;
+ x = _mm512_xor_si512(_mm512_zextsi128_si512(_mm_cvtsi32_si128(0)), x);
y = _mm512_clmulepi64_epi128(x, y, 0);
z = _mm_ternarylogic_epi64(
_mm512_castsi512_si128(y),
diff --git a/configure b/configure
index 4f15347cc95..3d3d3db97a4 100755
--- a/configure
+++ b/configure
@@ -18227,6 +18227,7 @@ else
{
__m128i z;
+ x = _mm512_xor_si512(_mm512_zextsi128_si512(_mm_cvtsi32_si128(0)), x);
y = _mm512_clmulepi64_epi128(x, y, 0);
z = _mm_ternarylogic_epi64(
_mm512_castsi512_si128(y),
diff --git a/meson.build b/meson.build
index 474763ad19f..6ffe7b47275 100644
--- a/meson.build
+++ b/meson.build
@@ -2465,6 +2465,7 @@ int main(void)
{
__m128i z;
+ x = _mm512_xor_si512(_mm512_zextsi128_si512(_mm_cvtsi32_si128(0)), x);
y = _mm512_clmulepi64_epi128(x, y, 0);
z = _mm_ternarylogic_epi64(
_mm512_castsi512_si128(y),
On Mon, Jun 23, 2025 at 11:29:32AM -0400, Tom Lane wrote:
The code still uses _mm512_castsi512_si128, so I think removing it
from the configure snippet might not be bright.
Ah, right. I'm not firing on all cylinders this morning, apparently.
I adapted what's
there now to get the attached, which builds successfully on my old
compiler. I still need to check it on a newer compiler.
LGTM
--
nathan
On Mon, Jun 23, 2025 at 10:05 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:
On Mon, Jun 23, 2025 at 10:51:21AM -0400, Tom Lane wrote:
This has broken the build completely on my RHEL8 x86_64 box,
with gcc 8.5.0:
$ ./configure ...
$ make -s
pg_crc32c_sse42.c: In function 'pg_comp_crc32c_avx512':
pg_crc32c_sse42.c:126:25: warning: implicit declaration of function '_mm512_zextsi128_si512'; did you mean '_mm512_castsi128_si512'? [-Wimplicit-function-declaration]
x0 = _mm512_xor_si512(_mm512_zextsi128_si512(_mm_cvtsi32_si128(crc0)), x0);
^~~~~~~~~~~~~~~~~~~~~~
_mm512_castsi128_si512
It looks like these weren't added until GCC 10 [0].
Huh, that's surprising because the Intel manual put it in AVX-512F,
the basic core around which everything else is tacked on.
--
John Naylor
Amazon Web Services
On Tue, Jun 17, 2025 at 1:55 AM John Naylor <johncnaylorls@gmail.com> wrote:
I took the minimal repro from [1] and took a look at the code generated
between clang 17 -O0 [2] and clang 17 -O3 [3]. I saw that -O3 (and
actually -O1 and -O2) generated the following code for:
castval = _mm512_castsi128_si512(_mm_cvtsi32_si128(crc0));
x0 = _mm512_xor_si512(castval, x0);
vinserti128 ymm0, ymm0, xmmword ptr [rip + .LCPI1_0], 0
vpxorq zmm0, zmm0, zmmword ptr [rdi]
Reading vpxorq's pseudocode [4], it seems that it zeroes out the leading
bits:
DEST[MAXVL-1:VL] := 0
The same holds for clang 17 -O0 when _mm512_zextsi128_si512 is used
instead [5]: vpxor and vbroadcast128 are used, which also seem to
zero out the leading bits.
So -O1..-O3 were indeed emitting instructions that zero-extend, thus
avoiding the undefined behavior.
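The pseudocode line DEST[MAXVL-1:VL] := 0 can be read as: a register write of VL bits clears everything above it up to MAXVL. A scalar sketch of that rule follows; the lane layout and the names MODEL_MAXVL_LANES and model_vector_write are illustrative assumptions, not anything taken from the instruction reference or the linked compiler output.

```c
#include <stdint.h>

#define MODEL_MAXVL_LANES 8     /* 512 bits modeled as eight 64-bit lanes */

/*
 * Hypothetical model of a vector register write of vl_lanes 64-bit lanes:
 * the low lanes receive the result and every lane above them is zeroed,
 * mirroring DEST[MAXVL-1:VL] := 0.
 */
void
model_vector_write(uint64_t dest[MODEL_MAXVL_LANES],
                   const uint64_t *result, int vl_lanes)
{
    int i;

    for (i = 0; i < vl_lanes; i++)
        dest[i] = result[i];
    for (; i < MODEL_MAXVL_LANES; i++)
        dest[i] = 0;
}
```

In this model a two-lane (128-bit) write leaves lanes 2 to 7 zero regardless of what the register held before. When VL equals MAXVL, as for a full 512-bit operation, no upper lanes remain to be cleared, so any zero-extension would have to come from a narrower write earlier in the sequence.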
[1]: /messages/by-id/PH8PR11MB8286A89AF2B104044187E54DFB70A@PH8PR11MB8286.namprd11.prod.outlook.com
[2]: https://godbolt.org/z/ahx9PePYr
[3]: https://godbolt.org/z/W4WPzjnbb
[4]: https://www.felixcloutier.com/x86/pxor#vpxorq--evex-encoded-versions-
[5]: https://godbolt.org/z/46brvrnnv
Regards,
Deep (VMware)