[PATCH] Fix crash in int8_avg_combine().

Started by Hadi Moshayediabout 8 years ago5 messages
#1Hadi Moshayedi
hadi@moshayedi.net
1 attachment(s)

While doing some tests on REL_10_STABLE, I was getting run-time exceptions
at int8_avg_combine() at the following line:

state1->sumX = state2->sumX;

After some debugging, I noticed that palloc()’s alignment is 8-bytes, while
this statement (which moves a __int128 from one memory location to another
memory location) expects 16-byte memory alignments. So when either state1
or state2 is not 16-byte aligned, this crashes.

When I disassemble the code, the above statement is translated to a pair of
movdqa and movaps assignments when compiled with -O2:

movdqa c(%rdx), %xmm0
movaps %xmm0, c(%rcx)

Looking at “Intel 64 and IA-32 Architectures Software Developer’s Manual,
Volume 2”, both of these instructions expect 16-byte aligned memory
locations, or a general-protection exception will be generated.

I wasn’t getting the exception always. For example, the same query that
crashed in REL_10_STABLE didn’t crash in master. But I think that was just
lucky allocation, and we will eventually see cases of this crash in all
versions that use __int128 assignment in int8_avg_combine().

I’ve attached a patch which sets the MAXIMUM_ALIGNOF correctly when
__int128’s are available, so fixes the crash. This patch is on top of
REL_10_STABLE, which is the branch I was getting the crash at. I can create
patches for other branches if we think this is the proper change.

The sequence for which I got the crash in REL_10_STABLE was the following
sequence of commands after a fresh initdb:

CREATE TABLE t(a BIGINT);
INSERT INTO t SELECT * FROM generate_series(1, 10000000);
SELECT sum(a) FROM t;

I’m not sure if this will crash for everyone, since you can be lucky and
have both states assigned 16-byte aligned locations. This was the case for
me in master branch. "SELECT version()" is "PostgreSQL 10.0 on
x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.2.1 20170915 (Red Hat
7.2.1-2), 64-bit", if that is related.

I notice that there’s a comment in configure.in that says the penalty of
using 16-bit alignment might be too much for disk and memory space. If this
is the case, then we need to modify numeric.c to make sure that it never
use any instructions with 16-byte alignment requirements. I can do that if
that is the consensus.

Or maybe we can create an allocation function for 16-byte aligned
allocations.

Any thoughts?

-- Hadi

Attachments:

fix_alignment.REL10_0_STABLE.patchtext/x-patch; charset=US-ASCII; name=fix_alignment.REL10_0_STABLE.patchDownload
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 7275ea69fe..9fea911e9c 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -126,6 +126,7 @@ return 1;
 ])],
 [pgac_cv__128bit_int=yes],
 [pgac_cv__128bit_int=no])])
+HAVE_128BIT_INT=$pgac_cv__128bit_int
 if test x"$pgac_cv__128bit_int" = xyes ; then
   AC_DEFINE(PG_INT128_TYPE, __int128, [Define to the name of a signed 128-bit integer type.])
 fi])# PGAC_TYPE_128BIT_INT
diff --git a/configure b/configure
index 5a123e8a8a..1813ea3414 100755
--- a/configure
+++ b/configure
@@ -14624,6 +14624,59 @@ cat >>confdefs.h <<_ACEOF
 _ACEOF
 
 
+# Check for extensions offering the integer scalar type __int128.
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __int128" >&5
+$as_echo_n "checking for __int128... " >&6; }
+if ${pgac_cv__128bit_int+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+/*
+ * These are globals to discourage the compiler from folding all the
+ * arithmetic tests down to compile-time constants.  We do not have
+ * convenient support for 64bit literals at this point...
+ */
+__int128 a = 48828125;
+__int128 b = 97656255;
+
+int
+main ()
+{
+
+__int128 c,d;
+a = (a << 12) + 1; /* 200000000001 */
+b = (b << 12) + 5; /* 400000000005 */
+/* use the most relevant arithmetic ops */
+c = a * b;
+d = (c + b) / b;
+/* return different values, to prevent optimizations */
+if (d != a+1)
+  return 0;
+return 1;
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv__128bit_int=yes
+else
+  pgac_cv__128bit_int=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__128bit_int" >&5
+$as_echo "$pgac_cv__128bit_int" >&6; }
+HAVE_128BIT_INT=$pgac_cv__128bit_int
+if test x"$pgac_cv__128bit_int" = xyes ; then
+
+$as_echo "#define PG_INT128_TYPE __int128" >>confdefs.h
+
+fi
+
 # Determine memory alignment requirements for the basic C data types.
 
 # The cast to long int works around a bug in the HP C Compiler,
@@ -14767,6 +14820,43 @@ cat >>confdefs.h <<_ACEOF
 _ACEOF
 
 
+fi
+if test x"$HAVE_128BIT_INT" = x"yes" ; then
+  # The cast to long int works around a bug in the HP C Compiler,
+# see AC_CHECK_SIZEOF for more information.
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking alignment of __int128" >&5
+$as_echo_n "checking alignment of __int128... " >&6; }
+if ${ac_cv_alignof___int128+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  if ac_fn_c_compute_int "$LINENO" "(long int) offsetof (ac__type_alignof_, y)" "ac_cv_alignof___int128"        "$ac_includes_default
+#ifndef offsetof
+# define offsetof(type, member) ((char *) &((type *) 0)->member - (char *) 0)
+#endif
+typedef struct { char x; __int128 y; } ac__type_alignof_;"; then :
+
+else
+  if test "$ac_cv_type___int128" = yes; then
+     { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+as_fn_error 77 "cannot compute alignment of __int128
+See \`config.log' for more details" "$LINENO" 5; }
+   else
+     ac_cv_alignof___int128=0
+   fi
+fi
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_alignof___int128" >&5
+$as_echo "$ac_cv_alignof___int128" >&6; }
+
+
+
+cat >>confdefs.h <<_ACEOF
+#define ALIGNOF___INT128 $ac_cv_alignof___int128
+_ACEOF
+
+
 fi
 # The cast to long int works around a bug in the HP C Compiler,
 # see AC_CHECK_SIZEOF for more information.
@@ -14815,6 +14905,9 @@ fi
 if test x"$HAVE_LONG_LONG_INT_64" = xyes && test $MAX_ALIGNOF -lt $ac_cv_alignof_long_long_int ; then
   MAX_ALIGNOF="$ac_cv_alignof_long_long_int"
 fi
+if test x"$HAVE_128BIT_INT" = xyes && test $MAX_ALIGNOF -lt $ac_cv_alignof___int128 ; then
+  MAX_ALIGNOF="$ac_cv_alignof___int128"
+fi
 
 cat >>confdefs.h <<_ACEOF
 #define MAXIMUM_ALIGNOF $MAX_ALIGNOF
@@ -14866,58 +14959,6 @@ _ACEOF
 fi
 
 
-# Check for extensions offering the integer scalar type __int128.
-{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __int128" >&5
-$as_echo_n "checking for __int128... " >&6; }
-if ${pgac_cv__128bit_int+:} false; then :
-  $as_echo_n "(cached) " >&6
-else
-  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
-/* end confdefs.h.  */
-
-/*
- * These are globals to discourage the compiler from folding all the
- * arithmetic tests down to compile-time constants.  We do not have
- * convenient support for 64bit literals at this point...
- */
-__int128 a = 48828125;
-__int128 b = 97656255;
-
-int
-main ()
-{
-
-__int128 c,d;
-a = (a << 12) + 1; /* 200000000001 */
-b = (b << 12) + 5; /* 400000000005 */
-/* use the most relevant arithmetic ops */
-c = a * b;
-d = (c + b) / b;
-/* return different values, to prevent optimizations */
-if (d != a+1)
-  return 0;
-return 1;
-
-  ;
-  return 0;
-}
-_ACEOF
-if ac_fn_c_try_link "$LINENO"; then :
-  pgac_cv__128bit_int=yes
-else
-  pgac_cv__128bit_int=no
-fi
-rm -f core conftest.err conftest.$ac_objext \
-    conftest$ac_exeext conftest.$ac_ext
-fi
-{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__128bit_int" >&5
-$as_echo "$pgac_cv__128bit_int" >&6; }
-if test x"$pgac_cv__128bit_int" = xyes ; then
-
-$as_echo "#define PG_INT128_TYPE __int128" >>confdefs.h
-
-fi
-
 # Check for various atomic operations now that we have checked how to declare
 # 64bit integers.
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for builtin __sync char locking functions" >&5
diff --git a/configure.in b/configure.in
index 68a1e0c184..e4a5ae7a10 100644
--- a/configure.in
+++ b/configure.in
@@ -1814,6 +1814,9 @@ fi
 AC_MSG_RESULT([$enable_float8_byval])
 AC_DEFINE_UNQUOTED([FLOAT8PASSBYVAL], [$float8passbyval], [float8, int8, and related values are passed by value if 'true', by reference if 'false'])
 
+# Check for extensions offering the integer scalar type __int128.
+PGAC_TYPE_128BIT_INT
+
 # Determine memory alignment requirements for the basic C data types.
 
 AC_CHECK_ALIGNOF(short)
@@ -1822,6 +1825,9 @@ AC_CHECK_ALIGNOF(long)
 if test x"$HAVE_LONG_LONG_INT_64" = x"yes" ; then
   AC_CHECK_ALIGNOF(long long int)
 fi
+if test x"$HAVE_128BIT_INT" = x"yes" ; then
+  AC_CHECK_ALIGNOF(__int128)
+fi
 AC_CHECK_ALIGNOF(double)
 
 # Compute maximum alignment of any basic type.
@@ -1835,6 +1841,9 @@ fi
 if test x"$HAVE_LONG_LONG_INT_64" = xyes && test $MAX_ALIGNOF -lt $ac_cv_alignof_long_long_int ; then
   MAX_ALIGNOF="$ac_cv_alignof_long_long_int"
 fi
+if test x"$HAVE_128BIT_INT" = xyes && test $MAX_ALIGNOF -lt $ac_cv_alignof___int128 ; then
+  MAX_ALIGNOF="$ac_cv_alignof___int128"
+fi
 AC_DEFINE_UNQUOTED(MAXIMUM_ALIGNOF, $MAX_ALIGNOF, [Define as the maximum alignment requirement of any C data type.])
 
 
@@ -1843,9 +1852,6 @@ AC_DEFINE_UNQUOTED(MAXIMUM_ALIGNOF, $MAX_ALIGNOF, [Define as the maximum alignme
 AC_CHECK_TYPES([int8, uint8, int64, uint64], [], [],
 [#include <stdio.h>])
 
-# Check for extensions offering the integer scalar type __int128.
-PGAC_TYPE_128BIT_INT
-
 # Check for various atomic operations now that we have checked how to declare
 # 64bit integers.
 PGAC_HAVE_GCC__SYNC_CHAR_TAS
diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr/aset.c
index 7033042e2d..9311fb7852 100644
--- a/src/backend/utils/mmgr/aset.c
+++ b/src/backend/utils/mmgr/aset.c
@@ -68,7 +68,7 @@
  * CAUTION: ALLOC_MINBITS must be large enough so that
  * 1<<ALLOC_MINBITS is at least MAXALIGN,
  * or we may fail to align the smallest chunks adequately.
- * 8-byte alignment is enough on all currently known machines.
+ * 16-byte alignment is enough on all currently known machines.
  *
  * With the current parameters, request sizes up to 8K are treated as chunks,
  * larger requests go into dedicated blocks.  Change ALLOCSET_NUM_FREELISTS
@@ -78,7 +78,7 @@
  *--------------------
  */
 
-#define ALLOC_MINBITS		3	/* smallest chunk size is 8 bytes */
+#define ALLOC_MINBITS		4	/* smallest chunk size is 16 bytes */
 #define ALLOCSET_NUM_FREELISTS	11
 #define ALLOC_CHUNK_LIMIT	(1 << (ALLOCSET_NUM_FREELISTS-1+ALLOC_MINBITS))
 /* Size of largest chunk that we use a fixed size for */
@@ -166,7 +166,7 @@ typedef struct AllocChunkData
 	/* when debugging memory usage, also store actual requested size */
 	/* this is zero in a free chunk */
 	Size		requested_size;
-#if MAXIMUM_ALIGNOF > 4 && SIZEOF_VOID_P == 4
+#if (MAXIMUM_ALIGNOF > 4 && SIZEOF_VOID_P == 4) || (MAXIMUM_ALIGNOF > 8 && SIZEOF_VOID_P == 8)
 	Size		padding;
 #endif
 
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index c65dd7db21..3c99af4667 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -30,6 +30,9 @@
 /* The normal alignment of `short', in bytes. */
 #undef ALIGNOF_SHORT
 
+/* The normal alignment of `__int128', in bytes. */
+#undef ALIGNOF___INT128
+
 /* Size of a disk block --- this also limits the size of a tuple. You can set
    it bigger if you need bigger tuples (although TOAST should reduce the need
    to have large tuples, since fields can be spread across multiple tuples).
diff --git a/src/include/utils/memutils.h b/src/include/utils/memutils.h
index 869c59dc85..2dc59e89cd 100644
--- a/src/include/utils/memutils.h
+++ b/src/include/utils/memutils.h
@@ -189,7 +189,7 @@ extern MemoryContext SlabContextCreate(MemoryContext parent,
  * Few callers should be interested in this, but tuplesort/tuplestore need
  * to know it.
  */
-#define ALLOCSET_SEPARATE_THRESHOLD  8192
+#define ALLOCSET_SEPARATE_THRESHOLD  16384
 
 #define SLAB_DEFAULT_BLOCK_SIZE		(8 * 1024)
 #define SLAB_LARGE_BLOCK_SIZE		(8 * 1024 * 1024)
#2Andres Freund
andres@citusdata.com
In reply to: Hadi Moshayedi (#1)
Re: [PATCH] Fix crash in int8_avg_combine().

Hi Hadi,

On 2017-11-25 22:43:49 -0500, Hadi Moshayedi wrote:

While doing some tests on REL_10_STABLE, I was getting run-time exceptions
at int8_avg_combine() at the following line:

state1->sumX = state2->sumX;

After some debugging, I noticed that palloc()’s alignment is 8-bytes, while
this statement (which moves a __int128 from one memory location to another
memory location) expects 16-byte memory alignments. So when either state1
or state2 is not 16-byte aligned, this crashes.

When I disassemble the code, the above statement is translated to a pair of
movdqa and movaps assignments when compiled with -O2:

movdqa c(%rdx), %xmm0
movaps %xmm0, c(%rcx)

Looking at “Intel 64 and IA-32 Architectures Software Developer’s Manual,
Volume 2”, both of these instructions expect 16-byte aligned memory
locations, or a general-protection exception will be generated.

Nicely analyzed. [Un]fortunately the bug has already been found and
fixed:
https://git.postgresql.org/pg/commitdiff/619a8c47da7279c186bb57cc16b26ad011366b73

Will be included in the next set of back branch releases.

diff --git a/src/include/utils/memutils.h b/src/include/utils/memutils.h
index 869c59dc85..2dc59e89cd 100644
--- a/src/include/utils/memutils.h
+++ b/src/include/utils/memutils.h
@@ -189,7 +189,7 @@ extern MemoryContext SlabContextCreate(MemoryContext parent,
* Few callers should be interested in this, but tuplesort/tuplestore need
* to know it.
*/
-#define ALLOCSET_SEPARATE_THRESHOLD  8192
+#define ALLOCSET_SEPARATE_THRESHOLD  16384

Huh, what's that about in this context?

Greetings,

Andres Freund

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hadi Moshayedi (#1)
Re: [PATCH] Fix crash in int8_avg_combine().

Hadi Moshayedi <hadi@moshayedi.net> writes:

While doing some tests on REL_10_STABLE, I was getting run-time exceptions
at int8_avg_combine() at the following line:
state1->sumX = state2->sumX;
After some debugging, I noticed that palloc()’s alignment is 8-bytes, while
this statement (which moves a __int128 from one memory location to another
memory location) expects 16-byte memory alignments. So when either state1
or state2 is not 16-byte aligned, this crashes.

I believe we already dealt with this:

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: REL_10_STABLE [619a8c47d] 2017-11-14 17:49:49 -0500
Branch: REL9_6_STABLE [4a15f87d2] 2017-11-14 17:49:49 -0500
Branch: REL9_5_STABLE [d4e38489f] 2017-11-14 17:49:49 -0500

Prevent int128 from requiring more than MAXALIGN alignment.

Our initial work with int128 neglected alignment considerations, an
oversight that came back to bite us in bug #14897 from Vincent Lachenal.
It is unsurprising that int128 might have a 16-byte alignment requirement;
what's slightly more surprising is that even notoriously lax Intel chips
sometimes enforce that.

Raising MAXALIGN seems out of the question: the costs in wasted disk and
memory space would be significant, and there would also be an on-disk
compatibility break. Nor does it seem very practical to try to allow some
data structures to have more-than-MAXALIGN alignment requirement, as we'd
have to push knowledge of that throughout various code that copies data
structures around.

The only way out of the box is to make type int128 conform to the system's
alignment assumptions. Fortunately, gcc supports that via its
__attribute__(aligned()) pragma; and since we don't currently support
int128 on non-gcc-workalike compilers, we shouldn't be losing any platform
support this way.

Although we could have just done pg_attribute_aligned(MAXIMUM_ALIGNOF) and
called it a day, I did a little bit of extra work to make the code more
portable than that: it will also support int128 on compilers without
__attribute__(aligned()), if the native alignment of their 128-bit-int
type is no more than that of int64.

Add a regression test case that exercises the one known instance of the
problem, in parallel aggregation over a bigint column.

Back-patch of commit 751804998. The code known to be affected only exists
in 9.6 and later, but we do have some stuff using int128 in 9.5, so patch
back to 9.5.

Discussion: /messages/by-id/20171110185747.31519.28038@wrigleys.postgresql.org

Does that solution not work for you?

regards, tom lane

#4Hadi Moshayedi
hadi@moshayedi.net
In reply to: Tom Lane (#3)
Re: [PATCH] Fix crash in int8_avg_combine().

On Sat, Nov 25, 2017 at 10:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I believe we already dealt with this:

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: REL_10_STABLE [619a8c47d] 2017-11-14 17:49:49 -0500
Branch: REL9_6_STABLE [4a15f87d2] 2017-11-14 17:49:49 -0500
Branch: REL9_5_STABLE [d4e38489f] 2017-11-14 17:49:49 -0500

Prevent int128 from requiring more than MAXALIGN alignment.

Our initial work with int128 neglected alignment considerations, an
oversight that came back to bite us in bug #14897 from Vincent
Lachenal.
It is unsurprising that int128 might have a 16-byte alignment
requirement;
what's slightly more surprising is that even notoriously lax Intel
chips
sometimes enforce that.

Raising MAXALIGN seems out of the question: the costs in wasted disk
and
memory space would be significant, and there would also be an on-disk
compatibility break. Nor does it seem very practical to try to allow
some
data structures to have more-than-MAXALIGN alignment requirement, as
we'd
have to push knowledge of that throughout various code that copies data
structures around.

The only way out of the box is to make type int128 conform to the
system's
alignment assumptions. Fortunately, gcc supports that via its
__attribute__(aligned()) pragma; and since we don't currently support
int128 on non-gcc-workalike compilers, we shouldn't be losing any
platform
support this way.

Although we could have just done pg_attribute_aligned(MAXIMUM_ALIGNOF)
and
called it a day, I did a little bit of extra work to make the code more
portable than that: it will also support int128 on compilers without
__attribute__(aligned()), if the native alignment of their 128-bit-int
type is no more than that of int64.

Add a regression test case that exercises the one known instance of the
problem, in parallel aggregation over a bigint column.

Back-patch of commit 751804998. The code known to be affected only
exists
in 9.6 and later, but we do have some stuff using int128 in 9.5, so
patch
back to 9.5.

Discussion: /messages/by-id/20171110185747.31519.28038@
wrigleys.postgresql.org

Does that solution not work for you?

It works for me. My REL_10_STABLE was out of date. A git pull fixed it.

#5Hadi Moshayedi
hadi@moshayedi.net
In reply to: Andres Freund (#2)
Re: [PATCH] Fix crash in int8_avg_combine().

On Sat, Nov 25, 2017 at 10:47 PM, Andres Freund <andres@citusdata.com>
wrote:

diff --git a/src/include/utils/memutils.h b/src/include/utils/memutils.h
index 869c59dc85..2dc59e89cd 100644
--- a/src/include/utils/memutils.h
+++ b/src/include/utils/memutils.h
@@ -189,7 +189,7 @@ extern MemoryContext SlabContextCreate(MemoryContext

parent,

* Few callers should be interested in this, but tuplesort/tuplestore

need

* to know it.
*/
-#define ALLOCSET_SEPARATE_THRESHOLD  8192
+#define ALLOCSET_SEPARATE_THRESHOLD  16384

Huh, what's that about in this context?

There is following static assert:

StaticAssertStmt(ALLOC_CHUNK_LIMIT == ALLOCSET_SEPARATE_THRESHOLD, ...)

and ALLOCK_CHUNK_LIMIT is defined as:

#define ALLOC_CHUNK_LIMIT (1 << (ALLOCSET_NUM_FREELISTS-1+ALLOC_MINBITS))

and there is this comment:

"CAUTION: ALLOC_MINBITS must be large enough so that 1<<ALLOC_MINBITS is at
least MAXALIGN, ..."

So I increased ALLOC_MINBITS which resulted in doubling
ALLOCSET_SEPARATE_THRESHOLD.

Thanks,
-- Hadi