Add palloc_aligned() to allow arbitrary power of 2 memory alignment

Started by David Rowleyabout 3 years ago18 messages
#1David Rowley
dgrowleyml@gmail.com
1 attachment(s)

Part of the work that Thomas mentions in [1]/messages/by-id/CA+hUKGK1X532hYqJ_MzFWt0n1zt8trz980D79WbjwnT-yYLZpg@mail.gmail.com, regarding Direct I/O,
has certain requirements that pointers must be page-aligned.

I've attached a patch which implements palloc_aligned() and
MemoryContextAllocAligned() which accept an 'alignto' parameter which
must be a power-of-2 value. The memory addresses returned by these
functions will be aligned by the requested alignment.

Primarily, this work is by Andres. I took what he had and cleaned it
up, fixed a few minor bugs then implemented repalloc() and
GetMemoryChunkSpace() functionality.

The way this works is that palloc_aligned() can be called for any of
the existing MemoryContext types. What we do is perform a normal
allocation request, but we add additional bytes to the size request to
allow the proper alignment of the pointer that we return. Since we
have no control over the alignment of the return value from the
allocation requests, we must adjust the pointer returned by the
allocation function to align it to the required alignment.

When an operation such as pfree() or repalloc() is performed on a
pointer retuned by palloc_aligned(), we can't go trying to pfree the
aligned pointer as this is not the pointer that was returned by the
allocation function. To make all this work, another MemoryChunk
struct exists directly before the aligned pointer which has the
MemoryContextMethodID set to MCTX_ALIGNED_REDIRECT_ID. These
"redirection" MemoryChunks have the "block offset" set to allow the
actual MemoryChunk of the original allocation to be found. We just
subtract the number of bytes stored in the block offset, which is just
the same as how we now find the owning AllocBlock from the MemoryChunk
in aset.c. Once we do that offset calculation, we can just pfree()
the original chunk.

The 'alignto' is stored in this "redirection" MemoryChunk so that
repalloc() knows what the original alignment request was so that it
the repalloc'd chunk can be aligned by that amount too.

In the attached patch, there are not yet any users of these new 2
functions. As mentioned above, Thomas is proposing some patches to
implement Direct I/O in [1]/messages/by-id/CA+hUKGK1X532hYqJ_MzFWt0n1zt8trz980D79WbjwnT-yYLZpg@mail.gmail.com which will use these functions. Because I
touched memory contexts last, it likely makes the most sense for me to
work on this portion of the patch.

Comments welcome.

Patch attached.

(I did rip out all the I/O specific portions from Andres' patch which
Thomas proposes as his 0002 patch. Thomas will need to rebase
(sorry)).

David

[1]: /messages/by-id/CA+hUKGK1X532hYqJ_MzFWt0n1zt8trz980D79WbjwnT-yYLZpg@mail.gmail.com

Attachments:

palloc_aligned.patchtext/plain; charset=US-ASCII; name=palloc_aligned.patchDownload
diff --git a/src/backend/utils/mmgr/Makefile b/src/backend/utils/mmgr/Makefile
index 3b4cfdbd52..dae3432c98 100644
--- a/src/backend/utils/mmgr/Makefile
+++ b/src/backend/utils/mmgr/Makefile
@@ -13,6 +13,7 @@ top_builddir = ../../../..
 include $(top_builddir)/src/Makefile.global
 
 OBJS = \
+	alignedalloc.o \
 	aset.o \
 	dsa.o \
 	freepage.o \
diff --git a/src/backend/utils/mmgr/alignedalloc.c b/src/backend/utils/mmgr/alignedalloc.c
new file mode 100644
index 0000000000..e581772758
--- /dev/null
+++ b/src/backend/utils/mmgr/alignedalloc.c
@@ -0,0 +1,93 @@
+/*-------------------------------------------------------------------------
+ *
+ * alignedalloc.c
+ *	  Allocator functions to implement palloc_aligned
+ *
+ * This is not a fully fledged MemoryContext type as there is no means to
+ * create a MemoryContext of this type.  The code here only serves to allow
+ * operations such as pfree() and repalloc() to work correctly on a memory
+ * chunk that was allocated by palloc_align().
+ *
+ * Portions Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/backend/utils/mmgr/alignedalloc.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "utils/memdebug.h"
+#include "utils/memutils_memorychunk.h"
+
+void
+AlignedAllocFree(void *pointer)
+{
+	MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
+	void *unaligned;
+
+	Assert(!MemoryChunkIsExternal(chunk));
+
+	unaligned = MemoryChunkGetBlock(chunk);
+
+	pfree(unaligned);
+}
+
+void *
+AlignedAllocRealloc(void *pointer, Size size)
+{
+	MemoryChunk	   *redirchunk = PointerGetMemoryChunk(pointer);
+	Size		alignto = MemoryChunkGetValue(redirchunk);
+	void	   *unaligned = MemoryChunkGetBlock(redirchunk);
+	MemoryChunk	   *chunk = PointerGetMemoryChunk(unaligned);
+	Size			old_size;
+	void		   *newptr;
+
+	/* sanity check this is a power of 2 value */
+	Assert((alignto & (alignto - 1)) == 0);
+
+	/*
+	 * Determine the size of the original allocation.  We can't determine this
+	 * exactly as GetMemoryChunkSpace() returns the total space used for the
+	 * allocation, which for contexts like aset includes rounding up to the
+	 * next power of 2.  However, this value is just used to memcpy() the old
+	 * data into the new allocation, so we only need to concern ourselves with
+	 * not reading beyond the end of the original allocation's memory.  The
+	 * drawback here is that we may copy more bytes than we need to, which
+	 * amounts only to wasted effort.
+	 */
+	old_size = GetMemoryChunkSpace(unaligned) -
+		((char *)pointer - (char *)chunk);
+
+	newptr = palloc_aligned(size, alignto, 0);
+
+	/*
+	 * We may memcpy beyond the end of the orignal allocation request size, so
+	 * we must mark the entire allocation as defined.
+	 */
+	VALGRIND_MAKE_MEM_DEFINED(pointer, old_size);
+	memcpy(newptr, pointer, Min(size, old_size));
+	pfree(unaligned);
+
+	return newptr;
+}
+
+MemoryContext
+AlignedAllocGetChunkContext(void *pointer)
+{
+	MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
+
+	Assert(!MemoryChunkIsExternal(chunk));
+
+	return GetMemoryChunkContext(MemoryChunkGetBlock(chunk));
+}
+
+Size
+AlignedGetChunkSpace(void *pointer)
+{
+	MemoryChunk	   *redirchunk = PointerGetMemoryChunk(pointer);
+	void	   *unaligned = MemoryChunkGetBlock(redirchunk);
+
+	return GetMemoryChunkSpace(unaligned);
+}
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index f526ca82c1..cd2c43efb3 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -30,6 +30,7 @@
 #include "utils/memdebug.h"
 #include "utils/memutils.h"
 #include "utils/memutils_internal.h"
+#include "utils/memutils_memorychunk.h"
 
 
 static void BogusFree(void *pointer);
@@ -84,6 +85,21 @@ static const MemoryContextMethods mcxt_methods[] = {
 	[MCTX_SLAB_ID].check = SlabCheck,
 #endif
 
+	/* alignedalloc.c */
+	[MCTX_ALIGNED_REDIRECT_ID].alloc = NULL,	/* not required */
+	[MCTX_ALIGNED_REDIRECT_ID].free_p = AlignedAllocFree,
+	[MCTX_ALIGNED_REDIRECT_ID].realloc = AlignedAllocRealloc,
+	[MCTX_ALIGNED_REDIRECT_ID].reset = NULL,	/* not required */
+	[MCTX_ALIGNED_REDIRECT_ID].delete_context = NULL,	/* not required */
+	[MCTX_ALIGNED_REDIRECT_ID].get_chunk_context = AlignedAllocGetChunkContext,
+	[MCTX_ALIGNED_REDIRECT_ID].get_chunk_space = AlignedGetChunkSpace,
+	[MCTX_ALIGNED_REDIRECT_ID].is_empty = NULL, /* not required */
+	[MCTX_ALIGNED_REDIRECT_ID].stats = NULL,	/* not required */
+#ifdef MEMORY_CONTEXT_CHECKING
+	[MCTX_ALIGNED_REDIRECT_ID].check = NULL,	/* no required */
+#endif
+
+
 	/*
 	 * Unused (as yet) IDs should have dummy entries here.  This allows us to
 	 * fail cleanly if a bogus pointer is passed to pfree or the like.  It
@@ -110,11 +126,6 @@ static const MemoryContextMethods mcxt_methods[] = {
 	[MCTX_UNUSED4_ID].realloc = BogusRealloc,
 	[MCTX_UNUSED4_ID].get_chunk_context = BogusGetChunkContext,
 	[MCTX_UNUSED4_ID].get_chunk_space = BogusGetChunkSpace,
-
-	[MCTX_UNUSED5_ID].free_p = BogusFree,
-	[MCTX_UNUSED5_ID].realloc = BogusRealloc,
-	[MCTX_UNUSED5_ID].get_chunk_context = BogusGetChunkContext,
-	[MCTX_UNUSED5_ID].get_chunk_space = BogusGetChunkSpace,
 };
 
 /*
@@ -1298,6 +1309,92 @@ palloc_extended(Size size, int flags)
 	return ret;
 }
 
+/*
+ * MemoryContextAllocAligned
+ *		Allocate 'size' bytes of memory in 'context' aligned to 'alignto'
+ *		bytes.
+ *
+ * 'flags' may be 0 or set the same as MemoryContextAllocExtended().
+ * 'alignto' must be a power of 2.
+ */
+void *
+MemoryContextAllocAligned(MemoryContext context,
+						  Size size, Size alignto, int flags)
+{
+	Size		alloc_size;
+	void	   *unaligned;
+	void	   *aligned;
+
+	/* wouldn't make much sense to waste that much space */
+	Assert(alignto < (128 * 1024 * 1024));
+
+	/* ensure alignto is a power of 2 */
+	Assert((alignto & (alignto - 1)) == 0);
+
+	/*
+	 * If the alignment requirements are less than what we already guarantee
+	 * then just use the standard allocation function.
+	 */
+	if (unlikely(alignto <= MAXIMUM_ALIGNOF))
+		return palloc_extended(size, flags);
+
+	/*
+	 * We implement aligned pointers by simply allocating enough memory for
+	 * the requested size plus the alignment and an additional MemoryChunk.
+	 * This additional MemoryChunk is required for operations such as pfree
+	 * when used on the pointer returned by this function.  We use this
+	 * "redirection" MemoryChunk in order to find the pointer to the memory
+	 * that was returned by the MemoryContextAllocExtended call below. We do
+	 * that by "borrowing" the block offset field and instead of using that to
+	 * find the offset into the owning block, we use it to find the original
+	 * allocated address.
+	 *
+	 * Here we must allocate enough extra memory so that we can still align
+	 * the pointer returned by MemoryContextAllocExtended and also have enough
+	 * space for the redirection MemoryChunk.
+	 */
+	alloc_size = size + alignto + sizeof(MemoryChunk);
+
+	/* perform the actual allocation */
+	unaligned = MemoryContextAllocExtended(context, alloc_size, flags);
+
+	/* set the aligned pointer */
+	aligned = (void *) TYPEALIGN(alignto, (char *) unaligned + sizeof(MemoryChunk));
+
+	/*
+	 * We set the redirect MemoryChunk so that the block offset calculation is
+	 * used to point back to the 'unaligned' allocated chunk.  This allows us
+	 * to use MemoryChunkGetBlock() to find the unaligned chunk when we need
+	 * to perform operations such as pfree() or repalloc().
+	 *
+	 * We store 'alignto' in the MemoryChunk's 'value' so that we know what
+	 * the alignment was set to should we ever be asked to realloc this
+	 * pointer.
+	 */
+	MemoryChunkSetHdrMask(PointerGetMemoryChunk(aligned), unaligned, alignto,
+						  MCTX_ALIGNED_REDIRECT_ID);
+
+	/* XXX: should we adjust valgrind state here? */
+
+	/* double check we produced a correctly aligned pointer */
+	Assert((char *) TYPEALIGN(alignto, aligned) == aligned);
+
+	return aligned;
+}
+
+/*
+ * palloc_aligned
+ *		Allocate 'size' bytes returning a pointer that's aligned to the
+ *		'alignto' boundary.
+ *
+ * 'alignto' must be a power of 2.
+ */
+void *
+palloc_aligned(Size size, Size alignto, int flags)
+{
+	return MemoryContextAllocAligned(CurrentMemoryContext, size, alignto, flags);
+}
+
 /*
  * pfree
  *		Release an allocated chunk.
@@ -1306,11 +1403,16 @@ void
 pfree(void *pointer)
 {
 #ifdef USE_VALGRIND
+	MemoryContextMethodID method = GetMemoryChunkMethodID(pointer);
 	MemoryContext context = GetMemoryChunkContext(pointer);
 #endif
 
 	MCXT_METHOD(pointer, free_p) (pointer);
-	VALGRIND_MEMPOOL_FREE(context, pointer);
+
+#ifdef USE_VALGRIND
+	if (method != MCTX_ALIGNED_REDIRECT_ID)
+		VALGRIND_MEMPOOL_FREE(context, pointer);
+#endif
 }
 
 /*
diff --git a/src/backend/utils/mmgr/meson.build b/src/backend/utils/mmgr/meson.build
index 641bb181ba..7cf4d6dcc8 100644
--- a/src/backend/utils/mmgr/meson.build
+++ b/src/backend/utils/mmgr/meson.build
@@ -1,4 +1,5 @@
 backend_sources += files(
+  'alignedalloc.c',
   'aset.c',
   'dsa.c',
   'freepage.c',
diff --git a/src/include/utils/memutils_internal.h b/src/include/utils/memutils_internal.h
index bc2cbdd506..450bcba3ed 100644
--- a/src/include/utils/memutils_internal.h
+++ b/src/include/utils/memutils_internal.h
@@ -70,6 +70,15 @@ extern void SlabStats(MemoryContext context,
 extern void SlabCheck(MemoryContext context);
 #endif
 
+/*
+ * These functions support the implementation of palloc_aligned() and are not
+ * part of a fully-fledged MemoryContext type.
+ */
+extern void AlignedAllocFree(void *pointer);
+extern void *AlignedAllocRealloc(void *pointer, Size size);
+extern MemoryContext AlignedAllocGetChunkContext(void *pointer);
+extern Size AlignedGetChunkSpace(void *pointer);
+
 /*
  * MemoryContextMethodID
  *		A unique identifier for each MemoryContext implementation which
@@ -92,8 +101,8 @@ typedef enum MemoryContextMethodID
 	MCTX_ASET_ID,
 	MCTX_GENERATION_ID,
 	MCTX_SLAB_ID,
-	MCTX_UNUSED4_ID,			/* available */
-	MCTX_UNUSED5_ID				/* 111 occurs in wipe_mem'd memory */
+	MCTX_ALIGNED_REDIRECT_ID,
+	MCTX_UNUSED4_ID				/* 111 occurs in wipe_mem'd memory */
 } MemoryContextMethodID;
 
 /*
diff --git a/src/include/utils/memutils_memorychunk.h b/src/include/utils/memutils_memorychunk.h
index 2eefc138e3..38702efc58 100644
--- a/src/include/utils/memutils_memorychunk.h
+++ b/src/include/utils/memutils_memorychunk.h
@@ -156,7 +156,7 @@ MemoryChunkSetHdrMask(MemoryChunk *chunk, void *block,
 {
 	Size		blockoffset = (char *) chunk - (char *) block;
 
-	Assert((char *) chunk > (char *) block);
+	Assert((char *) chunk >= (char *) block);
 	Assert(blockoffset <= MEMORYCHUNK_MAX_BLOCKOFFSET);
 	Assert(value <= MEMORYCHUNK_MAX_VALUE);
 	Assert((int) methodid <= MEMORY_CONTEXT_METHODID_MASK);
diff --git a/src/include/utils/palloc.h b/src/include/utils/palloc.h
index 8eee0e2938..989ddf18ef 100644
--- a/src/include/utils/palloc.h
+++ b/src/include/utils/palloc.h
@@ -73,10 +73,13 @@ extern void *MemoryContextAllocZero(MemoryContext context, Size size);
 extern void *MemoryContextAllocZeroAligned(MemoryContext context, Size size);
 extern void *MemoryContextAllocExtended(MemoryContext context,
 										Size size, int flags);
+extern void *MemoryContextAllocAligned(MemoryContext context,
+									   Size size, Size alignto, int flags);
 
 extern void *palloc(Size size);
 extern void *palloc0(Size size);
 extern void *palloc_extended(Size size, int flags);
+extern void *palloc_aligned(Size size, Size alignto, int flags);
 extern pg_nodiscard void *repalloc(void *pointer, Size size);
 extern pg_nodiscard void *repalloc_extended(void *pointer,
 											Size size, int flags);
#2Maxim Orlov
orlovmg@gmail.com
In reply to: David Rowley (#1)
2 attachment(s)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

Hi!

Part of the work that Thomas mentions in [1], regarding Direct I/O,

has certain requirements that pointers must be page-aligned.

I've attached a patch which implements palloc_aligned() and
MemoryContextAllocAligned() ...

I've done a quick look and the patch is looks good to me.
Let's add tests for these functions, should we? If you think this is an
overkill, feel free to trim tests for your taste.

--
Best regards,
Maxim Orlov.

Attachments:

v1-0002-Add-tests-for-palloc_aligned.patchapplication/octet-stream; name=v1-0002-Add-tests-for-palloc_aligned.patchDownload
From 233fafb93efe2daf8e94178e0b51c47ee776f73c Mon Sep 17 00:00:00 2001
From: Maxim Orlov <orlovmg@gmail.com>
Date: Thu, 3 Nov 2022 18:08:35 +0300
Subject: [PATCH v1 2/2] Add tests for palloc_aligned

---
 src/test/regress/expected/misc.out |  14 ++++
 src/test/regress/regress.c         | 101 +++++++++++++++++++++++++++++
 src/test/regress/sql/misc.sql      |  11 ++++
 3 files changed, 126 insertions(+)

diff --git a/src/test/regress/expected/misc.out b/src/test/regress/expected/misc.out
index 6e816c57f1..3ee36d9d68 100644
--- a/src/test/regress/expected/misc.out
+++ b/src/test/regress/expected/misc.out
@@ -396,3 +396,17 @@ SELECT *, (equipment(CAST((h.*) AS hobbies_r))).name FROM hobbies_r h;
 --
 -- rewrite rules
 --
+--
+-- Test aligned memory allocation
+--
+CREATE FUNCTION test_palloc_aligned()
+   RETURNS void
+   AS :'regresslib'
+   LANGUAGE C STRICT;
+SELECT test_palloc_aligned();
+ test_palloc_aligned 
+---------------------
+ 
+(1 row)
+
+DROP FUNCTION test_palloc_aligned();
diff --git a/src/test/regress/regress.c b/src/test/regress/regress.c
index 548afb4438..499f6e3ac2 100644
--- a/src/test/regress/regress.c
+++ b/src/test/regress/regress.c
@@ -44,6 +44,7 @@
 #include "utils/geo_decls.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/palloc.h"
 #include "utils/rel.h"
 #include "utils/typcache.h"
 
@@ -1257,3 +1258,103 @@ get_columns_length(PG_FUNCTION_ARGS)
 
 	PG_RETURN_INT32(column_offset);
 }
+
+static void
+test_palloc_aligned_validate_and_free(void *p, Size size, Size alignto,
+									  bool check_zeroed)
+{
+	char   *itr;
+	Size	i;
+
+	if (p == NULL)
+		elog(ERROR, "could not allocate %zu byte(s)", size);
+
+	if ((uintptr_t) p % alignto != 0)
+		elog(ERROR, "memory block are not aligned to %zu", alignto);
+
+	if (check_zeroed)
+	{
+		for (i = 0, itr = p; i < size; ++i, ++itr)
+			if (*itr != 0)
+				elog(ERROR, "memory block are not zeroed");
+	}
+
+	pfree(p);
+}
+
+static void
+test_palloc_aligned_alloc_and_test(Size size, Size alignto, int flags)
+{
+	void   *p;
+
+	p = palloc_aligned(size, alignto, flags);
+	test_palloc_aligned_validate_and_free(p, size, alignto,
+										  (flags & MCXT_ALLOC_ZERO) != 0);
+
+	p = MemoryContextAllocAligned(CurrentMemoryContext, size, alignto, flags);
+	test_palloc_aligned_validate_and_free(p, size, alignto,
+										  (flags & MCXT_ALLOC_ZERO) != 0);
+}
+
+/*
+ * Test aligned memory allocation.
+ */
+PG_FUNCTION_INFO_V1(test_palloc_aligned);
+Datum
+test_palloc_aligned(PG_FUNCTION_ARGS)
+{
+	Size	alignments[] = {2, 4, 8, 16, 32},
+			sizes[] = {1, 100, 1000},
+			i,
+			j,
+			alignment,
+			n;
+	void   *p;
+	int		flags;
+
+	for (i = 0; i < lengthof(alignments); ++i)
+	{
+		alignment = alignments[i];
+		/*
+		 * This behavior is implementation defined, but we should not fail here.
+		 */
+		p = palloc_aligned(0, alignment, MCXT_ALLOC_NO_OOM);
+		if (p != NULL)
+			pfree(p);
+
+		n = 0;
+		p = palloc_aligned(0, alignment, MCXT_ALLOC_NO_OOM | MCXT_ALLOC_ZERO);
+		if (p != NULL)
+			pfree(p);
+
+		p = palloc_aligned(0, alignment, MCXT_ALLOC_NO_OOM | MCXT_ALLOC_ZERO |
+										 MCXT_ALLOC_HUGE);
+		if (p != NULL)
+			pfree(p);
+
+		/*
+		 * "Normal" cases.
+		 */
+		for (j = 0; j < lengthof(sizes); ++j)
+		{
+			n = sizes[j];
+			/* No flags */
+			flags = 0;
+			test_palloc_aligned_alloc_and_test(n, alignment, flags);
+
+			/* MCXT_ALLOC_NO_OOM */
+			flags |= MCXT_ALLOC_NO_OOM;
+			test_palloc_aligned_alloc_and_test(n, alignment, flags);
+
+			/* MCXT_ALLOC_NO_OOM | MCXT_ALLOC_ZERO */
+			flags |= MCXT_ALLOC_ZERO;
+			test_palloc_aligned_alloc_and_test(n, alignment, flags);
+
+			/* MCXT_ALLOC_NO_OOM | MCXT_ALLOC_ZERO | MCXT_ALLOC_HUGE */
+			flags |= MCXT_ALLOC_HUGE;
+			test_palloc_aligned_alloc_and_test(n, alignment, flags);
+		}
+	}
+
+	PG_RETURN_VOID();
+}
diff --git a/src/test/regress/sql/misc.sql b/src/test/regress/sql/misc.sql
index 165a2e175f..43f3bb3476 100644
--- a/src/test/regress/sql/misc.sql
+++ b/src/test/regress/sql/misc.sql
@@ -273,3 +273,14 @@ SELECT *, (equipment(CAST((h.*) AS hobbies_r))).name FROM hobbies_r h;
 --
 -- rewrite rules
 --
+
+--
+-- Test aligned memory allocation
+--
+CREATE FUNCTION test_palloc_aligned()
+   RETURNS void
+   AS :'regresslib'
+   LANGUAGE C STRICT;
+
+SELECT test_palloc_aligned();
+DROP FUNCTION test_palloc_aligned();
-- 
2.37.0 (Apple Git-136)

v1-0001-palloc_aligned.patchapplication/octet-stream; name=v1-0001-palloc_aligned.patchDownload
From 2c59401a617c2e8f0fd07d4b02df45c46ce8f83f Mon Sep 17 00:00:00 2001
From: Maxim Orlov <orlovmg@gmail.com>
Date: Thu, 3 Nov 2022 15:33:50 +0300
Subject: [PATCH v1 1/2] palloc_aligned

---
 src/backend/utils/mmgr/Makefile          |   1 +
 src/backend/utils/mmgr/alignedalloc.c    |  93 ++++++++++++++++++
 src/backend/utils/mmgr/mcxt.c            | 114 +++++++++++++++++++++--
 src/backend/utils/mmgr/meson.build       |   1 +
 src/include/utils/memutils_internal.h    |  13 ++-
 src/include/utils/memutils_memorychunk.h |   2 +-
 src/include/utils/palloc.h               |   3 +
 7 files changed, 218 insertions(+), 9 deletions(-)
 create mode 100644 src/backend/utils/mmgr/alignedalloc.c

diff --git a/src/backend/utils/mmgr/Makefile b/src/backend/utils/mmgr/Makefile
index 3b4cfdbd52..dae3432c98 100644
--- a/src/backend/utils/mmgr/Makefile
+++ b/src/backend/utils/mmgr/Makefile
@@ -13,6 +13,7 @@ top_builddir = ../../../..
 include $(top_builddir)/src/Makefile.global
 
 OBJS = \
+	alignedalloc.o \
 	aset.o \
 	dsa.o \
 	freepage.o \
diff --git a/src/backend/utils/mmgr/alignedalloc.c b/src/backend/utils/mmgr/alignedalloc.c
new file mode 100644
index 0000000000..e581772758
--- /dev/null
+++ b/src/backend/utils/mmgr/alignedalloc.c
@@ -0,0 +1,93 @@
+/*-------------------------------------------------------------------------
+ *
+ * alignedalloc.c
+ *	  Allocator functions to implement palloc_aligned
+ *
+ * This is not a fully fledged MemoryContext type as there is no means to
+ * create a MemoryContext of this type.  The code here only serves to allow
+ * operations such as pfree() and repalloc() to work correctly on a memory
+ * chunk that was allocated by palloc_align().
+ *
+ * Portions Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/backend/utils/mmgr/alignedalloc.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "utils/memdebug.h"
+#include "utils/memutils_memorychunk.h"
+
+void
+AlignedAllocFree(void *pointer)
+{
+	MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
+	void *unaligned;
+
+	Assert(!MemoryChunkIsExternal(chunk));
+
+	unaligned = MemoryChunkGetBlock(chunk);
+
+	pfree(unaligned);
+}
+
+void *
+AlignedAllocRealloc(void *pointer, Size size)
+{
+	MemoryChunk	   *redirchunk = PointerGetMemoryChunk(pointer);
+	Size		alignto = MemoryChunkGetValue(redirchunk);
+	void	   *unaligned = MemoryChunkGetBlock(redirchunk);
+	MemoryChunk	   *chunk = PointerGetMemoryChunk(unaligned);
+	Size			old_size;
+	void		   *newptr;
+
+	/* sanity check this is a power of 2 value */
+	Assert((alignto & (alignto - 1)) == 0);
+
+	/*
+	 * Determine the size of the original allocation.  We can't determine this
+	 * exactly as GetMemoryChunkSpace() returns the total space used for the
+	 * allocation, which for contexts like aset includes rounding up to the
+	 * next power of 2.  However, this value is just used to memcpy() the old
+	 * data into the new allocation, so we only need to concern ourselves with
+	 * not reading beyond the end of the original allocation's memory.  The
+	 * drawback here is that we may copy more bytes than we need to, which
+	 * amounts only to wasted effort.
+	 */
+	old_size = GetMemoryChunkSpace(unaligned) -
+		((char *)pointer - (char *)chunk);
+
+	newptr = palloc_aligned(size, alignto, 0);
+
+	/*
+	 * We may memcpy beyond the end of the orignal allocation request size, so
+	 * we must mark the entire allocation as defined.
+	 */
+	VALGRIND_MAKE_MEM_DEFINED(pointer, old_size);
+	memcpy(newptr, pointer, Min(size, old_size));
+	pfree(unaligned);
+
+	return newptr;
+}
+
+MemoryContext
+AlignedAllocGetChunkContext(void *pointer)
+{
+	MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
+
+	Assert(!MemoryChunkIsExternal(chunk));
+
+	return GetMemoryChunkContext(MemoryChunkGetBlock(chunk));
+}
+
+Size
+AlignedGetChunkSpace(void *pointer)
+{
+	MemoryChunk	   *redirchunk = PointerGetMemoryChunk(pointer);
+	void	   *unaligned = MemoryChunkGetBlock(redirchunk);
+
+	return GetMemoryChunkSpace(unaligned);
+}
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index f526ca82c1..cd2c43efb3 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -30,6 +30,7 @@
 #include "utils/memdebug.h"
 #include "utils/memutils.h"
 #include "utils/memutils_internal.h"
+#include "utils/memutils_memorychunk.h"
 
 
 static void BogusFree(void *pointer);
@@ -84,6 +85,21 @@ static const MemoryContextMethods mcxt_methods[] = {
 	[MCTX_SLAB_ID].check = SlabCheck,
 #endif
 
+	/* alignedalloc.c */
+	[MCTX_ALIGNED_REDIRECT_ID].alloc = NULL,	/* not required */
+	[MCTX_ALIGNED_REDIRECT_ID].free_p = AlignedAllocFree,
+	[MCTX_ALIGNED_REDIRECT_ID].realloc = AlignedAllocRealloc,
+	[MCTX_ALIGNED_REDIRECT_ID].reset = NULL,	/* not required */
+	[MCTX_ALIGNED_REDIRECT_ID].delete_context = NULL,	/* not required */
+	[MCTX_ALIGNED_REDIRECT_ID].get_chunk_context = AlignedAllocGetChunkContext,
+	[MCTX_ALIGNED_REDIRECT_ID].get_chunk_space = AlignedGetChunkSpace,
+	[MCTX_ALIGNED_REDIRECT_ID].is_empty = NULL, /* not required */
+	[MCTX_ALIGNED_REDIRECT_ID].stats = NULL,	/* not required */
+#ifdef MEMORY_CONTEXT_CHECKING
+	[MCTX_ALIGNED_REDIRECT_ID].check = NULL,	/* no required */
+#endif
+
+
 	/*
 	 * Unused (as yet) IDs should have dummy entries here.  This allows us to
 	 * fail cleanly if a bogus pointer is passed to pfree or the like.  It
@@ -110,11 +126,6 @@ static const MemoryContextMethods mcxt_methods[] = {
 	[MCTX_UNUSED4_ID].realloc = BogusRealloc,
 	[MCTX_UNUSED4_ID].get_chunk_context = BogusGetChunkContext,
 	[MCTX_UNUSED4_ID].get_chunk_space = BogusGetChunkSpace,
-
-	[MCTX_UNUSED5_ID].free_p = BogusFree,
-	[MCTX_UNUSED5_ID].realloc = BogusRealloc,
-	[MCTX_UNUSED5_ID].get_chunk_context = BogusGetChunkContext,
-	[MCTX_UNUSED5_ID].get_chunk_space = BogusGetChunkSpace,
 };
 
 /*
@@ -1298,6 +1309,92 @@ palloc_extended(Size size, int flags)
 	return ret;
 }
 
+/*
+ * MemoryContextAllocAligned
+ *		Allocate 'size' bytes of memory in 'context' aligned to 'alignto'
+ *		bytes.
+ *
+ * 'flags' may be 0 or set the same as MemoryContextAllocExtended().
+ * 'alignto' must be a power of 2.
+ */
+void *
+MemoryContextAllocAligned(MemoryContext context,
+						  Size size, Size alignto, int flags)
+{
+	Size		alloc_size;
+	void	   *unaligned;
+	void	   *aligned;
+
+	/* wouldn't make much sense to waste that much space */
+	Assert(alignto < (128 * 1024 * 1024));
+
+	/* ensure alignto is a power of 2 */
+	Assert((alignto & (alignto - 1)) == 0);
+
+	/*
+	 * If the alignment requirements are less than what we already guarantee
+	 * then just use the standard allocation function.
+	 */
+	if (unlikely(alignto <= MAXIMUM_ALIGNOF))
+		return palloc_extended(size, flags);
+
+	/*
+	 * We implement aligned pointers by simply allocating enough memory for
+	 * the requested size plus the alignment and an additional MemoryChunk.
+	 * This additional MemoryChunk is required for operations such as pfree
+	 * when used on the pointer returned by this function.  We use this
+	 * "redirection" MemoryChunk in order to find the pointer to the memory
+	 * that was returned by the MemoryContextAllocExtended call below. We do
+	 * that by "borrowing" the block offset field and instead of using that to
+	 * find the offset into the owning block, we use it to find the original
+	 * allocated address.
+	 *
+	 * Here we must allocate enough extra memory so that we can still align
+	 * the pointer returned by MemoryContextAllocExtended and also have enough
+	 * space for the redirection MemoryChunk.
+	 */
+	alloc_size = size + alignto + sizeof(MemoryChunk);
+
+	/* perform the actual allocation */
+	unaligned = MemoryContextAllocExtended(context, alloc_size, flags);
+
+	/* set the aligned pointer */
+	aligned = (void *) TYPEALIGN(alignto, (char *) unaligned + sizeof(MemoryChunk));
+
+	/*
+	 * We set the redirect MemoryChunk so that the block offset calculation is
+	 * used to point back to the 'unaligned' allocated chunk.  This allows us
+	 * to use MemoryChunkGetBlock() to find the unaligned chunk when we need
+	 * to perform operations such as pfree() or repalloc().
+	 *
+	 * We store 'alignto' in the MemoryChunk's 'value' so that we know what
+	 * the alignment was set to should we ever be asked to realloc this
+	 * pointer.
+	 */
+	MemoryChunkSetHdrMask(PointerGetMemoryChunk(aligned), unaligned, alignto,
+						  MCTX_ALIGNED_REDIRECT_ID);
+
+	/* XXX: should we adjust valgrind state here? */
+
+	/* double check we produced a correctly aligned pointer */
+	Assert((char *) TYPEALIGN(alignto, aligned) == aligned);
+
+	return aligned;
+}
+
+/*
+ * palloc_aligned
+ *		Allocate 'size' bytes returning a pointer that's aligned to the
+ *		'alignto' boundary.
+ *
+ * 'alignto' must be a power of 2.
+ */
+void *
+palloc_aligned(Size size, Size alignto, int flags)
+{
+	return MemoryContextAllocAligned(CurrentMemoryContext, size, alignto, flags);
+}
+
 /*
  * pfree
  *		Release an allocated chunk.
@@ -1306,11 +1403,16 @@ void
 pfree(void *pointer)
 {
 #ifdef USE_VALGRIND
+	MemoryContextMethodID method = GetMemoryChunkMethodID(pointer);
 	MemoryContext context = GetMemoryChunkContext(pointer);
 #endif
 
 	MCXT_METHOD(pointer, free_p) (pointer);
-	VALGRIND_MEMPOOL_FREE(context, pointer);
+
+#ifdef USE_VALGRIND
+	if (method != MCTX_ALIGNED_REDIRECT_ID)
+		VALGRIND_MEMPOOL_FREE(context, pointer);
+#endif
 }
 
 /*
diff --git a/src/backend/utils/mmgr/meson.build b/src/backend/utils/mmgr/meson.build
index 641bb181ba..7cf4d6dcc8 100644
--- a/src/backend/utils/mmgr/meson.build
+++ b/src/backend/utils/mmgr/meson.build
@@ -1,4 +1,5 @@
 backend_sources += files(
+  'alignedalloc.c',
   'aset.c',
   'dsa.c',
   'freepage.c',
diff --git a/src/include/utils/memutils_internal.h b/src/include/utils/memutils_internal.h
index bc2cbdd506..450bcba3ed 100644
--- a/src/include/utils/memutils_internal.h
+++ b/src/include/utils/memutils_internal.h
@@ -70,6 +70,15 @@ extern void SlabStats(MemoryContext context,
 extern void SlabCheck(MemoryContext context);
 #endif
 
+/*
+ * These functions support the implementation of palloc_aligned() and are not
+ * part of a fully-fledged MemoryContext type.
+ */
+extern void AlignedAllocFree(void *pointer);
+extern void *AlignedAllocRealloc(void *pointer, Size size);
+extern MemoryContext AlignedAllocGetChunkContext(void *pointer);
+extern Size AlignedGetChunkSpace(void *pointer);
+
 /*
  * MemoryContextMethodID
  *		A unique identifier for each MemoryContext implementation which
@@ -92,8 +101,8 @@ typedef enum MemoryContextMethodID
 	MCTX_ASET_ID,
 	MCTX_GENERATION_ID,
 	MCTX_SLAB_ID,
-	MCTX_UNUSED4_ID,			/* available */
-	MCTX_UNUSED5_ID				/* 111 occurs in wipe_mem'd memory */
+	MCTX_ALIGNED_REDIRECT_ID,
+	MCTX_UNUSED4_ID				/* 111 occurs in wipe_mem'd memory */
 } MemoryContextMethodID;
 
 /*
diff --git a/src/include/utils/memutils_memorychunk.h b/src/include/utils/memutils_memorychunk.h
index 2eefc138e3..38702efc58 100644
--- a/src/include/utils/memutils_memorychunk.h
+++ b/src/include/utils/memutils_memorychunk.h
@@ -156,7 +156,7 @@ MemoryChunkSetHdrMask(MemoryChunk *chunk, void *block,
 {
 	Size		blockoffset = (char *) chunk - (char *) block;
 
-	Assert((char *) chunk > (char *) block);
+	Assert((char *) chunk >= (char *) block);
 	Assert(blockoffset <= MEMORYCHUNK_MAX_BLOCKOFFSET);
 	Assert(value <= MEMORYCHUNK_MAX_VALUE);
 	Assert((int) methodid <= MEMORY_CONTEXT_METHODID_MASK);
diff --git a/src/include/utils/palloc.h b/src/include/utils/palloc.h
index 8eee0e2938..989ddf18ef 100644
--- a/src/include/utils/palloc.h
+++ b/src/include/utils/palloc.h
@@ -73,10 +73,13 @@ extern void *MemoryContextAllocZero(MemoryContext context, Size size);
 extern void *MemoryContextAllocZeroAligned(MemoryContext context, Size size);
 extern void *MemoryContextAllocExtended(MemoryContext context,
 										Size size, int flags);
+extern void *MemoryContextAllocAligned(MemoryContext context,
+									   Size size, Size alignto, int flags);
 
 extern void *palloc(Size size);
 extern void *palloc0(Size size);
 extern void *palloc_extended(Size size, int flags);
+extern void *palloc_aligned(Size size, Size alignto, int flags);
 extern pg_nodiscard void *repalloc(void *pointer, Size size);
 extern pg_nodiscard void *repalloc_extended(void *pointer,
 											Size size, int flags);
-- 
2.37.0 (Apple Git-136)

#3Andres Freund
andres@anarazel.de
In reply to: David Rowley (#1)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

Hi,

On 2022-11-02 00:28:46 +1300, David Rowley wrote:

Part of the work that Thomas mentions in [1], regarding Direct I/O,
diff --git a/src/backend/utils/mmgr/alignedalloc.c b/src/backend/utils/mmgr/alignedalloc.c
new file mode 100644
index 0000000000..e581772758
--- /dev/null
+++ b/src/backend/utils/mmgr/alignedalloc.c
@@ -0,0 +1,93 @@
+/*-------------------------------------------------------------------------
+ *
+ * alignedalloc.c
+ *	  Allocator functions to implement palloc_aligned

FWIW, to me this is really more part of mcxt.c than its own
allocator... Particularly because MemoryContextAllocAligned() et al are
implemented there.

+void *
+AlignedAllocRealloc(void *pointer, Size size)

I doubtthere's ever a need to realloc such a pointer? Perhaps we could just
elog(ERROR)?

+/*
+ * MemoryContextAllocAligned
+ *		Allocate 'size' bytes of memory in 'context' aligned to 'alignto'
+ *		bytes.
+ *
+ * 'flags' may be 0 or set the same as MemoryContextAllocExtended().
+ * 'alignto' must be a power of 2.
+ */
+void *
+MemoryContextAllocAligned(MemoryContext context,
+						  Size size, Size alignto, int flags)
+{
+	Size		alloc_size;
+	void	   *unaligned;
+	void	   *aligned;
+
+	/* wouldn't make much sense to waste that much space */
+	Assert(alignto < (128 * 1024 * 1024));
+
+	/* ensure alignto is a power of 2 */
+	Assert((alignto & (alignto - 1)) == 0);

Hm, not that I can see a case for ever not using a power of two
alignment... There's not really a "need" for the restriction, right? Perhaps
we should note that?

+	/*
+	 * We implement aligned pointers by simply allocating enough memory for
+	 * the requested size plus the alignment and an additional MemoryChunk.
+	 * This additional MemoryChunk is required for operations such as pfree
+	 * when used on the pointer returned by this function.  We use this
+	 * "redirection" MemoryChunk in order to find the pointer to the memory
+	 * that was returned by the MemoryContextAllocExtended call below. We do
+	 * that by "borrowing" the block offset field and instead of using that to
+	 * find the offset into the owning block, we use it to find the original
+	 * allocated address.
+	 *
+	 * Here we must allocate enough extra memory so that we can still align
+	 * the pointer returned by MemoryContextAllocExtended and also have enough
+	 * space for the redirection MemoryChunk.
+	 */
+	alloc_size = size + alignto + sizeof(MemoryChunk);
+
+	/* perform the actual allocation */
+	unaligned = MemoryContextAllocExtended(context, alloc_size, flags);

Should we handle the case where we get a suitably aligned pointer from
MemoryContextAllocExtended() differently?

+ /* XXX: should we adjust valgrind state here? */

Probably still need to do this... Kinda hard to get right without the code
getting exercised. Wonder if there's some minimal case we could actually use
it for?

Thanks,

Andres

#4David Rowley
dgrowleyml@gmail.com
In reply to: Andres Freund (#3)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

Thanks for having a look.

On Tue, 8 Nov 2022 at 05:24, Andres Freund <andres@anarazel.de> wrote:

FWIW, to me this is really more part of mcxt.c than its own
allocator... Particularly because MemoryContextAllocAligned() et al are
implemented there.

I'm on the fence about this one. I thought it was nice that we have a
file per consumed MemoryContextMethodID. The thing that caused me to
add alignedalloc.c was the various other comments in the declaration
of mcxt_methods[] that mention where to find each of the methods being
assigned in that array (e.g /* aset.c */). I can change it back if
you feel strongly. I just don't.

I doubtthere's ever a need to realloc such a pointer? Perhaps we could just
elog(ERROR)?

Are you maybe locked into just thinking about your current planned use
case that we want to allocate BLCKSZ bytes in each case? It does not
seem impossible to me that someone will want something more than an
8-byte alignment and also might want to enlarge the allocation at some
point. I thought it might be more dangerous not to implement repalloc.
It might not be clear to someone using palloc_aligned() that there's
no code path that can call repalloc on the returned pointer.

Hm, not that I can see a case for ever not using a power of two
alignment... There's not really a "need" for the restriction, right? Perhaps
we should note that?

TYPEALIGN() will not work correctly unless the alignment is a power of
2. We could modify it to, but that would require doing some modular
maths instead of bitmasking. That seems pretty horrible when the macro
is given a value that's not constant at compile time as we'd end up
with a (slow) divide in the code path. I think the restriction is a
good idea. I imagine there will never be any need to align to anything
that's not a power of 2.

Should we handle the case where we get a suitably aligned pointer from
MemoryContextAllocExtended() differently?

Maybe it would be worth the extra check. I'm trying to imagine future
use cases. Maybe if someone wanted to ensure that we're aligned to
CPU cache line boundaries then the chances of the pointer already
being aligned to 64 bytes is decent enough. The problem is it that
it's too late to save any memory, it just saves a bit of boxing and
unboxing of the redirect headers.

+ /* XXX: should we adjust valgrind state here? */

Probably still need to do this... Kinda hard to get right without the code
getting exercised.

Yeah, that comment kept catching my eye. I agree. That should be
handled correctly. I'll work on that.

Wonder if there's some minimal case we could actually use
it for?

Is there anything we could align to CPU cacheline size that would
speed something up?

David

#5David Rowley
dgrowleyml@gmail.com
In reply to: Maxim Orlov (#2)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

On Fri, 4 Nov 2022 at 04:20, Maxim Orlov <orlovmg@gmail.com> wrote:

I've done a quick look and the patch is looks good to me.
Let's add tests for these functions, should we? If you think this is an overkill, feel free to trim tests for your taste.

Thanks for doing that. I'm keen to wait a bit and see if we can come
up with a core user of this before adding a test module. However, if
we keep the repalloc() implementation, then I wonder if it might be
worth having a test module for that. I see you're not testing it in
the one you've written. Andres has suggested I remove the repalloc
stuff I added but see my reply to that. I think we should keep it
based on the fact that someone using palloc_aligned might have no idea
if some other code path can call repalloc() on the returned pointer.

David

#6John Naylor
john.naylor@enterprisedb.com
In reply to: David Rowley (#4)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

On Tue, Nov 8, 2022 at 8:57 AM David Rowley <dgrowleyml@gmail.com> wrote:

Is there anything we could align to CPU cacheline size that would
speed something up?

InitCatCache() already has this, which could benefit from simpler notation.

/*
* Allocate a new cache structure, aligning to a cacheline boundary
*
* Note: we rely on zeroing to initialize all the dlist headers correctly
*/
sz = sizeof(CatCache) + PG_CACHE_LINE_SIZE;
cp = (CatCache *) CACHELINEALIGN(palloc0(sz));

--
John Naylor
EDB: http://www.enterprisedb.com

#7Andres Freund
andres@anarazel.de
In reply to: David Rowley (#5)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

Hi,

On 2022-11-08 15:01:18 +1300, David Rowley wrote:

Andres has suggested I remove the repalloc stuff I added but see my reply to
that.

I'm fine with keeping it, I just couldn't really think of cases that have
strict alignment requirements but also requires resizing.

Greetings,

Andres Freund

#8David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#4)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

On Tue, 8 Nov 2022 at 14:57, David Rowley <dgrowleyml@gmail.com> wrote:

On Tue, 8 Nov 2022 at 05:24, Andres Freund <andres@anarazel.de> wrote:

Should we handle the case where we get a suitably aligned pointer from
MemoryContextAllocExtended() differently?

Maybe it would be worth the extra check. I'm trying to imagine future
use cases. Maybe if someone wanted to ensure that we're aligned to
CPU cache line boundaries then the chances of the pointer already
being aligned to 64 bytes is decent enough. The problem is it that
it's too late to save any memory, it just saves a bit of boxing and
unboxing of the redirect headers.

Thinking about that a bit more, if we keep the repalloc support then
we can't do this as if we happen to get the right alignment by chance
during the palloc_aligned, then if we don't have the redirection
MemoryChunk, then we've no way to ensure we keep the alignment after a
repalloc.

David

#9David Rowley
dgrowleyml@gmail.com
In reply to: John Naylor (#6)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

On Tue, 8 Nov 2022 at 15:17, John Naylor <john.naylor@enterprisedb.com> wrote:

On Tue, Nov 8, 2022 at 8:57 AM David Rowley <dgrowleyml@gmail.com> wrote:

Is there anything we could align to CPU cacheline size that would
speed something up?

InitCatCache() already has this, which could benefit from simpler notation.

Thanks. I wasn't aware. I'll convert that to use palloc_aligned in the patch.

David

#10John Naylor
john.naylor@enterprisedb.com
In reply to: David Rowley (#4)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

On Tue, Nov 8, 2022 at 8:57 AM David Rowley <dgrowleyml@gmail.com> wrote:

On Tue, 8 Nov 2022 at 05:24, Andres Freund <andres@anarazel.de> wrote:

I doubtthere's ever a need to realloc such a pointer? Perhaps we could

just

elog(ERROR)?

Are you maybe locked into just thinking about your current planned use
case that we want to allocate BLCKSZ bytes in each case? It does not
seem impossible to me that someone will want something more than an
8-byte alignment and also might want to enlarge the allocation at some
point. I thought it might be more dangerous not to implement repalloc.
It might not be clear to someone using palloc_aligned() that there's
no code path that can call repalloc on the returned pointer.

I can imagine a use case for arrays of cacheline-sized objects.

TYPEALIGN() will not work correctly unless the alignment is a power of
2. We could modify it to, but that would require doing some modular
maths instead of bitmasking. That seems pretty horrible when the macro
is given a value that's not constant at compile time as we'd end up
with a (slow) divide in the code path. I think the restriction is a
good idea. I imagine there will never be any need to align to anything
that's not a power of 2.

+1

Should we handle the case where we get a suitably aligned pointer from
MemoryContextAllocExtended() differently?

Maybe it would be worth the extra check. I'm trying to imagine future
use cases. Maybe if someone wanted to ensure that we're aligned to
CPU cache line boundaries then the chances of the pointer already
being aligned to 64 bytes is decent enough. The problem is it that
it's too late to save any memory, it just saves a bit of boxing and
unboxing of the redirect headers.

To my mind the main point of detecting this case is to save memory, so if
that's not possible/convenient, special-casing doesn't seem worth it.

- Assert((char *) chunk > (char *) block);
+ Assert((char *) chunk >= (char *) block);

Is this related or independent?

--
John Naylor
EDB: http://www.enterprisedb.com

#11Andres Freund
andres@anarazel.de
In reply to: David Rowley (#4)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

Hi,

On 2022-11-08 14:57:35 +1300, David Rowley wrote:

On Tue, 8 Nov 2022 at 05:24, Andres Freund <andres@anarazel.de> wrote:

Should we handle the case where we get a suitably aligned pointer from
MemoryContextAllocExtended() differently?

Maybe it would be worth the extra check. I'm trying to imagine future
use cases. Maybe if someone wanted to ensure that we're aligned to
CPU cache line boundaries then the chances of the pointer already
being aligned to 64 bytes is decent enough. The problem is it that
it's too late to save any memory, it just saves a bit of boxing and
unboxing of the redirect headers.

Couldn't we reduce the amount of over-allocation by a small amount by special
casing the already-aligned case? That's not going to be relevant for page size
aligne allocations, but for smaller alignment values it could matter.

Greetings,

Andres Freund

#12David Rowley
dgrowleyml@gmail.com
In reply to: Andres Freund (#11)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

On Tue, 15 Nov 2022 at 11:11, Andres Freund <andres@anarazel.de> wrote:

Couldn't we reduce the amount of over-allocation by a small amount by special
casing the already-aligned case? That's not going to be relevant for page size
aligne allocations, but for smaller alignment values it could matter.

I don't quite follow this. How can we know the allocation is already
aligned without performing the allocation? To perform the allocation
we must tell palloc what size to allocate. So, we've already wasted
the space by the time we can tell if the allocation is aligned to what
we need.

Aside from that, there's already a special case for alignto <=
MAXIMUM_ALIGNOF. But we know no palloc will ever return anything
aligned less than that in all cases, which is why that can work.

David

#13David Rowley
dgrowleyml@gmail.com
In reply to: John Naylor (#10)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

On Mon, 14 Nov 2022 at 15:25, John Naylor <john.naylor@enterprisedb.com> wrote:

- Assert((char *) chunk > (char *) block);
+ Assert((char *) chunk >= (char *) block);

Is this related or independent?

It's related. Because the code is doing:

MemoryChunkSetHdrMask(alignedchunk, unaligned, alignto,
MCTX_ALIGNED_REDIRECT_ID);

Here the blockoffset gets set to the difference between alignedchunk
and unaligned. Typically when we call MemoryChunkSetHdrMask, the
blockoffset is always the difference between the block and
MemoryChunk, which is never 0 due to the block header fields. Here it
can be the same pointer when the redirection MemoryChunk is stored on
the first byte of the palloc'd address. This can happen if the
address returned by palloc + sizeof(MemoryChunk) is aligned to what we
need already.

David

#14Andres Freund
andres@anarazel.de
In reply to: David Rowley (#12)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

Hi,

On 2022-11-15 23:36:53 +1300, David Rowley wrote:

On Tue, 15 Nov 2022 at 11:11, Andres Freund <andres@anarazel.de> wrote:

Couldn't we reduce the amount of over-allocation by a small amount by special
casing the already-aligned case? That's not going to be relevant for page size
aligne allocations, but for smaller alignment values it could matter.

I don't quite follow this. How can we know the allocation is already
aligned without performing the allocation? To perform the allocation
we must tell palloc what size to allocate. So, we've already wasted
the space by the time we can tell if the allocation is aligned to what
we need.

What I mean is that we perhaps could over-allocate by a bit less than
alignto + sizeof(MemoryChunk)
If the value returned by the underlying memory context is already aligned to
the correct value, we can just return it as-is.

We already rely on memory context returning MAXIMUM_ALIGNOF aligned
allocations. Adding the special case, I think, means that the we could safely
over-allocate by "only"
alignto + sizeof(MemoryChunk) - MAXIMUM_ALIGNOF

Which would be a reasonable win for small allocations with a small >
MAXIMUM_ALIGNOF alignment. But I don't think that'll be a very common case?

Aside from that, there's already a special case for alignto <=
MAXIMUM_ALIGNOF. But we know no palloc will ever return anything
aligned less than that in all cases, which is why that can work.

Yep.

Greetings,

Andres Freund

#15Greg Stark
stark@mit.edu
In reply to: David Rowley (#1)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

So I think it's kind of cute that you've implemented these as agnostic
wrappers that work with any allocator ... but why?

I would have expected the functionality to just be added directly to
the allocator to explicitly request whole aligned pages which IIRC
it's already capable of doing but just doesn't have any way to
explicitly request.

DirectIO doesn't really need a wide variety of allocation sizes or
alignments, it's always going to be the physical block size which
apparently can be as low as 512 bytes but I'm guessing we're always
going to be using 4kB alignment and multiples of 8kB allocations.
Wouldn't just having a pool of 8kB pages all aligned on 4kB or 8kB
alignment be simpler and more efficient than working around misaligned
pointers and having all these branches and arithmetic happening?

#16Andres Freund
andres@anarazel.de
In reply to: Greg Stark (#15)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

Hi,

On 2022-11-15 16:58:10 -0500, Greg Stark wrote:

So I think it's kind of cute that you've implemented these as agnostic
wrappers that work with any allocator ... but why?

I would have expected the functionality to just be added directly to
the allocator to explicitly request whole aligned pages which IIRC
it's already capable of doing but just doesn't have any way to
explicitly request.

We'd need to support it in multiple allocators, and they'd code quite similar
to this because for allocations that go directly to malloc.

It's possible that we'd want to add optional support for aligned allocations
to e.g. aset.c but not other allocators - this patch would allow to add
support for that transparently.

DirectIO doesn't really need a wide variety of allocation sizes or
alignments, it's always going to be the physical block size which
apparently can be as low as 512 bytes but I'm guessing we're always
going to be using 4kB alignment and multiples of 8kB allocations.

Yep - I posted numbers in some other thread showing that using a larger
alignment is a good idea.

Wouldn't just having a pool of 8kB pages all aligned on 4kB or 8kB
alignment be simpler and more efficient than working around misaligned
pointers and having all these branches and arithmetic happening?

I'm quite certain it'd increase memory usage, rather than reduce it - there's
not actually a whole lot of places that need aligned pages outside of bufmgr,
so the pool would just be unused most of the time. And you'd need special code
to return those pages to the pool when the operation using the aligned buffer
fails - whereas integrating with memory contexts already takes care of
that. Lastly, there's other places where we can benefit from aligned
allocations far smaller than 4kB (most typically cacheline aligned, I'd
guess).

Greetings,

Andres Freund

#17David Rowley
dgrowleyml@gmail.com
In reply to: Andres Freund (#14)
2 attachment(s)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

On Wed, 16 Nov 2022 at 08:19, Andres Freund <andres@anarazel.de> wrote:

We already rely on memory context returning MAXIMUM_ALIGNOF aligned
allocations. Adding the special case, I think, means that the we could safely
over-allocate by "only"
alignto + sizeof(MemoryChunk) - MAXIMUM_ALIGNOF

Which would be a reasonable win for small allocations with a small >
MAXIMUM_ALIGNOF alignment. But I don't think that'll be a very common case?

Seems reasonable. Subtracting MAXIMUM_ALIGNOF doesn't add any
additional run-time cost since it will be constant folded with the
sizeof(MemoryChunk).

I've attached an updated patch. The 0002 is just intended to exercise
these allocations a little bit, it's not intended for commit. I was
using that to ensure valgrind does not complain about anything. It
seems happy now.

David

Attachments:

v2-0001-Add-allocator-support-for-larger-allocation-align.patchtext/plain; charset=US-ASCII; name=v2-0001-Add-allocator-support-for-larger-allocation-align.patchDownload
From 04ab4f1bfbc51ad5c58904ba9e7269d5918a55ee Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Tue, 18 Oct 2022 09:47:45 -0700
Subject: [PATCH v2 1/2] Add allocator support for larger allocation alignment
 & use for IO

---
 src/backend/utils/cache/catcache.c       |   5 +-
 src/backend/utils/mmgr/Makefile          |   1 +
 src/backend/utils/mmgr/alignedalloc.c    | 110 ++++++++++++++++++
 src/backend/utils/mmgr/mcxt.c            | 141 +++++++++++++++++++++--
 src/backend/utils/mmgr/meson.build       |   1 +
 src/include/utils/memutils_internal.h    |  13 ++-
 src/include/utils/memutils_memorychunk.h |   2 +-
 src/include/utils/palloc.h               |   3 +
 8 files changed, 263 insertions(+), 13 deletions(-)
 create mode 100644 src/backend/utils/mmgr/alignedalloc.c

diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index 38e943fab2..81c4c52117 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -764,7 +764,6 @@ InitCatCache(int id,
 {
 	CatCache   *cp;
 	MemoryContext oldcxt;
-	size_t		sz;
 	int			i;
 
 	/*
@@ -808,8 +807,8 @@ InitCatCache(int id,
 	 *
 	 * Note: we rely on zeroing to initialize all the dlist headers correctly
 	 */
-	sz = sizeof(CatCache) + PG_CACHE_LINE_SIZE;
-	cp = (CatCache *) CACHELINEALIGN(palloc0(sz));
+	cp = (CatCache *) palloc_aligned(sizeof(CatCache), PG_CACHE_LINE_SIZE,
+									 MCXT_ALLOC_ZERO);
 	cp->cc_bucket = palloc0(nbuckets * sizeof(dlist_head));
 
 	/*
diff --git a/src/backend/utils/mmgr/Makefile b/src/backend/utils/mmgr/Makefile
index 3b4cfdbd52..dae3432c98 100644
--- a/src/backend/utils/mmgr/Makefile
+++ b/src/backend/utils/mmgr/Makefile
@@ -13,6 +13,7 @@ top_builddir = ../../../..
 include $(top_builddir)/src/Makefile.global
 
 OBJS = \
+	alignedalloc.o \
 	aset.o \
 	dsa.o \
 	freepage.o \
diff --git a/src/backend/utils/mmgr/alignedalloc.c b/src/backend/utils/mmgr/alignedalloc.c
new file mode 100644
index 0000000000..97cb1d2b0d
--- /dev/null
+++ b/src/backend/utils/mmgr/alignedalloc.c
@@ -0,0 +1,110 @@
+/*-------------------------------------------------------------------------
+ *
+ * alignedalloc.c
+ *	  Allocator functions to implement palloc_aligned
+ *
+ * This is not a fully fledged MemoryContext type as there is no means to
+ * create a MemoryContext of this type.  The code here only serves to allow
+ * operations such as pfree() and repalloc() to work correctly on a memory
+ * chunk that was allocated by palloc_aligned().
+ *
+ * Portions Copyright (c) 2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/backend/utils/mmgr/alignedalloc.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "utils/memdebug.h"
+#include "utils/memutils_memorychunk.h"
+
+void
+AlignedAllocFree(void *pointer)
+{
+	MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
+	void *unaligned;
+
+#ifdef MEMORY_CONTEXT_CHECKING
+	/*
+	 * Test for someone scribbling on unused space in chunk.  We don't have
+	 * the ability to include the context name here, so just mention that it's
+	 * an aligned chunk.
+	 */
+	if (!sentinel_ok(pointer, chunk->requested_size))
+		elog(WARNING, "detected write past %zu-byte aligned chunk end at %p",
+			 MemoryChunkGetValue(chunk), chunk);
+#endif
+
+	Assert(!MemoryChunkIsExternal(chunk));
+
+	/* obtain the original (unaligned) allocated pointer */
+	unaligned = MemoryChunkGetBlock(chunk);
+
+	pfree(unaligned);
+}
+
+void *
+AlignedAllocRealloc(void *pointer, Size size)
+{
+	MemoryChunk	   *redirchunk = PointerGetMemoryChunk(pointer);
+	Size		alignto = MemoryChunkGetValue(redirchunk);
+	void	   *unaligned = MemoryChunkGetBlock(redirchunk);
+	MemoryContext	ctx;
+	Size			old_size;
+	void		   *newptr;
+
+	/* sanity check this is a power of 2 value */
+	Assert((alignto & (alignto - 1)) == 0);
+
+	/*
+	 * Determine the size of the original allocation.  We can't determine this
+	 * exactly as GetMemoryChunkSpace() returns the total space used for the
+	 * allocation, which for contexts like aset includes rounding up to the
+	 * next power of 2.  However, this value is just used to memcpy() the old
+	 * data into the new allocation, so we only need to concern ourselves with
+	 * not reading beyond the end of the original allocation's memory.  The
+	 * drawback here is that we may copy more bytes than we need to, which
+	 * amounts only to wasted effort.
+	 */
+#ifndef MEMORY_CONTEXT_CHECKING
+	old_size = GetMemoryChunkSpace(unaligned) -
+		((char *) pointer - (char *) PointerGetMemoryChunk(unaligned));
+#else
+	old_size = redirchunk->requested_size;
+#endif
+
+	ctx = GetMemoryChunkContext(unaligned);
+	newptr = MemoryContextAllocAligned(ctx, size, alignto, 0);
+
+	/*
+	 * We may memcpy beyond the end of the orignal allocation request size, so
+	 * we must mark the entire allocation as defined.
+	 */
+	VALGRIND_MAKE_MEM_DEFINED(pointer, old_size);
+	memcpy(newptr, pointer, Min(size, old_size));
+	pfree(unaligned);
+
+	return newptr;
+}
+
+MemoryContext
+AlignedAllocGetChunkContext(void *pointer)
+{
+	MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
+
+	Assert(!MemoryChunkIsExternal(chunk));
+
+	return GetMemoryChunkContext(MemoryChunkGetBlock(chunk));
+}
+
+Size
+AlignedGetChunkSpace(void *pointer)
+{
+	MemoryChunk	   *redirchunk = PointerGetMemoryChunk(pointer);
+	void	   *unaligned = MemoryChunkGetBlock(redirchunk);
+
+	return GetMemoryChunkSpace(unaligned);
+}
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 57bd6690ca..c1e3e88b49 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -30,6 +30,7 @@
 #include "utils/memdebug.h"
 #include "utils/memutils.h"
 #include "utils/memutils_internal.h"
+#include "utils/memutils_memorychunk.h"
 
 
 static void BogusFree(void *pointer);
@@ -84,6 +85,21 @@ static const MemoryContextMethods mcxt_methods[] = {
 	[MCTX_SLAB_ID].check = SlabCheck,
 #endif
 
+	/* alignedalloc.c */
+	[MCTX_ALIGNED_REDIRECT_ID].alloc = NULL,	/* not required */
+	[MCTX_ALIGNED_REDIRECT_ID].free_p = AlignedAllocFree,
+	[MCTX_ALIGNED_REDIRECT_ID].realloc = AlignedAllocRealloc,
+	[MCTX_ALIGNED_REDIRECT_ID].reset = NULL,	/* not required */
+	[MCTX_ALIGNED_REDIRECT_ID].delete_context = NULL,	/* not required */
+	[MCTX_ALIGNED_REDIRECT_ID].get_chunk_context = AlignedAllocGetChunkContext,
+	[MCTX_ALIGNED_REDIRECT_ID].get_chunk_space = AlignedGetChunkSpace,
+	[MCTX_ALIGNED_REDIRECT_ID].is_empty = NULL, /* not required */
+	[MCTX_ALIGNED_REDIRECT_ID].stats = NULL,	/* not required */
+#ifdef MEMORY_CONTEXT_CHECKING
+	[MCTX_ALIGNED_REDIRECT_ID].check = NULL,	/* not required */
+#endif
+
+
 	/*
 	 * Unused (as yet) IDs should have dummy entries here.  This allows us to
 	 * fail cleanly if a bogus pointer is passed to pfree or the like.  It
@@ -110,11 +126,6 @@ static const MemoryContextMethods mcxt_methods[] = {
 	[MCTX_UNUSED4_ID].realloc = BogusRealloc,
 	[MCTX_UNUSED4_ID].get_chunk_context = BogusGetChunkContext,
 	[MCTX_UNUSED4_ID].get_chunk_space = BogusGetChunkSpace,
-
-	[MCTX_UNUSED5_ID].free_p = BogusFree,
-	[MCTX_UNUSED5_ID].realloc = BogusRealloc,
-	[MCTX_UNUSED5_ID].get_chunk_context = BogusGetChunkContext,
-	[MCTX_UNUSED5_ID].get_chunk_space = BogusGetChunkSpace,
 };
 
 /*
@@ -1298,6 +1309,111 @@ palloc_extended(Size size, int flags)
 	return ret;
 }
 
+/*
+ * MemoryContextAllocAligned
+ *		Allocate 'size' bytes of memory in 'context' aligned to 'alignto'
+ *		bytes.
+ *
+ * 'alignto' must be a power of 2.
+ * 'flags' may be 0 or set the same as MemoryContextAllocExtended().
+ */
+void *
+MemoryContextAllocAligned(MemoryContext context,
+						  Size size, Size alignto, int flags)
+{
+	MemoryChunk *alignedchunk;
+	Size		alloc_size;
+	void	   *unaligned;
+	void	   *aligned;
+
+	/* wouldn't make much sense to waste that much space */
+	Assert(alignto < (128 * 1024 * 1024));
+
+	/* ensure alignto is a power of 2 */
+	Assert((alignto & (alignto - 1)) == 0);
+
+	/*
+	 * If the alignment requirements are less than what we already guarantee
+	 * then just use the standard allocation function.
+	 */
+	if (unlikely(alignto <= MAXIMUM_ALIGNOF))
+		return MemoryContextAllocExtended(context, size, flags);
+
+	/*
+	 * We implement aligned pointers by simply allocating enough memory for
+	 * the requested size plus the alignment and an additional "redirection"
+	 * MemoryChunk.  This additional MemoryChunk is required for operations
+	 * such as pfree when used on the pointer returned by this function.  We
+	 * use this redirection MemoryChunk in order to find the pointer to the
+	 * memory that was returned by the MemoryContextAllocExtended call below.
+	 * We do that by "borrowing" the block offset field and instead of using
+	 * that to find the offset into the owning block, we use it to find the
+	 * original allocated address.
+	 *
+	 * Here we must allocate enough extra memory so that we can still align
+	 * the pointer returned by MemoryContextAllocExtended and also have enough
+	 * space for the redirection MemoryChunk.  Since allocations will already
+	 * be at least aligned by MAXIMUM_ALIGNOF, we can subtract that amount
+	 * from the allocation size to save a little memory.
+	 */
+	alloc_size = size + alignto + sizeof(MemoryChunk) - MAXIMUM_ALIGNOF;
+
+#ifdef MEMORY_CONTEXT_CHECKING
+	/* ensure there's space for a sentinal byte */
+	alloc_size += 1;
+#endif
+
+	/* perform the actual allocation */
+	unaligned = MemoryContextAllocExtended(context, alloc_size, flags);
+
+	/* set the aligned pointer */
+	aligned = (void *) TYPEALIGN(alignto, (char *) unaligned +
+								 sizeof(MemoryChunk));
+
+	alignedchunk = PointerGetMemoryChunk(aligned);
+
+	/*
+	 * We set the redirect MemoryChunk so that the block offset calculation is
+	 * used to point back to the 'unaligned' allocated chunk.  This allows us
+	 * to use MemoryChunkGetBlock() to find the unaligned chunk when we need
+	 * to perform operations such as pfree() and repalloc().
+	 *
+	 * We store 'alignto' in the MemoryChunk's 'value' so that we know what
+	 * the alignment was set to should we ever be asked to realloc this
+	 * pointer.
+	 */
+	MemoryChunkSetHdrMask(alignedchunk, unaligned, alignto,
+						  MCTX_ALIGNED_REDIRECT_ID);
+
+	/* double check we produced a correctly aligned pointer */
+	Assert((char *) TYPEALIGN(alignto, aligned) == aligned);
+
+#ifdef MEMORY_CONTEXT_CHECKING
+	alignedchunk->requested_size = size;
+	/* set mark to catch clobber of "unused" space */
+	set_sentinel(aligned, size);
+#endif
+
+	/* Mark the bytes before the redirection header as noaccess */
+	VALGRIND_MAKE_MEM_NOACCESS(unaligned,
+							   (char *) alignedchunk - (char *) unaligned);
+	return aligned;
+}
+
+/*
+ * palloc_aligned
+ *		Allocate 'size' bytes returning a pointer that's aligned to the
+ *		'alignto' boundary.
+ *
+ * 'alignto' must be a power of 2.
+ * 'flags' may be 0 or set the same as MemoryContextAllocExtended().
+ */
+void *
+palloc_aligned(Size size, Size alignto, int flags)
+{
+	return MemoryContextAllocAligned(CurrentMemoryContext, size, alignto, flags);
+}
+
 /*
  * pfree
  *		Release an allocated chunk.
@@ -1306,11 +1422,16 @@ void
 pfree(void *pointer)
 {
 #ifdef USE_VALGRIND
+	MemoryContextMethodID method = GetMemoryChunkMethodID(pointer);
 	MemoryContext context = GetMemoryChunkContext(pointer);
 #endif
 
 	MCXT_METHOD(pointer, free_p) (pointer);
-	VALGRIND_MEMPOOL_FREE(context, pointer);
+
+#ifdef USE_VALGRIND
+	if (method != MCTX_ALIGNED_REDIRECT_ID)
+		VALGRIND_MEMPOOL_FREE(context, pointer);
+#endif
 }
 
 /*
@@ -1320,6 +1441,9 @@ pfree(void *pointer)
 void *
 repalloc(void *pointer, Size size)
 {
+#ifdef USE_VALGRIND
+	MemoryContextMethodID method = GetMemoryChunkMethodID(pointer);
+#endif
 #if defined(USE_ASSERT_CHECKING) || defined(USE_VALGRIND)
 	MemoryContext context = GetMemoryChunkContext(pointer);
 #endif
@@ -1346,7 +1470,10 @@ repalloc(void *pointer, Size size)
 						   size, cxt->name)));
 	}
 
-	VALGRIND_MEMPOOL_CHANGE(context, pointer, ret, size);
+#ifdef USE_VALGRIND
+	if (method != MCTX_ALIGNED_REDIRECT_ID)
+		VALGRIND_MEMPOOL_CHANGE(context, pointer, ret, size);
+#endif
 
 	return ret;
 }
diff --git a/src/backend/utils/mmgr/meson.build b/src/backend/utils/mmgr/meson.build
index 641bb181ba..7cf4d6dcc8 100644
--- a/src/backend/utils/mmgr/meson.build
+++ b/src/backend/utils/mmgr/meson.build
@@ -1,4 +1,5 @@
 backend_sources += files(
+  'alignedalloc.c',
   'aset.c',
   'dsa.c',
   'freepage.c',
diff --git a/src/include/utils/memutils_internal.h b/src/include/utils/memutils_internal.h
index bc2cbdd506..450bcba3ed 100644
--- a/src/include/utils/memutils_internal.h
+++ b/src/include/utils/memutils_internal.h
@@ -70,6 +70,15 @@ extern void SlabStats(MemoryContext context,
 extern void SlabCheck(MemoryContext context);
 #endif
 
+/*
+ * These functions support the implementation of palloc_aligned() and are not
+ * part of a fully-fledged MemoryContext type.
+ */
+extern void AlignedAllocFree(void *pointer);
+extern void *AlignedAllocRealloc(void *pointer, Size size);
+extern MemoryContext AlignedAllocGetChunkContext(void *pointer);
+extern Size AlignedGetChunkSpace(void *pointer);
+
 /*
  * MemoryContextMethodID
  *		A unique identifier for each MemoryContext implementation which
@@ -92,8 +101,8 @@ typedef enum MemoryContextMethodID
 	MCTX_ASET_ID,
 	MCTX_GENERATION_ID,
 	MCTX_SLAB_ID,
-	MCTX_UNUSED4_ID,			/* available */
-	MCTX_UNUSED5_ID				/* 111 occurs in wipe_mem'd memory */
+	MCTX_ALIGNED_REDIRECT_ID,
+	MCTX_UNUSED4_ID				/* 111 occurs in wipe_mem'd memory */
 } MemoryContextMethodID;
 
 /*
diff --git a/src/include/utils/memutils_memorychunk.h b/src/include/utils/memutils_memorychunk.h
index 2eefc138e3..38702efc58 100644
--- a/src/include/utils/memutils_memorychunk.h
+++ b/src/include/utils/memutils_memorychunk.h
@@ -156,7 +156,7 @@ MemoryChunkSetHdrMask(MemoryChunk *chunk, void *block,
 {
 	Size		blockoffset = (char *) chunk - (char *) block;
 
-	Assert((char *) chunk > (char *) block);
+	Assert((char *) chunk >= (char *) block);
 	Assert(blockoffset <= MEMORYCHUNK_MAX_BLOCKOFFSET);
 	Assert(value <= MEMORYCHUNK_MAX_VALUE);
 	Assert((int) methodid <= MEMORY_CONTEXT_METHODID_MASK);
diff --git a/src/include/utils/palloc.h b/src/include/utils/palloc.h
index 72d4e70dc6..b1ac63b2ee 100644
--- a/src/include/utils/palloc.h
+++ b/src/include/utils/palloc.h
@@ -73,10 +73,13 @@ extern void *MemoryContextAllocZero(MemoryContext context, Size size);
 extern void *MemoryContextAllocZeroAligned(MemoryContext context, Size size);
 extern void *MemoryContextAllocExtended(MemoryContext context,
 										Size size, int flags);
+extern void *MemoryContextAllocAligned(MemoryContext context,
+									   Size size, Size alignto, int flags);
 
 extern void *palloc(Size size);
 extern void *palloc0(Size size);
 extern void *palloc_extended(Size size, int flags);
+extern void *palloc_aligned(Size size, Size alignto, int flags);
 extern pg_nodiscard void *repalloc(void *pointer, Size size);
 extern pg_nodiscard void *repalloc_extended(void *pointer,
 											Size size, int flags);
-- 
2.35.1.windows.2

v2-0002-Test-code-to-exercise-palloc_aligned-and-repalloc.patchtext/plain; charset=US-ASCII; name=v2-0002-Test-code-to-exercise-palloc_aligned-and-repalloc.patchDownload
From 4ae1d6163d919a58a437e22c7cd9a5da350b4237 Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Wed, 16 Nov 2022 12:48:47 +1300
Subject: [PATCH v2 2/2] Test code to exercise palloc_aligned and repalloc

---
 src/backend/optimizer/plan/planner.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 493a3af0fa..09e40567c7 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -606,6 +606,21 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
 	RelOptInfo *final_rel;
 	ListCell   *l;
 
+	for (int align = 8; align <= 8192; align *= 2)
+	{
+		void *a = palloc0(512);
+		void *p = palloc_aligned(512, align, 0);
+		memset(p, 0, 512);
+
+		Assert(memcmp(p, a, 512) == 0);
+		p = repalloc(p, 1024);
+		Assert(memcmp(p, a, 512) == 0);
+		p = repalloc(p, 256);
+		Assert(memcmp(p, a, 256) == 0);
+		pfree(a);
+		pfree(p);
+	}
+
 	/* Create a PlannerInfo data structure for this subquery */
 	root = makeNode(PlannerInfo);
 	root->parse = parse;
-- 
2.35.1.windows.2

#18David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#17)
Re: Add palloc_aligned() to allow arbitrary power of 2 memory alignment

On Wed, 16 Nov 2022 at 23:56, David Rowley <dgrowleyml@gmail.com> wrote:

I've attached an updated patch. The 0002 is just intended to exercise
these allocations a little bit, it's not intended for commit. I was
using that to ensure valgrind does not complain about anything. It
seems happy now.

After making some comment adjustments and having adjusted the
calculation of how to get the old chunk size when doing repalloc() on
an aligned chunk, I've now pushed this.

Thank you for the reviews.

David