Ensuring hash tuples are properly maxaligned

Started by Tom Laneover 8 years ago8 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

I've been poking around in the PHJ code trying to identify the reason
why there are still so many buildfarm failures. I've not nailed it
down yet, but one thing I did notice is that there's an entirely
undocumented assumption that offsetof(HashMemoryChunkData, data) is
maxalign'ed. If it isn't, the allocation code will give back non-
maxaligned pointers, which will appear to work as long as you only
test on alignment-lax processors.

Now, the existing definition of the struct seems safe on all
architectures we support, but it would not take much to break it.
I think we ought to do what we did recently in the memory-context
code: insert an explicit padding calculation and a static assertion
that it's right. Hence, the attached proposed patch (in which
I also failed to resist the temptation to make the two code paths
in dense_alloc() look more alike). Any objections?

regards, tom lane

Attachments:

ensure-proper-hash-tuple-alignment.patchtext/x-diff; charset=us-ascii; name=ensure-proper-hash-tuple-alignment.patchDownload+33-23
#2Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#1)
Re: Ensuring hash tuples are properly maxaligned

Hi,

On 2018-01-02 19:08:49 -0500, Tom Lane wrote:

Now, the existing definition of the struct seems safe on all
architectures we support, but it would not take much to break it.
I think we ought to do what we did recently in the memory-context
code: insert an explicit padding calculation and a static assertion
that it's right. Hence, the attached proposed patch (in which
I also failed to resist the temptation to make the two code paths
in dense_alloc() look more alike). Any objections?

Generally no objections.

+	/*
+	 * It's required that data[] start at a maxaligned offset.  This padding
+	 * calculation is correct as long as int is not wider than size_t and
+	 * dsa_pointer is not wider than a regular pointer, but there's a static
+	 * assertion to check things in nodeHash.c.
+	 */

But note that dsa_pointer can be wider than a regular pointer on
platforms without atomics support.

+#define HASHMEMORYCHUNKDATA_RAWSIZE (SIZEOF_SIZE_T * 3 + SIZEOF_VOID_P)
+
+	/* ensure proper alignment by adding padding if needed */
+#if (HASHMEMORYCHUNKDATA_RAWSIZE % MAXIMUM_ALIGNOF) != 0
+	char		padding[MAXIMUM_ALIGNOF - HASHMEMORYCHUNKDATA_RAWSIZE % MAXIMUM_ALIGNOF];
+#endif
+
char		data[FLEXIBLE_ARRAY_MEMBER];	/* buffer allocated at the end */
}			HashMemoryChunkData;

Wonder if this specific case wouldn't be easier handled with a union?

Greetings,

Andres Freund

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#2)
Re: Ensuring hash tuples are properly maxaligned

Andres Freund <andres@anarazel.de> writes:

On 2018-01-02 19:08:49 -0500, Tom Lane wrote:

Now, the existing definition of the struct seems safe on all
architectures we support, but it would not take much to break it.
I think we ought to do what we did recently in the memory-context
code: insert an explicit padding calculation and a static assertion
that it's right. Hence, the attached proposed patch (in which
I also failed to resist the temptation to make the two code paths
in dense_alloc() look more alike). Any objections?

But note that dsa_pointer can be wider than a regular pointer on
platforms without atomics support.

Hm. I did not get that impression from the comments in dsa.h,
but if it's true then this approach won't work --- and indeed the
hash code would be actively broken in such a case, so it's a problem
we must fix.

The other idea I was considering was to get rid of the data[] member
entirely, define HASH_CHUNK_HEADER_SIZE as
MAXALIGN(sizeof(HashMemoryChunkData)), and replace the references to
data[] with pointer arithmetic along the lines of
((char *) chunk) + HASH_CHUNK_HEADER_SIZE
This would be a bit more invasive, but probably not too ugly if we
encapsulated that pointer arithmetic in a macro. And it'd certainly
be a bunch more robust against people throwing random fields into
the struct.

regards, tom lane

#4Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#3)
Re: Ensuring hash tuples are properly maxaligned

On Wed, Jan 3, 2018 at 2:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@anarazel.de> writes:

But note that dsa_pointer can be wider than a regular pointer on
platforms without atomics support.

Hm. I did not get that impression from the comments in dsa.h,
but if it's true then this approach won't work --- and indeed the
hash code would be actively broken in such a case, so it's a problem
we must fix.

Maybe Andres is thinking of dsa_pointer_atomic? dsa_pointer is
normally the size of a pointer (well, really, the size of size_t),
though it could be *narrower* if you don't have atomics or ask for it
with USE_SMALL_DSA_POINTER

--
Thomas Munro
http://www.enterprisedb.com

#5Andres Freund
andres@anarazel.de
In reply to: Thomas Munro (#4)
Re: Ensuring hash tuples are properly maxaligned

On 2018-01-03 14:29:15 +1300, Thomas Munro wrote:

On Wed, Jan 3, 2018 at 2:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@anarazel.de> writes:

But note that dsa_pointer can be wider than a regular pointer on
platforms without atomics support.

Hm. I did not get that impression from the comments in dsa.h,
but if it's true then this approach won't work --- and indeed the
hash code would be actively broken in such a case, so it's a problem
we must fix.

Maybe Andres is thinking of dsa_pointer_atomic? dsa_pointer is
normally the size of a pointer (well, really, the size of size_t),
though it could be *narrower* if you don't have atomics or ask for it
with USE_SMALL_DSA_POINTER

Yep, I was.

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#5)
Re: Ensuring hash tuples are properly maxaligned

Andres Freund <andres@anarazel.de> writes:

On 2018-01-03 14:29:15 +1300, Thomas Munro wrote:

On Wed, Jan 3, 2018 at 2:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@anarazel.de> writes:
But note that dsa_pointer can be wider than a regular pointer on
platforms without atomics support.

Hm. I did not get that impression from the comments in dsa.h,
but if it's true then this approach won't work --- and indeed the
hash code would be actively broken in such a case, so it's a problem
we must fix.

Maybe Andres is thinking of dsa_pointer_atomic? dsa_pointer is
normally the size of a pointer (well, really, the size of size_t),
though it could be *narrower* if you don't have atomics or ask for it
with USE_SMALL_DSA_POINTER

Yep, I was.

OK, then there's not a live bug, but I'm a bit tempted to get rid of
the data[] member anyway. It's not clear to me now that keeping it
results in net cleaner code. Thoughts?

regards, tom lane

#7Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#6)
Re: Ensuring hash tuples are properly maxaligned

On 2018-01-02 20:40:50 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2018-01-03 14:29:15 +1300, Thomas Munro wrote:

On Wed, Jan 3, 2018 at 2:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@anarazel.de> writes:
But note that dsa_pointer can be wider than a regular pointer on
platforms without atomics support.

Hm. I did not get that impression from the comments in dsa.h,
but if it's true then this approach won't work --- and indeed the
hash code would be actively broken in such a case, so it's a problem
we must fix.

Maybe Andres is thinking of dsa_pointer_atomic? dsa_pointer is
normally the size of a pointer (well, really, the size of size_t),
though it could be *narrower* if you don't have atomics or ask for it
with USE_SMALL_DSA_POINTER

Yep, I was.

OK, then there's not a live bug, but I'm a bit tempted to get rid of
the data[] member anyway. It's not clear to me now that keeping it
results in net cleaner code. Thoughts?

I like that plan. I don't think the data field buys us anything, and I
personally in most cases find "fully manual" alignment code easier to
reason about than fiddling with padding fields.

Greetings,

Andres Freund

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#7)
Re: Ensuring hash tuples are properly maxaligned

Andres Freund <andres@anarazel.de> writes:

On 2018-01-02 20:40:50 -0500, Tom Lane wrote:

OK, then there's not a live bug, but I'm a bit tempted to get rid of
the data[] member anyway. It's not clear to me now that keeping it
results in net cleaner code. Thoughts?

I like that plan. I don't think the data field buys us anything, and I
personally in most cases find "fully manual" alignment code easier to
reason about than fiddling with padding fields.

Agreed --- I'll go do that.

regards, tom lane