Ensuring hash tuples are properly maxaligned
I've been poking around in the PHJ code trying to identify the reason
why there are still so many buildfarm failures. I've not nailed it
down yet, but one thing I did notice is that there's an entirely
undocumented assumption that offsetof(HashMemoryChunkData, data) is
maxalign'ed. If it isn't, the allocation code will give back non-
maxaligned pointers, which will appear to work as long as you only
test on alignment-lax processors.
Now, the existing definition of the struct seems safe on all
architectures we support, but it would not take much to break it.
I think we ought to do what we did recently in the memory-context
code: insert an explicit padding calculation and a static assertion
that it's right. Hence, the attached proposed patch (in which
I also failed to resist the temptation to make the two code paths
in dense_alloc() look more alike). Any objections?
regards, tom lane
Attachments:
ensure-proper-hash-tuple-alignment.patchtext/x-diff; charset=us-ascii; name=ensure-proper-hash-tuple-alignment.patchDownload+33-23
Hi,
On 2018-01-02 19:08:49 -0500, Tom Lane wrote:
Now, the existing definition of the struct seems safe on all
architectures we support, but it would not take much to break it.
I think we ought to do what we did recently in the memory-context
code: insert an explicit padding calculation and a static assertion
that it's right. Hence, the attached proposed patch (in which
I also failed to resist the temptation to make the two code paths
in dense_alloc() look more alike). Any objections?
Generally no objections.
+ /* + * It's required that data[] start at a maxaligned offset. This padding + * calculation is correct as long as int is not wider than size_t and + * dsa_pointer is not wider than a regular pointer, but there's a static + * assertion to check things in nodeHash.c. + */
But note that dsa_pointer can be wider than a regular pointer on
platforms without atomics support.
+#define HASHMEMORYCHUNKDATA_RAWSIZE (SIZEOF_SIZE_T * 3 + SIZEOF_VOID_P) + + /* ensure proper alignment by adding padding if needed */ +#if (HASHMEMORYCHUNKDATA_RAWSIZE % MAXIMUM_ALIGNOF) != 0 + char padding[MAXIMUM_ALIGNOF - HASHMEMORYCHUNKDATA_RAWSIZE % MAXIMUM_ALIGNOF]; +#endif + char data[FLEXIBLE_ARRAY_MEMBER]; /* buffer allocated at the end */ } HashMemoryChunkData;
Wonder if this specific case wouldn't be easier handled with a union?
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2018-01-02 19:08:49 -0500, Tom Lane wrote:
Now, the existing definition of the struct seems safe on all
architectures we support, but it would not take much to break it.
I think we ought to do what we did recently in the memory-context
code: insert an explicit padding calculation and a static assertion
that it's right. Hence, the attached proposed patch (in which
I also failed to resist the temptation to make the two code paths
in dense_alloc() look more alike). Any objections?
But note that dsa_pointer can be wider than a regular pointer on
platforms without atomics support.
Hm. I did not get that impression from the comments in dsa.h,
but if it's true then this approach won't work --- and indeed the
hash code would be actively broken in such a case, so it's a problem
we must fix.
The other idea I was considering was to get rid of the data[] member
entirely, define HASH_CHUNK_HEADER_SIZE as
MAXALIGN(sizeof(HashMemoryChunkData)), and replace the references to
data[] with pointer arithmetic along the lines of
((char *) chunk) + HASH_CHUNK_HEADER_SIZE
This would be a bit more invasive, but probably not too ugly if we
encapsulated that pointer arithmetic in a macro. And it'd certainly
be a bunch more robust against people throwing random fields into
the struct.
regards, tom lane
On Wed, Jan 3, 2018 at 2:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
But note that dsa_pointer can be wider than a regular pointer on
platforms without atomics support.Hm. I did not get that impression from the comments in dsa.h,
but if it's true then this approach won't work --- and indeed the
hash code would be actively broken in such a case, so it's a problem
we must fix.
Maybe Andres is thinking of dsa_pointer_atomic? dsa_pointer is
normally the size of a pointer (well, really, the size of size_t),
though it could be *narrower* if you don't have atomics or ask for it
with USE_SMALL_DSA_POINTER
--
Thomas Munro
http://www.enterprisedb.com
On 2018-01-03 14:29:15 +1300, Thomas Munro wrote:
On Wed, Jan 3, 2018 at 2:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
But note that dsa_pointer can be wider than a regular pointer on
platforms without atomics support.Hm. I did not get that impression from the comments in dsa.h,
but if it's true then this approach won't work --- and indeed the
hash code would be actively broken in such a case, so it's a problem
we must fix.Maybe Andres is thinking of dsa_pointer_atomic? dsa_pointer is
normally the size of a pointer (well, really, the size of size_t),
though it could be *narrower* if you don't have atomics or ask for it
with USE_SMALL_DSA_POINTER
Yep, I was.
Andres Freund <andres@anarazel.de> writes:
On 2018-01-03 14:29:15 +1300, Thomas Munro wrote:
On Wed, Jan 3, 2018 at 2:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
But note that dsa_pointer can be wider than a regular pointer on
platforms without atomics support.
Hm. I did not get that impression from the comments in dsa.h,
but if it's true then this approach won't work --- and indeed the
hash code would be actively broken in such a case, so it's a problem
we must fix.
Maybe Andres is thinking of dsa_pointer_atomic? dsa_pointer is
normally the size of a pointer (well, really, the size of size_t),
though it could be *narrower* if you don't have atomics or ask for it
with USE_SMALL_DSA_POINTER
Yep, I was.
OK, then there's not a live bug, but I'm a bit tempted to get rid of
the data[] member anyway. It's not clear to me now that keeping it
results in net cleaner code. Thoughts?
regards, tom lane
On 2018-01-02 20:40:50 -0500, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2018-01-03 14:29:15 +1300, Thomas Munro wrote:
On Wed, Jan 3, 2018 at 2:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
But note that dsa_pointer can be wider than a regular pointer on
platforms without atomics support.Hm. I did not get that impression from the comments in dsa.h,
but if it's true then this approach won't work --- and indeed the
hash code would be actively broken in such a case, so it's a problem
we must fix.Maybe Andres is thinking of dsa_pointer_atomic? dsa_pointer is
normally the size of a pointer (well, really, the size of size_t),
though it could be *narrower* if you don't have atomics or ask for it
with USE_SMALL_DSA_POINTERYep, I was.
OK, then there's not a live bug, but I'm a bit tempted to get rid of
the data[] member anyway. It's not clear to me now that keeping it
results in net cleaner code. Thoughts?
I like that plan. I don't think the data field buys us anything, and I
personally in most cases find "fully manual" alignment code easier to
reason about than fiddling with padding fields.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2018-01-02 20:40:50 -0500, Tom Lane wrote:
OK, then there's not a live bug, but I'm a bit tempted to get rid of
the data[] member anyway. It's not clear to me now that keeping it
results in net cleaner code. Thoughts?
I like that plan. I don't think the data field buys us anything, and I
personally in most cases find "fully manual" alignment code easier to
reason about than fiddling with padding fields.
Agreed --- I'll go do that.
regards, tom lane