Consistently use palloc_object() and palloc_array()

Started by David Geier3 months ago29 messages

geidav.pg@gmail.com

3 months ago

Hi hackers,

I've changed all code to use the "new" palloc_object(), palloc_array(),
palloc0_object(), palloc0_array, repalloc_array() and repalloc0_array()
macros. This makes the code more readable and more consistent.

The patch is pretty big but potential merge conflicts should be easy to
resolve. If preferred, I can also further split up the patch, e.g.
directory by directory or high impact files first.

The patch is passing "meson test" and I've additionally wrote a script
that parses the patch file and verifies that every two corresponding +
and - lines match (e.g. palloc0() replaced by palloc0_array() or
palloc0_object(), the same for palloc() and repalloc(), additionally
some checks to make sure the conversion to the _array() variant is
correct).

--
David Geier

Chao Li

li.evan.chao@gmail.com

3 months ago

In reply to: David Geier (#1)

Re: Consistently use palloc_object() and palloc_array()

On Nov 27, 2025, at 06:09, David Geier <geidav.pg@gmail.com> wrote:

Hi hackers,

I've changed all code to use the "new" palloc_object(), palloc_array(),
palloc0_object(), palloc0_array, repalloc_array() and repalloc0_array()
macros. This makes the code more readable and more consistent.

The patch is pretty big but potential merge conflicts should be easy to
resolve. If preferred, I can also further split up the patch, e.g.
directory by directory or high impact files first.

The patch is passing "meson test" and I've additionally wrote a script
that parses the patch file and verifies that every two corresponding +
and - lines match (e.g. palloc0() replaced by palloc0_array() or
palloc0_object(), the same for palloc() and repalloc(), additionally
some checks to make sure the conversion to the _array() variant is
correct).

--
David Geier<v1-0001-Consistently-use-palloc_object-and-palloc_array.patch>

This is a large patch, I just take a quick look, and found that:

```
-		*phoned_word = palloc(sizeof(char) * strlen(word) + 1);
+		*phoned_word = palloc_array(char, strlen(word) + 1);
```

And

```
-		params = (const char **) palloc(sizeof(char *));
+		params = palloc_object(const char *);
```

Applying palloc_array and palloc_object to char type doesn’t seem to improve anything.

Best reagards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Michael Paquier

michael@paquier.xyz

3 months ago

In reply to: David Geier (#1)

Re: Consistently use palloc_object() and palloc_array()

On Wed, Nov 26, 2025 at 11:09:31PM +0100, David Geier wrote:

I've changed all code to use the "new" palloc_object(), palloc_array(),
palloc0_object(), palloc0_array, repalloc_array() and repalloc0_array()
macros. This makes the code more readable and more consistent.

The patch is pretty big but potential merge conflicts should be easy to
resolve. If preferred, I can also further split up the patch, e.g.
directory by directory or high impact files first.

The backpatching extra-pain argument indeed comes into mind first when
it comes to such changes, but perhaps we should just bite the bullet
and encourage the new allocation styles across the tree, as you are
doing here. I'm not completely sure if it would make sense to split
things up, if we do I would do it on a subdirectory basis like to
suggest, perhaps, like contrib/, src/backend/executor/, etc. to
balance the blast damage. Did you use some kind of automation to find
all of these? If yes, what did you use?

The patch is passing "meson test" and I've additionally wrote a script
that parses the patch file and verifies that every two corresponding +
and - lines match (e.g. palloc0() replaced by palloc0_array() or
palloc0_object(), the same for palloc() and repalloc(), additionally
some checks to make sure the conversion to the _array() variant is
correct).

It may be an idea to share that as well, so as your checks could be
replicated rather than partially re-guessed.
--
Michael

Thomas Munro

thomas.munro@gmail.com

3 months ago

In reply to: David Geier (#1)

Re: Consistently use palloc_object() and palloc_array()

On Thu, Nov 27, 2025 at 11:10 AM David Geier <geidav.pg@gmail.com> wrote:

I've changed all code to use the "new" palloc_object(), palloc_array(),
palloc0_object(), palloc0_array, repalloc_array() and repalloc0_array()
macros. This makes the code more readable and more consistent.

I wondered about this in the context of special alignment
requirements[1]/messages/by-id/CA+hUKGLQUivg-NC7dHdbRAPmG0Hapg1gGnygM5KgDfDM2a_TMg@mail.gmail.com. palloc() aligns to MAXALIGN, which we artificially
constrain for various reasons that we can't easily change (at least
not without splitting on-disk MAXALIGN from allocation MAXALIGN, and
if we do that we'll waste more memory). That's less than
alignof(max_align_t) on common systems, so then we have to do some
weird stuff to handle __int128 that doesn't fit too well into modern
<stdalign.h> thinking and also disables optimal codegen.

This isn't a fully-baked thought, just a thought that occurred to me
while looking into that: If palloc_object(Int128AggState) were smart
enough to detect that alignof(T) > MAXALIGN and redirect to
palloc_aligned(sizeof(T), alignof(T), ...) at compile time, then
Int128AggState would naturally propagate the layout requirements of
its __int128 member, and we wouldn't need to do that weird stuff, and
it wouldn't be error-prone if usage of __int128 spreads to more
structs. That really only makes sense if we generalise
palloc_object() as a programming style and consider direct use of
palloc() to be a rarer low-level interface that triggers human
reviewers to think about alignment, I guess. I think you'd also want
a variant that can deal with structs ending in a flexible array
member, but that seems doable... palloc_flexible_object(T,
flexible_member, flexible_elements) or whatever. But I might also be
missing some parts of that puzzle, for example it wouldn't make sense
if __int128 is ever stored.

[1]: /messages/by-id/CA+hUKGLQUivg-NC7dHdbRAPmG0Hapg1gGnygM5KgDfDM2a_TMg@mail.gmail.com

Tom Lane

tgl@sss.pgh.pa.us

3 months ago

In reply to: Thomas Munro (#4)

Re: Consistently use palloc_object() and palloc_array()

Thomas Munro <thomas.munro@gmail.com> writes:

This isn't a fully-baked thought, just a thought that occurred to me
while looking into that: If palloc_object(Int128AggState) were smart
enough to detect that alignof(T) > MAXALIGN and redirect to
palloc_aligned(sizeof(T), alignof(T), ...) at compile time, then
Int128AggState would naturally propagate the layout requirements of
its __int128 member, and we wouldn't need to do that weird stuff, and
it wouldn't be error-prone if usage of __int128 spreads to more
structs. That really only makes sense if we generalise
palloc_object() as a programming style and consider direct use of
palloc() to be a rarer low-level interface that triggers human
reviewers to think about alignment, I guess.

Hmm ... I had the same doubts as Michael about whether this change
could possibly be worth the ensuing back-patching pain. But if
it leads to an improvement in type-safety, that'd be a reason to
take on the work.

regards, tom lane

Michael Paquier

michael@paquier.xyz

3 months ago

In reply to: Tom Lane (#5)

Re: Consistently use palloc_object() and palloc_array()

On Wed, Nov 26, 2025 at 10:25:12PM -0500, Tom Lane wrote:

Hmm ... I had the same doubts as Michael about whether this change
could possibly be worth the ensuing back-patching pain. But if
it leads to an improvement in type-safety, that'd be a reason to
take on the work.

Yeah, that sounds like a reason convincing enough to do the jump. I
didn't really feel strongly against the original proposal to begin
with as it improves the code style, FWIW.

The backpatching bits like these are always annoying, of course. Now,
these fancy palloc() routines exist since v16 so that would just mean
two years and two branches that would need special handling. Even
better, why not just backpatch into v15 and v14 the macros in palloc.h
and fe_memutils.h to avoid backpatch hazards due to some new code?
--
Michael

David Geier

geidav.pg@gmail.com

3 months ago

In reply to: Chao Li (#2)

Re: Consistently use palloc_object() and palloc_array()

Hi!

Thanks for taking a look.

On 27.11.2025 00:03, Chao Li wrote:

This is a large patch, I just take a quick look, and found that:
```
-		*phoned_word = palloc(sizeof(char) * strlen(word) + 1);
+		*phoned_word = palloc_array(char, strlen(word) + 1);
```
And
```
-		params = (const char **) palloc(sizeof(char *));
+		params = palloc_object(const char *);
```
Applying palloc_array and palloc_object to char type doesn’t seem to improve anything.

You mean because sizeof(char) is always 1 and hence we could instead
simply write:

*phoned_word = palloc(strlen(word) + 1);
params = palloc(1);

I think the _array and _object variants are more expressive and for sure
don't make the code less readable.

--
David Geier

Tom Lane

tgl@sss.pgh.pa.us

3 months ago

In reply to: David Geier (#7)

Re: Consistently use palloc_object() and palloc_array()

David Geier <geidav.pg@gmail.com> writes:

On 27.11.2025 00:03, Chao Li wrote:

This is a large patch, I just take a quick look, and found that:
-		*phoned_word = palloc(sizeof(char) * strlen(word) + 1);
+		*phoned_word = palloc_array(char, strlen(word) + 1);
And
-		params = (const char **) palloc(sizeof(char *));
+		params = palloc_object(const char *);
Applying palloc_array and palloc_object to char type doesn’t seem to improve anything.

You mean because sizeof(char) is always 1 and hence we could instead
simply write:
*phoned_word = palloc(strlen(word) + 1);
params = palloc(1);
I think the _array and _object variants are more expressive and for sure
don't make the code less readable.

Yeah, I agree these particular changes seem fine. When you're doing
address arithmetic for a memcpy or such, it may be fine to wire in an
assumption that sizeof(char) == 1, but I think doing that in other
contexts is not particularly good style.

Another thing to note is that the proposed patch effectively changes
the expression evaluation order:

-		*phoned_word = palloc((sizeof(char) * strlen(word)) + 1);
+		*phoned_word = palloc(sizeof(char) * (strlen(word) + 1));

Now, there's not actually any difference because sizeof(char) is 1,
but if it hypothetically weren't, the new version is likely more
correct. Presumably the +1 is meant to allow room for a trailing \0,
which is a char.

It'd be a good idea to review the patch to see if there are any
places where semantics are changed in a less benign fashion...

regards, tom lane

David Geier

geidav.pg@gmail.com

3 months ago

In reply to: Michael Paquier (#3)

Re: Consistently use palloc_object() and palloc_array()

Hi Michael!

On 27.11.2025 01:24, Michael Paquier wrote:

On Wed, Nov 26, 2025 at 11:09:31PM +0100, David Geier wrote:

I've changed all code to use the "new" palloc_object(), palloc_array(),
palloc0_object(), palloc0_array, repalloc_array() and repalloc0_array()
macros. This makes the code more readable and more consistent.

The patch is pretty big but potential merge conflicts should be easy to
resolve. If preferred, I can also further split up the patch, e.g.
directory by directory or high impact files first.

The backpatching extra-pain argument indeed comes into mind first when
it comes to such changes, but perhaps we should just bite the bullet
and encourage the new allocation styles across the tree, as you are
doing here. I'm not completely sure if it would make sense to split
things up, if we do I would do it on a subdirectory basis like to
suggest, perhaps, like contrib/, src/backend/executor/, etc. to
balance the blast damage. Did you use some kind of automation to find
all of these? If yes, what did you use?

I thought again about splitting up the patch. I'm not sure how useful
this really is. If a single committer takes this on, then it doesn't
really matter. He can apply the patch but then commit directory by
directory or in any other way he deems best. If we want to divide the
work among multiple committers splitting might be more useful.

Just tell me what you prefer and I'll provide the patch accordingly.

The patch is passing "meson test" and I've additionally wrote a script
that parses the patch file and verifies that every two corresponding +
and - lines match (e.g. palloc0() replaced by palloc0_array() or
palloc0_object(), the same for palloc() and repalloc(), additionally
some checks to make sure the conversion to the _array() variant is
correct).

It may be an idea to share that as well, so as your checks could be
replicated rather than partially re-guessed.

I've attached the script. You can run it via

python3 verify_palloc_pairs.py patch_file

Disclaimer: The script is a bit of a mess. It doesn't check repalloc()
and it still reports five conversions as erroneous while they're
actually correct. I checked them manually. I left it at that because the
vast majority of changes it processes correctly and all tests pass. The
repalloc() occurrences I also checked manually.

--
David Geier

#10

David Geier

geidav.pg@gmail.com

3 months ago

In reply to: Thomas Munro (#4)

Re: Consistently use palloc_object() and palloc_array()

Hi Thomas!

On 27.11.2025 03:53, Thomas Munro wrote:

On Thu, Nov 27, 2025 at 11:10 AM David Geier <geidav.pg@gmail.com> wrote:

I've changed all code to use the "new" palloc_object(), palloc_array(),
palloc0_object(), palloc0_array, repalloc_array() and repalloc0_array()
macros. This makes the code more readable and more consistent.

I wondered about this in the context of special alignment
requirements[1]. palloc() aligns to MAXALIGN, which we artificially
constrain for various reasons that we can't easily change (at least
not without splitting on-disk MAXALIGN from allocation MAXALIGN, and
if we do that we'll waste more memory). That's less than
alignof(max_align_t) on common systems, so then we have to do some
weird stuff to handle __int128 that doesn't fit too well into modern
<stdalign.h> thinking and also disables optimal codegen.

This isn't a fully-baked thought, just a thought that occurred to me
while looking into that: If palloc_object(Int128AggState) were smart
enough to detect that alignof(T) > MAXALIGN and redirect to
palloc_aligned(sizeof(T), alignof(T), ...) at compile time, then
Int128AggState would naturally propagate the layout requirements of
its __int128 member, and we wouldn't need to do that weird stuff, and
it wouldn't be error-prone if usage of __int128 spreads to more
structs. That really only makes sense if we generalise
palloc_object() as a programming style and consider direct use of
palloc() to be a rarer low-level interface that triggers human
reviewers to think about alignment, I guess. I think you'd also want
a variant that can deal with structs ending in a flexible array
member, but that seems doable... palloc_flexible_object(T,
flexible_member, flexible_elements) or whatever. But I might also be
missing some parts of that puzzle, for example it wouldn't make sense
if __int128 is ever stored.

[1] /messages/by-id/CA+hUKGLQUivg-NC7dHdbRAPmG0Hapg1gGnygM5KgDfDM2a_TMg@mail.gmail.com

These are some interesting ideas but I would consider them for now as
possible follow-up work, once this refactoring is merged.

--
David Geier

#11

David Geier

geidav.pg@gmail.com

3 months ago

In reply to: Tom Lane (#8)

Re: Consistently use palloc_object() and palloc_array()

On 28.11.2025 22:28, Tom Lane wrote:

David Geier <geidav.pg@gmail.com> writes:
On 27.11.2025 00:03, Chao Li wrote:
This is a large patch, I just take a quick look, and found that:
-		*phoned_word = palloc(sizeof(char) * strlen(word) + 1);
+		*phoned_word = palloc_array(char, strlen(word) + 1);
And
-		params = (const char **) palloc(sizeof(char *));
+		params = palloc_object(const char *);
Applying palloc_array and palloc_object to char type doesn’t seem to improve anything.
You mean because sizeof(char) is always 1 and hence we could instead
simply write:
*phoned_word = palloc(strlen(word) + 1);
params = palloc(1);
I think the _array and _object variants are more expressive and for sure
don't make the code less readable.

Yeah, I agree these particular changes seem fine. When you're doing
address arithmetic for a memcpy or such, it may be fine to wire in an
assumption that sizeof(char) == 1, but I think doing that in other
contexts is not particularly good style.

Another thing to note is that the proposed patch effectively changes
the expression evaluation order:
-		*phoned_word = palloc((sizeof(char) * strlen(word)) + 1);
+		*phoned_word = palloc(sizeof(char) * (strlen(word) + 1));
Now, there's not actually any difference because sizeof(char) is 1,
but if it hypothetically weren't, the new version is likely more
correct. Presumably the +1 is meant to allow room for a trailing \0,
which is a char.

Oh right. Good catch!

It'd be a good idea to review the patch to see if there are any
places where semantics are changed in a less benign fashion...

I intentionally tried to avoid any semantic changes but it's of course
possible something slipped through by accident.

It's some ~1700 occurrences in the patch. If it takes ~10 seconds to
check a single, it would take ~4.7h to review the entire patch.

If no one is willing to take this on, I could split up the patch in 4 to
5 that each can be reviewed by someone else.

--
David Geier

#12

Thomas Munro

thomas.munro@gmail.com

3 months ago

In reply to: David Geier (#11)

Re: Consistently use palloc_object() and palloc_array()

On Sat, Nov 29, 2025 at 10:47 AM David Geier <geidav.pg@gmail.com> wrote:

I intentionally tried to avoid any semantic changes but it's of course
possible something slipped through by accident.

Do you expect the generated code to be identical? Is it?

#13

David Geier

geidav.pg@gmail.com

3 months ago

In reply to: Thomas Munro (#12)

Re: Consistently use palloc_object() and palloc_array()

On 28.11.2025 23:31, Thomas Munro wrote:

On Sat, Nov 29, 2025 at 10:47 AM David Geier <geidav.pg@gmail.com> wrote:

I intentionally tried to avoid any semantic changes but it's of course
possible something slipped through by accident.

Do you expect the generated code to be identical? Is it?

In the majority of cases yes. However, there are a few cases where a
small change in the C code can yield to differences in the generated code:

I used the following bash script to create the disassembly of all object
files in the build directory. I ran this script twice, once on master
and once on the patched branch. The directory needs to be adapted
accordingly in the script.

find . -name "*.o" -print0 | while IFS= read -r -d '' file; do
mkdir -p ~/Desktop/master/"$(dirname "$file")"

if objdump -drwC -Mintel "$file" > ~/Desktop/master/"$file".dis
2>/dev/null; then
echo " ✓ Disassembled to ~/Desktop/master/$file.dis"
else
echo " ✗ Failed to disassemble $file"
fi
done

There are 54 files that show changes in the generated code. I didn't
look through all files but the changes are largely caused by the same
reasons:

1) If the refactoring introduces a change in the number of lines, the
__LINE__ macro used by elog.h will cause a change in the disassembly.
This is the reason for the majority of changes.

2) fuzzystrmatch.c: contains a functional change. We allocate 1 byte
more now but that is safe, see [1]/messages/by-id/524587.1764365323@sss.pgh.pa.us.

3) pg_buffercache_pages.c: Functional change. Type used in
palloc_array() changed from uint64 to int, as the rest of the code only
uses an integer.

4) trgm_op.c: Contains arithmetic expressions in the calls to
palloc_array() where now the brackets are placed differently, e.g.
sizeof(type) * a * b vs sizeof(type) * (a * b). That's the reason for
many differences.

So reviewing this patch can now be done by only going through all files
that have changes in the disassembly. This is only 54 out of which most
are because of changes in the number of LOC or where the brackets are
placed.

[1]: /messages/by-id/524587.1764365323@sss.pgh.pa.us

--
David Geier

#14

Michael Paquier

michael@paquier.xyz

3 months ago

In reply to: David Geier (#13)

Re: Consistently use palloc_object() and palloc_array()

On Tue, Dec 02, 2025 at 04:13:01PM +0100, David Geier wrote:

So reviewing this patch can now be done by only going through all files
that have changes in the disassembly. This is only 54 out of which most
are because of changes in the number of LOC or where the brackets are
placed.

It may be a good idea to split the patch into two parts, at least:
- One for the bulk of the changes, for the straight-forward changes.
Most of what you are suggesting are that for palloc_object and
palloc_array which are dropped-in replacements. Checking that these
assemble the same before and after offers one extra layer of
confidence.
- Second one for the more dubious changes.

It does not change that all these need to be looked with human eyes.
For the first one, splitting things based on the code area is simpler
With more than 1.7k places changed, splitting by area and checking
them individually would be the best course, at least for me when it
comes to such mechanical changes. It comes down with dealing with
individual doses that are not so large that they cause one's head to
spin in the middle of checking the diffs (did that a few times in the
past for this code tree, splitting and dose balance helps a lot).

I cannot say for the others, but I find the type-safety argument
mentioned upthread good enough to do a switch and encourage more the
new style moving forward.
--
Michael

#15

Tom Lane

tgl@sss.pgh.pa.us

3 months ago

In reply to: Michael Paquier (#14)

Re: Consistently use palloc_object() and palloc_array()

Michael Paquier <michael@paquier.xyz> writes:

It may be a good idea to split the patch into two parts, at least:
- One for the bulk of the changes, for the straight-forward changes.
Most of what you are suggesting are that for palloc_object and
palloc_array which are dropped-in replacements. Checking that these
assemble the same before and after offers one extra layer of
confidence.
- Second one for the more dubious changes.

Yeah, I was thinking the same. Some of those might perhaps be bugs
that we want to back-patch, so they need to be looked at in a
different way.

regards, tom lane

#16

Peter Eisentraut

peter_e@gmx.net

3 months ago

In reply to: Thomas Munro (#4)

Re: Consistently use palloc_object() and palloc_array()

On 27.11.25 03:53, Thomas Munro wrote:

I wondered about this in the context of special alignment
requirements[1]. palloc() aligns to MAXALIGN, which we artificially
constrain for various reasons that we can't easily change (at least
not without splitting on-disk MAXALIGN from allocation MAXALIGN, and
if we do that we'll waste more memory). That's less than
alignof(max_align_t) on common systems, so then we have to do some
weird stuff to handle __int128 that doesn't fit too well into modern
<stdalign.h> thinking and also disables optimal codegen.

On macOS ARM, I have MAXALIGN == alignof(max_align_t) == 8, but
alignof(__int128) == 16. (macOS Intel has 16/16.) Also, as a
consequence of that, the result of malloc() is not guaranteed to be
aligned sufficiently for __int128 (need to use aligned_alloc()). So it
seems to me that the current behavior of palloc() is pretty consistent
with that.

#17

David Geier

geidav.pg@gmail.com

3 months ago

In reply to: Tom Lane (#15)

Re: Consistently use palloc_object() and palloc_array()

On 03.12.2025 01:40, Tom Lane wrote:

Michael Paquier <michael@paquier.xyz> writes:

It may be a good idea to split the patch into two parts, at least:
- One for the bulk of the changes, for the straight-forward changes.
Most of what you are suggesting are that for palloc_object and
palloc_array which are dropped-in replacements. Checking that these
assemble the same before and after offers one extra layer of
confidence.
- Second one for the more dubious changes.

Yeah, I was thinking the same. Some of those might perhaps be bugs
that we want to back-patch, so they need to be looked at in a
different way.

regards, tom lane

Attached are the two patches, rebased on latest master.

The first one contains all changes that either result in no changes to
the disassembly, or only in changes due to the line counts; because
elog.h makes use of __LINE__.

The second patch contains the remaining changes, which is only 11 files.

contrib/fuzzystrmatch/fuzzystrmatch.c: semantic changes, e.g.
palloc(sizeof(char) * strlen(word) + 1);
=>
palloc_array(char, strlen(word) + 1);

contrib/pg_buffercache/pg_buffercache_pages.c: semantic change
palloc(sizeof(uint64) * os_page_count);
=>
palloc_array(int, os_page_count);

src/test/modules/test_rbtree/test_rbtree.c: different way to get pointer
return (RBTNode *) palloc(sizeof(IntRBTreeNode));
=>
return &palloc_object(IntRBTreeNode)->rbtnode;

contrib/hstore/hstore_gin.c
contrib/hstore/hstore_op.c
contrib/pg_trgm/trgm_op.c
contrib/postgres_fdw/postgres_fdw.c
src/backend/executor/execPartition.c
src/backend/partitioning/partprune.c
src/backend/statistics/mvdistinct.c
src/backend/storage/buffer/bufmgr.c
All files contain arithmetic expressions for the count argument of
palloc[0]_array(). The parentheses can change the order in which the
arguments are evaluated which changes the generated code.

--
David Geier

#18

Michael Paquier

michael@paquier.xyz

3 months ago

In reply to: David Geier (#17)

Re: Consistently use palloc_object() and palloc_array()

On Thu, Dec 04, 2025 at 10:31:41AM +0100, David Geier wrote:

Attached are the two patches, rebased on latest master.

The first one contains all changes that either result in no changes to
the disassembly, or only in changes due to the line counts; because
elog.h makes use of __LINE__.

Thanks. That's a lot to digest.

I have generated some reports by comparing builds of HEAD and
HEAD+0001 to indeed note that a lot of noise is caused by the line
number changes. And then, I have begun looking at contrib/ to start
with something as I had a bit of time today. One part which we need
to be careful about is that the allocations map with the actual
declarations, and I'm doubting my compiler to detect all of them, so
visual check for each path it is...

-   void       *out = palloc(sizeof(float4KEY));
+   void       *out = palloc_object(float4KEY);

btree_gist/ has a bunch of these, which I guess don't really matter in
terms of type safety.

-   sv = palloc(sizeof(bytea *) * (maxoff + 1));
+   sv = palloc_array(bytea *, (maxoff + 1));

This one also in gbt_var_picksplit@btree_utils_var.c. We have a
GBT_VARKEY.

+   keys = (char **) palloc_array(char *, key_count);
+   values = (char **) palloc_array(char *, val_count);

These two should not need casts.

-               stack = palloc(sizeof(*stack));
+               stack = palloc_object(sepgsql_proc_stack);

This one in sepgsql was wrong, that can be detected with a compilation
of the module.

And done with the contrib/ part for the trival changes.
--
Michael

#19

David Geier

geidav.pg@gmail.com

3 months ago

In reply to: Michael Paquier (#18)

Re: Consistently use palloc_object() and palloc_array()

On 05.12.2025 08:41, Michael Paquier wrote:

On Thu, Dec 04, 2025 at 10:31:41AM +0100, David Geier wrote:

Attached are the two patches, rebased on latest master.

The first one contains all changes that either result in no changes to
the disassembly, or only in changes due to the line counts; because
elog.h makes use of __LINE__.

Thanks. That's a lot to digest.

I have generated some reports by comparing builds of HEAD and
HEAD+0001 to indeed note that a lot of noise is caused by the line
number changes. And then, I have begun looking at contrib/ to start
with something as I had a bit of time today. One part which we need
to be careful about is that the allocations map with the actual
declarations, and I'm doubting my compiler to detect all of them, so
visual check for each path it is...
-   void       *out = palloc(sizeof(float4KEY));
+   void       *out = palloc_object(float4KEY);
btree_gist/ has a bunch of these, which I guess don't really matter in
terms of type safety.
-   sv = palloc(sizeof(bytea *) * (maxoff + 1));
+   sv = palloc_array(bytea *, (maxoff + 1));
This one also in gbt_var_picksplit@btree_utils_var.c. We have a
GBT_VARKEY.
+   keys = (char **) palloc_array(char *, key_count);
+   values = (char **) palloc_array(char *, val_count);
These two should not need casts.
-               stack = palloc(sizeof(*stack));
+               stack = palloc_object(sepgsql_proc_stack);
This one in sepgsql was wrong, that can be detected with a compilation
of the module.

My bad. I hadn't realized that - obviously - not necessarily all code is
actually compiled by default.

Will the build system enable any target for which all dependencies (e.g.
libraries) are met, or are there targets that need to be enabled
explicitly? Is there a way to enable all of them so I can easily make
sure, I actually compile all code?

And done with the contrib/ part for the trival changes.

Thanks for applying the first batch of changes.

--
David Geier

#20

Tom Lane

tgl@sss.pgh.pa.us

3 months ago

In reply to: David Geier (#19)

Re: Consistently use palloc_object() and palloc_array()

David Geier <geidav.pg@gmail.com> writes:

My bad. I hadn't realized that - obviously - not necessarily all code is
actually compiled by default.

Will the build system enable any target for which all dependencies (e.g.
libraries) are met, or are there targets that need to be enabled
explicitly? Is there a way to enable all of them so I can easily make
sure, I actually compile all code?

There's a good deal of platform-specific code in PG, particularly for
Windows, and obviously none of that will be built unless you're on the
right platform. (You can reach some of the Windows-specific code on
other systems with -DEXEC_BACKEND, but that goes only so far.)

As for optional-feature stuff, I think the meson build system will by
default build every option it can find the supporting libraries for.
But the other side of that coin is that if you didn't install the
right packages it will silently not build it.

I think the most reliable way is to look at "./configure --help"
and select all the options you think should work on your platform.
Then, if you forgot to install libfoo-devel or whatever, you'll
get configure failures.

regards, tom lane

#21

Michael Paquier

michael@paquier.xyz

3 months ago

In reply to: Michael Paquier (#18)

#22

Thomas Munro

thomas.munro@gmail.com

3 months ago

In reply to: Michael Paquier (#21)

#23