Remove useless casting to the same type

Started by Bertrand Drouvot5 months ago16 messageshackers
Jump to latest
#1Bertrand Drouvot
bertranddrouvot.pg@gmail.com

Hi hackers,

Attached is a patch to $SUBJECT.

This is the same kind of idea as 7f798aca1d5 and ef8fe693606, as their presence
could cause risks of hiding actual type mismatches in the future or silently
discarding qualifiers. I think that it also improves readability.

Those have been found with a coccinelle script as:

"
@@
type T;
T E;
@@

- (T)
E
"

which removes casts when it's casting to the same type.

Note that it generated more that what is in the attached. I chose not to remove
some of them (mainly arithmetic ones) to keep the patch focused on what matters
here.

Thoughts?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Remove-useless-casting-to-same-type.patchtext/x-diff; charset=us-asciiDownload+80-81
#2Nathan Bossart
nathandbossart@gmail.com
In reply to: Bertrand Drouvot (#1)
Re: Remove useless casting to the same type

On Mon, Nov 24, 2025 at 10:26:32AM +0000, Bertrand Drouvot wrote:

This is the same kind of idea as 7f798aca1d5 and ef8fe693606, as their presence
could cause risks of hiding actual type mismatches in the future or silently
discarding qualifiers. I think that it also improves readability.

Seems reasonable to me.

Note that it generated more that what is in the attached. I chose not to remove
some of them (mainly arithmetic ones) to keep the patch focused on what matters
here.

Can you give an example of what you are talking about here?

--
nathan

#3Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Bertrand Drouvot (#1)
Re: Remove useless casting to the same type

On Mon, Nov 24, 2025 at 2:26 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Attached is a patch to $SUBJECT.

--- a/src/backend/access/gin/gindatapage.c
+++ b/src/backend/access/gin/gindatapage.c
@@ -607,11 +607,11 @@ dataBeginPlaceToPageLeaf(GinBtree btree, Buffer buf, GinBtreeStack *stack,
if (append)
elog(DEBUG2, "appended %d new items to block %u; %d bytes (%d to go)",
- maxitems, BufferGetBlockNumber(buf), (int) leaf->lsize,
+ maxitems, BufferGetBlockNumber(buf), leaf->lsize,
items->nitem - items->curitem - maxitems);

Hm. How do we feel, as a group, about superstitious casts in variadic
calls? I don't feel like they're in the same class as the other fixes.

Argument for: it's nice to know at a glance that a printf() invocation
won't corrupt something horribly, especially if I'm quickly scanning
code during a CVE analysis, and especially if the variable is named as
if it could maybe be a size_t. Do our compilers warn us well enough
now, in practice?

Argument against: it takes up precious columns and focuses attention
away from other things. Like the fact that (items->nitem -
items->curitem - maxitems) is unsigned and printed as signed here. :D
Maybe we should make the code compile cleanly under
-Wformat-signedness at some point... (not in this patch, not signing
you up for that)

@@ -389,7 +389,7 @@ hash_xlog_split_allocate_page(XLogReaderState *record)

/* extract low and high masks. */
memcpy(&lowmask, data, sizeof(uint32));
- highmask = (uint32 *) ((char *) data + sizeof(uint32));
+ highmask = (uint32 *) (data + sizeof(uint32));

I wonder about these, too. I like knowing what the code does without
having to check the type of `data`. But then later on we do a `data +=
sizeof(uint32) * 2`, so you have to check the type anyway, so... I
don't know.

+++ b/contrib/fuzzystrmatch/dmetaphone.c
@@ -327,7 +327,7 @@ GetAt(metastring *s, int pos)
if ((pos < 0) || (pos >= s->length))
return '\0';
- return ((char) *(s->str + pos));
+ return *(s->str + pos);

Isn't this just s->str[pos]? Ditto for SetAt(), right afterwards.

--Jacob

#4Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Nathan Bossart (#2)
Re: Remove useless casting to the same type

Hi,

On Mon, Nov 24, 2025 at 12:43:28PM -0600, Nathan Bossart wrote:

On Mon, Nov 24, 2025 at 10:26:32AM +0000, Bertrand Drouvot wrote:

This is the same kind of idea as 7f798aca1d5 and ef8fe693606, as their presence
could cause risks of hiding actual type mismatches in the future or silently
discarding qualifiers. I think that it also improves readability.

Seems reasonable to me.

Thanks for looking at it!

Note that it generated more that what is in the attached. I chose not to remove
some of them (mainly arithmetic ones) to keep the patch focused on what matters
here.

Can you give an example of what you are talking about here?

Things like:

A)

- int   k = (int) (targrows * sampler_random_fract(&rstate.randstate));
+ int   k = (targrows * sampler_random_fract(&rstate.randstate));

That's a valid cast removal but I'm not sure about the removal added value here.

B)
- sign = (BITVECP) (((char *) sign) + 1);
+ sign = ((sign) + 1);

BITVECP is "typedef char *BITVECP; So that's a valid cast removal but I decided
to keep it for semantic reason.

Same as:

@@ -2277,7 +2277,7 @@ printTrgmColor(StringInfo buf, TrgmColor co)
        else if (co == COLOR_BLANK)
                appendStringInfoChar(buf, 'b');
        else
-               appendStringInfo(buf, "%d", (int) co);
+               appendStringInfo(buf, "%d", co);

where TrgmColor is "typedef int TrgmColor;"

C)

-  if (((unsigned char *) base)[i] != 0xff)
+  if ((base)[i] != 0xff)

because not safe.

See attached the full list that I decided not to include.
Do you think we should add some of them in the patch? (maybe the ones in
nbtcompare.c for example)

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

cast_to_keep.txttext/plain; charset=us-asciiDownload+68-68
#5Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Jacob Champion (#3)
Re: Remove useless casting to the same type

Hi,

On Mon, Nov 24, 2025 at 03:18:00PM -0800, Jacob Champion wrote:

On Mon, Nov 24, 2025 at 2:26 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Attached is a patch to $SUBJECT.

--- a/src/backend/access/gin/gindatapage.c
+++ b/src/backend/access/gin/gindatapage.c
@@ -607,11 +607,11 @@ dataBeginPlaceToPageLeaf(GinBtree btree, Buffer buf, GinBtreeStack *stack,
if (append)
elog(DEBUG2, "appended %d new items to block %u; %d bytes (%d to go)",
- maxitems, BufferGetBlockNumber(buf), (int) leaf->lsize,
+ maxitems, BufferGetBlockNumber(buf), leaf->lsize,
items->nitem - items->curitem - maxitems);

Hm. How do we feel, as a group, about superstitious casts in variadic
calls? I don't feel like they're in the same class as the other fixes.

Argument for: it's nice to know at a glance that a printf() invocation
won't corrupt something horribly, especially if I'm quickly scanning
code during a CVE analysis, and especially if the variable is named as
if it could maybe be a size_t. Do our compilers warn us well enough
now, in practice?

Argument against: it takes up precious columns and focuses attention
away from other things.

Thanks for looking at it!

I think that the variadic calls in the patch are related to functions that
can benefits from -Wformat. Let's focus on those: with the cast one would need
to verify 3 things: variable type, cast and format specifier.
Without the cast then only 2 things and the compiler can verify these match via
-Wformat warnings.

With the cast, the compiler only checks that the cast result matches the format,
not whether the cast itself is correct, so I'm in favor of removing the cast,
thoughts?

Like the fact that (items->nitem -
items->curitem - maxitems) is unsigned and printed as signed here. :D

Nice catch! ;-)

Maybe we should make the code compile cleanly under
-Wformat-signedness at some point...

good idea, will give it a try later on outside the context of this patch.

@@ -389,7 +389,7 @@ hash_xlog_split_allocate_page(XLogReaderState *record)

/* extract low and high masks. */
memcpy(&lowmask, data, sizeof(uint32));
- highmask = (uint32 *) ((char *) data + sizeof(uint32));
+ highmask = (uint32 *) (data + sizeof(uint32));

I wonder about these, too. I like knowing what the code does without
having to check the type of `data`. But then later on we do a `data +=
sizeof(uint32) * 2`, so you have to check the type anyway, so... I
don't know.

I think that even with the cast in place, it's good to check the type of data.
Not for the line that follows (i.e: "data += sizeof(uint32) * 2") but to check
that the cast makes sense and does not hide "wrong" pointer manipulation.

So I think that with or without the cast one would need to check. But that feels
more natural to check when there is no cast (as we don't assume that someone
said "I know what I'm doing"). So I'm in favor of removing the cast, thoughts?

+++ b/contrib/fuzzystrmatch/dmetaphone.c
@@ -327,7 +327,7 @@ GetAt(metastring *s, int pos)
if ((pos < 0) || (pos >= s->length))
return '\0';
- return ((char) *(s->str + pos));
+ return *(s->str + pos);

Isn't this just s->str[pos]? Ditto for SetAt(), right afterwards.

Yeah, "*(s->str + pos)" is already used in SetAt() and also in IsVowel(). Instead
of changing those 3, I'd prefer to keep the current change and keep the patch
focus on its intend. We could change those in a dedicated patch afterward if we
feel the need.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#6David Geier
geidav.pg@gmail.com
In reply to: Bertrand Drouvot (#5)
Re: Remove useless casting to the same type

On 25.11.2025 06:46, Bertrand Drouvot wrote:

I think that the variadic calls in the patch are related to functions that
can benefits from -Wformat. Let's focus on those: with the cast one would need
to verify 3 things: variable type, cast and format specifier.
Without the cast then only 2 things and the compiler can verify these match via
-Wformat warnings.

With the cast, the compiler only checks that the cast result matches the format,
not whether the cast itself is correct, so I'm in favor of removing the cast,
thoughts?

I agree. It's better if we only have the casts in places where we
actually want to change the type before printing.

Another benefit is that one can directly deduct the type from the log /
print statement by looking at the format specifier.

--
David Geier

#7Peter Eisentraut
peter_e@gmx.net
In reply to: Bertrand Drouvot (#5)
Re: Remove useless casting to the same type

On 25.11.25 06:46, Bertrand Drouvot wrote:

@@ -389,7 +389,7 @@ hash_xlog_split_allocate_page(XLogReaderState *record)

/* extract low and high masks. */
memcpy(&lowmask, data, sizeof(uint32));
- highmask = (uint32 *) ((char *) data + sizeof(uint32));
+ highmask = (uint32 *) (data + sizeof(uint32));

I wonder about these, too. I like knowing what the code does without
having to check the type of `data`. But then later on we do a `data +=
sizeof(uint32) * 2`, so you have to check the type anyway, so... I
don't know.

I think that even with the cast in place, it's good to check the type of data.
Not for the line that follows (i.e: "data += sizeof(uint32) * 2") but to check
that the cast makes sense and does not hide "wrong" pointer manipulation.

So I think that with or without the cast one would need to check. But that feels
more natural to check when there is no cast (as we don't assume that someone
said "I know what I'm doing"). So I'm in favor of removing the cast, thoughts?

I think this whole thing could be simplified by overlaying a uint32 over
"data" and just accessing the array fields normally. See attached patch.

Attachments:

0001-Simplify-hash_xlog_split_allocate_page.patch.nocfbottext/plain; charset=UTF-8; name=0001-Simplify-hash_xlog_split_allocate_page.patch.nocfbotDownload+10-19
#8Peter Eisentraut
peter_e@gmx.net
In reply to: Bertrand Drouvot (#5)
Re: Remove useless casting to the same type

On 25.11.25 06:46, Bertrand Drouvot wrote:

Hm. How do we feel, as a group, about superstitious casts in variadic
calls? I don't feel like they're in the same class as the other fixes.

Argument for: it's nice to know at a glance that a printf() invocation
won't corrupt something horribly, especially if I'm quickly scanning
code during a CVE analysis, and especially if the variable is named as
if it could maybe be a size_t. Do our compilers warn us well enough
now, in practice?

Argument against: it takes up precious columns and focuses attention
away from other things.

Thanks for looking at it!

I think that the variadic calls in the patch are related to functions that
can benefits from -Wformat. Let's focus on those: with the cast one would need
to verify 3 things: variable type, cast and format specifier.
Without the cast then only 2 things and the compiler can verify these match via
-Wformat warnings.

With the cast, the compiler only checks that the cast result matches the format,
not whether the cast itself is correct, so I'm in favor of removing the cast,
thoughts?

Yes, I think with -Wformat you get approximately the same level of
protection as with a prototype.

Like the fact that (items->nitem -
items->curitem - maxitems) is unsigned and printed as signed here. :D

Nice catch! ;-)

Maybe we should make the code compile cleanly under
-Wformat-signedness at some point...

That would be similar to using -Wsign-conversion with function
prototypes. Maybe a good idea, but we don't use it, and so we shouldn't
expect it for non-prototype invocations either.

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Bertrand Drouvot (#5)
Re: Remove useless casting to the same type

On 25.11.25 06:46, Bertrand Drouvot wrote:

Maybe we should make the code compile cleanly under
-Wformat-signedness at some point...

good idea, will give it a try later on outside the context of this patch.

I test this once in a while and fix the issues that I find. But it's
very picky and you will find difficult to fix problems like the fact
that the signedness of enums is implementation-defined, and so the only
portable fix there would be to add more casts.

I think it could be useful to tighten the source code with respect to
implicit integer conversions, using warnings such as -Wsign-conversion
and -Wconversion as well as -Wformat-signedness. There are surely
hidden overflow or truncation issues similar to CVE-2025-12818 hidden
somewhere. But explicit casts defeat those warnings, so removing
unnecessary casts is a good step on the way there.

#10Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Peter Eisentraut (#7)
Re: Remove useless casting to the same type

Hi,

On Fri, Nov 28, 2025 at 09:11:16AM +0100, Peter Eisentraut wrote:

On 25.11.25 06:46, Bertrand Drouvot wrote:

@@ -389,7 +389,7 @@ hash_xlog_split_allocate_page(XLogReaderState *record)

/* extract low and high masks. */
memcpy(&lowmask, data, sizeof(uint32));
- highmask = (uint32 *) ((char *) data + sizeof(uint32));
+ highmask = (uint32 *) (data + sizeof(uint32));

I wonder about these, too. I like knowing what the code does without
having to check the type of `data`. But then later on we do a `data +=
sizeof(uint32) * 2`, so you have to check the type anyway, so... I
don't know.

I think that even with the cast in place, it's good to check the type of data.
Not for the line that follows (i.e: "data += sizeof(uint32) * 2") but to check
that the cast makes sense and does not hide "wrong" pointer manipulation.

So I think that with or without the cast one would need to check. But that feels
more natural to check when there is no cast (as we don't assume that someone
said "I know what I'm doing"). So I'm in favor of removing the cast, thoughts?

I think this whole thing could be simplified by overlaying a uint32 over
"data" and just accessing the array fields normally. See attached patch.

Indeed, that's a nice simplification.

- data += sizeof(uint32) * 2;

Is it safe? I mean could XLH_SPLIT_META_UPDATE_MASKS and XLH_SPLIT_META_UPDATE_SPLITPOINT
be set simultaneously?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#11Peter Eisentraut
peter_e@gmx.net
In reply to: Bertrand Drouvot (#10)
Re: Remove useless casting to the same type

On 28.11.25 10:06, Bertrand Drouvot wrote:

Hi,

On Fri, Nov 28, 2025 at 09:11:16AM +0100, Peter Eisentraut wrote:

On 25.11.25 06:46, Bertrand Drouvot wrote:

@@ -389,7 +389,7 @@ hash_xlog_split_allocate_page(XLogReaderState *record)

/* extract low and high masks. */
memcpy(&lowmask, data, sizeof(uint32));
- highmask = (uint32 *) ((char *) data + sizeof(uint32));
+ highmask = (uint32 *) (data + sizeof(uint32));

I wonder about these, too. I like knowing what the code does without
having to check the type of `data`. But then later on we do a `data +=
sizeof(uint32) * 2`, so you have to check the type anyway, so... I
don't know.

I think that even with the cast in place, it's good to check the type of data.
Not for the line that follows (i.e: "data += sizeof(uint32) * 2") but to check
that the cast makes sense and does not hide "wrong" pointer manipulation.

So I think that with or without the cast one would need to check. But that feels
more natural to check when there is no cast (as we don't assume that someone
said "I know what I'm doing"). So I'm in favor of removing the cast, thoughts?

I think this whole thing could be simplified by overlaying a uint32 over
"data" and just accessing the array fields normally. See attached patch.

Indeed, that's a nice simplification.

- data += sizeof(uint32) * 2;

Is it safe? I mean could XLH_SPLIT_META_UPDATE_MASKS and XLH_SPLIT_META_UPDATE_SPLITPOINT
be set simultaneously?

Yes, that's what was probably intended. But apparently not exercised in
the tests.

So maybe more like this patch.

Attachments:

v2-0001-Simplify-hash_xlog_split_allocate_page.patch.nocfbottext/plain; charset=UTF-8; name=v2-0001-Simplify-hash_xlog_split_allocate_page.patch.nocfbotDownload+12-19
#12Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Peter Eisentraut (#11)
Re: Remove useless casting to the same type

Hi,

On Fri, Nov 28, 2025 at 02:20:25PM +0100, Peter Eisentraut wrote:

On 28.11.25 10:06, Bertrand Drouvot wrote:

On Fri, Nov 28, 2025 at 09:11:16AM +0100, Peter Eisentraut wrote:

I think this whole thing could be simplified by overlaying a uint32 over
"data" and just accessing the array fields normally. See attached patch.

Indeed, that's a nice simplification.

- data += sizeof(uint32) * 2;

Is it safe? I mean could XLH_SPLIT_META_UPDATE_MASKS and XLH_SPLIT_META_UPDATE_SPLITPOINT
be set simultaneously?

Yes, that's what was probably intended. But apparently not exercised in the
tests.

So maybe more like this patch.

+   uint32          lowmask = uidata[uidatacount++];
+   uint32          highmask = uidata[uidatacount++];

good idea! That way that's easier to add more branches/flags later if we need
to.

Also, I think that's safe thanks to XLogRecGetBlockData() returning a
MAXALIGNed buffer. Not sure if it's worth to add a comment. I think that a
comment was not needed with the original code as it was using memcpy() instead.

Except for the nit comment remark above, LGTM.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#13Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Bertrand Drouvot (#5)
Re: Remove useless casting to the same type

On Mon, Nov 24, 2025 at 9:46 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

I think that even with the cast in place, it's good to check the type of data.
Not for the line that follows (i.e: "data += sizeof(uint32) * 2") but to check
that the cast makes sense and does not hide "wrong" pointer manipulation.

From a specifier standpoint ("accidentally got rid of const"), yes.
From a pointer math standpoint, `(char *) ptr + offset` has a specific
meaning to me, and I was arguing that hiding that meaning isn't
necessarily useful in all cases.

But! Peter's followup makes this moot. Which is good, and it might
serve as its own blueprint for when we next see a situation like this.

Yeah, "*(s->str + pos)" is already used in SetAt() and also in IsVowel(). Instead
of changing those 3, I'd prefer to keep the current change and keep the patch
focus on its intend. We could change those in a dedicated patch afterward if we
feel the need.

Fine by me. Thanks for this cleanup work!

--Jacob

#14Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Peter Eisentraut (#9)
Re: Remove useless casting to the same type

On Fri, Nov 28, 2025 at 12:41 AM Peter Eisentraut <peter@eisentraut.org> wrote:

I test this once in a while and fix the issues that I find. But it's
very picky and you will find difficult to fix problems like the fact
that the signedness of enums is implementation-defined, and so the only
portable fix there would be to add more casts.

Once you've gotten into that situation, don't you have potential
sign-extension issues during int promotion, which require casts
anyway? It might be nice to fix those up regardless (he said, from his
armchair). But that doesn't seem as pressing as the other potential
problems.

I think it could be useful to tighten the source code with respect to
implicit integer conversions, using warnings such as -Wsign-conversion
and -Wconversion as well as -Wformat-signedness. There are surely
hidden overflow or truncation issues similar to CVE-2025-12818 hidden
somewhere. But explicit casts defeat those warnings, so removing
unnecessary casts is a good step on the way there.

+1. (And your v2 looks good to me.)

--Jacob

#15Peter Eisentraut
peter_e@gmx.net
In reply to: Bertrand Drouvot (#1)
Re: Remove useless casting to the same type

On 24.11.25 11:26, Bertrand Drouvot wrote:

Hi hackers,

Attached is a patch to $SUBJECT.

This is the same kind of idea as 7f798aca1d5 and ef8fe693606, as their presence
could cause risks of hiding actual type mismatches in the future or silently
discarding qualifiers. I think that it also improves readability.

Those have been found with a coccinelle script as:

"
@@
type T;
T E;
@@

- (T)
E
"

which removes casts when it's casting to the same type.

Note that it generated more that what is in the attached. I chose not to remove
some of them (mainly arithmetic ones) to keep the patch focused on what matters
here.

Committed. I made some cosmetic adjustments, to remove line breaks that
are no longer necessary with the shortened lines, and in gbt_num_union()
I kept &out[0] (instead of just out) for consistency with the following
line.

#16Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Jacob Champion (#13)
Re: Remove useless casting to the same type

Hi,

On Mon, Dec 01, 2025 at 09:50:43AM -0800, Jacob Champion wrote:

Fine by me.

Thanks for the review!

Thanks for this cleanup work!

That's fun and useful to do!

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com