backend *.c #include cleanup (IWYU)
I played with include-what-you-use (IWYU), "a tool for use with clang to
analyze #includes in C and C++ source files".[0]https://include-what-you-use.org/ I came across this via
clangd (the language server), because clangd (via the editor) kept
suggesting a bunch of #includes to remove. And I suppose it was right.
So as a test, I ran IWYU over the backend *.c files and removed all the
#includes it suggested. (Note that IWYU also suggests to *add* a bunch
of #includes, in fact that is its main purpose; I didn't do this here.)
In some cases, a more specific #include replaces another less specific
one. (To keep the patch from exploding in size, I ignored for now all
the suggestions to replace catalog/pg_somecatalog.h with
catalog/pg_somecatalog_d.h.) This ended up with the attached patch,
which has
432 files changed, 233 insertions(+), 1023 deletions(-)
I tested this with various compilation options (assert, WAL_DEBUG,
LOCK_DEBUG, different geqo variants, etc.) to make sure a header wasn't
just used for some conditional code section. Also, again, this patch
touches just *.c files, so nothing declared from header files changes in
hidden ways.
Also, as a small example, in src/backend/access/transam/rmgr.c you'll
see some IWYU pragma annotations to handle a special case there.
The purpose of this patch is twofold: One, it's of course a nice
cleanup. Two, this is a test how well IWYU might work for us. If we
find either by human interpretation that a change doesn't make sense, or
something breaks on some platform, then that would be useful feedback
(perhaps to addressed by more pragma annotations or more test coverage).
(Interestingly, IWYU has been mentioned in src/tools/pginclude/README
since 2012. Has anyone else played with it? Was it not mature enough
back then?)
Attachments:
0001-Remove-unused-include-s-from-backend-.c-files.patchtext/plain; charset=UTF-8; name=0001-Remove-unused-include-s-from-backend-.c-files.patchDownload+233-1024
On Sat, Feb 10, 2024 at 08:40:43AM +0100, Peter Eisentraut wrote:
The purpose of this patch is twofold: One, it's of course a nice cleanup.
Two, this is a test how well IWYU might work for us. If we find either by
human interpretation that a change doesn't make sense, or something breaks
on some platform, then that would be useful feedback (perhaps to addressed
by more pragma annotations or more test coverage).(Interestingly, IWYU has been mentioned in src/tools/pginclude/README since
2012. Has anyone else played with it? Was it not mature enough back then?)
I haven't played with it at all, but the topic reminds me of this thread:
/messages/by-id/flat/CALDaNm1B9naPDTm3ox1m_yZvOm3KA5S4kZQSWWAeLHAQ=3gV1Q@mail.gmail.com
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On 10.02.24 21:13, Nathan Bossart wrote:
(Interestingly, IWYU has been mentioned in src/tools/pginclude/README since
2012. Has anyone else played with it? Was it not mature enough back then?)I haven't played with it at all, but the topic reminds me of this thread:
/messages/by-id/flat/CALDaNm1B9naPDTm3ox1m_yZvOm3KA5S4kZQSWWAeLHAQ=3gV1Q@mail.gmail.com
Approaches like that as well as the in-tree pgrminclude work by "I
removed the #include and it still compiled fine", which can be
unreliable. IWYU on the other hand has the compiler tracking where a
symbol actually came from, and so if it says that an #include is not
used, then it's pretty much correct by definition.
On Mon, Feb 12, 2024 at 05:08:40PM +0100, Peter Eisentraut wrote:
On 10.02.24 21:13, Nathan Bossart wrote:
I haven't played with it at all, but the topic reminds me of this thread:
/messages/by-id/flat/CALDaNm1B9naPDTm3ox1m_yZvOm3KA5S4kZQSWWAeLHAQ=3gV1Q@mail.gmail.com
Approaches like that as well as the in-tree pgrminclude work by "I removed
the #include and it still compiled fine", which can be unreliable. IWYU on
the other hand has the compiler tracking where a symbol actually came from,
and so if it says that an #include is not used, then it's pretty much
correct by definition.
Okay. FWIW I tend to agree that this is nice cleanup. I'd personally run
this before every commit on HEAD if there was an easy way to do so and it
didn't cause changes to a bunch of unrelated files, but I understand that
the pgindent stuff in the buildfarm isn't super popular. I'd even like to
have a tool that automatically adjusted #include ordering to match project
policy, assuming one does not already exist.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Sat Feb 10, 2024 at 1:40 AM CST, Peter Eisentraut wrote:
I played with include-what-you-use (IWYU), "a tool for use with clang to
analyze #includes in C and C++ source files".[0] I came across this via
clangd (the language server), because clangd (via the editor) kept
suggesting a bunch of #includes to remove. And I suppose it was right.So as a test, I ran IWYU over the backend *.c files and removed all the
#includes it suggested. (Note that IWYU also suggests to *add* a bunch
of #includes, in fact that is its main purpose; I didn't do this here.)
In some cases, a more specific #include replaces another less specific
one. (To keep the patch from exploding in size, I ignored for now all
the suggestions to replace catalog/pg_somecatalog.h with
catalog/pg_somecatalog_d.h.) This ended up with the attached patch,
which has432 files changed, 233 insertions(+), 1023 deletions(-)
I tested this with various compilation options (assert, WAL_DEBUG,
LOCK_DEBUG, different geqo variants, etc.) to make sure a header wasn't
just used for some conditional code section. Also, again, this patch
touches just *.c files, so nothing declared from header files changes in
hidden ways.Also, as a small example, in src/backend/access/transam/rmgr.c you'll
see some IWYU pragma annotations to handle a special case there.The purpose of this patch is twofold: One, it's of course a nice
cleanup. Two, this is a test how well IWYU might work for us. If we
find either by human interpretation that a change doesn't make sense, or
something breaks on some platform, then that would be useful feedback
(perhaps to addressed by more pragma annotations or more test coverage).(Interestingly, IWYU has been mentioned in src/tools/pginclude/README
since 2012. Has anyone else played with it? Was it not mature enough
back then?)
I like this idea. This was something I tried to do but never finished in
my last project. I have also been noticing the same issues from clangd.
Having more automation around include patterns would be great! I think
it would be good if there a Meson run_target()/Make .PHONY target that
people could use to test too. Then, eventually the cfbot could pick this
up.
--
Tristan Partin
Neon (https://neon.tech)
Hi,
On 2024-02-10 08:40:43 +0100, Peter Eisentraut wrote:
So as a test, I ran IWYU over the backend *.c files and removed all the
#includes it suggested. (Note that IWYU also suggests to *add* a bunch of
#includes, in fact that is its main purpose; I didn't do this here.) In some
cases, a more specific #include replaces another less specific one. (To
keep the patch from exploding in size, I ignored for now all the suggestions
to replace catalog/pg_somecatalog.h with catalog/pg_somecatalog_d.h.) This
ended up with the attached patch, which has
I think this kind of suggested change is why I have been wary of IWYU (and the
changes that clangd suggest): By moving indirect includes to .c files you end
up with implementation details more widely, which can increase the pain of
refactoring.
I'd be hesitant to commit to this without a) a policy about adding annotations
about when indirect includes shouldn't directly be included b) annotations
ensuring that.
What were the parameters you used for IWYU? I'm e.g. a bit surprised about the
changes to main.c, adding an include for s_lock.h. For one I don't think
s_lock.h should ever be included directly, but more importantly it seems there
isn't a reason to include spin.h either, but it's not removed here?
There are other changes I don't understand. E.g. why is
catalog/binary_upgrade.h removed from pg_enum.c? It's actually defining
binary_upgrade_next_pg_enum_oid, declared in catalog/binary_upgrade.h?
I think some of the changes here could be applied independently of more iwyu
infrastructure, e.g. replacing includes of builtins.h with includes of
fmgrprotos.h. But the larger set of changes seems somewhat hard to judge
as-is.
FWIW, for me the tree with the patch applied doesn't build anymore, presumably
due to 8be93177c46.
../../../../../home/andres/src/postgresql/src/backend/commands/event_trigger.c: In function 'EventTriggerOnLogin':
../../../../../home/andres/src/postgresql/src/backend/commands/event_trigger.c:955:72: error: 'F_OIDEQ' undeclared (first use in this function)
955 | BTEqualStrategyNumber, F_OIDEQ,
| ^~~~~~~
../../../../../home/andres/src/postgresql/src/backend/commands/event_trigger.c:955:72: note: each undeclared identifier is reported only once for each function it appears in
Greetings,
Andres Freund
On 2024-Feb-10, Peter Eisentraut wrote:
So as a test, I ran IWYU over the backend *.c files and removed all the
#includes it suggested. (Note that IWYU also suggests to *add* a bunch of
#includes, in fact that is its main purpose; I didn't do this here.) In some
cases, a more specific #include replaces another less specific one.
Sounds reasonable in principle. I looked at the access/brin changes and
they seems OK. Then I noticed the execdebug.h -> executor.h changes and
was surprised, until I looked at execdebug.h and though ... maybe we
should just get rid of that file altogether.
Also, as a small example, in src/backend/access/transam/rmgr.c you'll see
some IWYU pragma annotations to handle a special case there.
Looks pretty acceptable.
The purpose of this patch is twofold: One, it's of course a nice cleanup.
Two, this is a test how well IWYU might work for us. If we find either by
human interpretation that a change doesn't make sense, or something breaks
on some platform, then that would be useful feedback (perhaps to addressed
by more pragma annotations or more test coverage).
Apparently the tool has long standing, so since the results appear to be
good, I'm not opposed to adding it to our workflow. Not as much as
pgindent for sure, though.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"All rings of power are equal,
But some rings of power are more equal than others."
(George Orwell's The Lord of the Rings)
Peter Eisentraut <peter@eisentraut.org> writes:
Approaches like that as well as the in-tree pgrminclude work by "I
removed the #include and it still compiled fine", which can be
unreliable. IWYU on the other hand has the compiler tracking where a
symbol actually came from, and so if it says that an #include is not
used, then it's pretty much correct by definition.
Well, it might be correct by definition for the version of the code
that the compiler processed. But it sounds to me like it's just as
vulnerable as pgrminclude to taking out #includes that are needed
only by #ifdef'd code sections that you didn't compile.
On the whole, our experience with automated #include removal is
pretty awful. I'm not sure I want to go there again.
regards, tom lane
Hi,
On 2024-02-12 16:46:55 -0500, Tom Lane wrote:
Peter Eisentraut <peter@eisentraut.org> writes:
Approaches like that as well as the in-tree pgrminclude work by "I
removed the #include and it still compiled fine", which can be
unreliable. IWYU on the other hand has the compiler tracking where a
symbol actually came from, and so if it says that an #include is not
used, then it's pretty much correct by definition.Well, it might be correct by definition for the version of the code
that the compiler processed. But it sounds to me like it's just as
vulnerable as pgrminclude to taking out #includes that are needed
only by #ifdef'd code sections that you didn't compile.
I think pgrminclude is far worse than IWYU - it *maximizes* reliance on
indirect includes, the opposite of what iwyu does. I share concerns about
removing includes for platform/config option specific code, but I think that
it'd not take too many annotations to address that.
I think the proposed patch shows some good changes that are painful to do by
hand, but easy with iwyu, like replacing builtins.h with fmgrprotos.h,
replacing includes of heapam.h/heap.h with table.h etc where appropriate.
While I can see applying some targeted changes without more work, I don't
really see much point in applying a lot of the other removals without actually
committing to adding the necessary IWYU annotations to our code to make iwyu
actually usable.
Greetings,
Andres Freund
On 12.02.24 21:07, Andres Freund wrote:
On 2024-02-10 08:40:43 +0100, Peter Eisentraut wrote:
So as a test, I ran IWYU over the backend *.c files and removed all the
#includes it suggested. (Note that IWYU also suggests to *add* a bunch of
#includes, in fact that is its main purpose; I didn't do this here.) In some
cases, a more specific #include replaces another less specific one. (To
keep the patch from exploding in size, I ignored for now all the suggestions
to replace catalog/pg_somecatalog.h with catalog/pg_somecatalog_d.h.) This
ended up with the attached patch, which hasI think this kind of suggested change is why I have been wary of IWYU (and the
changes that clangd suggest): By moving indirect includes to .c files you end
up with implementation details more widely, which can increase the pain of
refactoring.
I'm actually not clear on what the policy of catalog/pg_somecatalog.h
versus catalog/pg_somecatalog_d.h is or should be. There doesn't appear
to be any consistency, other than that older code obviously is less
likely to use the _d.h headers.
If we have a policy, then adding some annotations might also be a good
way to describe it.
What were the parameters you used for IWYU? I'm e.g. a bit surprised about the
changes to main.c, adding an include for s_lock.h. For one I don't think
s_lock.h should ever be included directly, but more importantly it seems there
isn't a reason to include spin.h either, but it's not removed here?
I think the changes in main.c might have been my transcription error.
They don't make sense.
There are other changes I don't understand. E.g. why is
catalog/binary_upgrade.h removed from pg_enum.c? It's actually defining
binary_upgrade_next_pg_enum_oid, declared in catalog/binary_upgrade.h?
Ah, this is a deficiency in IWYU. It keeps headers that provide
function prototypes, but it doesn't keep headers that provide extern
declarations of global variables. I have filed an issue about that, and
it looks like a fix might already be on the way.[0]https://github.com/include-what-you-use/include-what-you-use/issues/1461
[0]: https://github.com/include-what-you-use/include-what-you-use/issues/1461
https://github.com/include-what-you-use/include-what-you-use/issues/1461
This issue also led me to discover -Wmissing-variable-declarations,
about which I will post separately.
In the meantime, here is an updated patch with rebase and the above
issues fixed.
Attachments:
v2-0001-Remove-unused-include-s-from-backend-.c-files.patchtext/plain; charset=UTF-8; name=v2-0001-Remove-unused-include-s-from-backend-.c-files.patchDownload+230-1015
On 19.02.24 08:59, Peter Eisentraut wrote:
There are other changes I don't understand. E.g. why is
catalog/binary_upgrade.h removed from pg_enum.c? It's actually defining
binary_upgrade_next_pg_enum_oid, declared in catalog/binary_upgrade.h?Ah, this is a deficiency in IWYU. It keeps headers that provide
function prototypes, but it doesn't keep headers that provide extern
declarations of global variables. I have filed an issue about that, and
it looks like a fix might already be on the way.[0][0]:
https://github.com/include-what-you-use/include-what-you-use/issues/1461This issue also led me to discover -Wmissing-variable-declarations,
about which I will post separately.In the meantime, here is an updated patch with rebase and the above
issues fixed.
Here is another rebase. Also, for extra caution, I undid all the
removals of various port(ability) includes, such as "port/atomics.h",
just in case they happen to have some impact in some obscure
configuration (= not covered by Cirrus CI).
I propose to commit this patch now, and then maybe come back with more
IWYU-related proposals in the future once the above issues are fixed.