[RFC] Clang plugin for catching suspicious typedef casting

Started by Dmitry Dolgovover 2 years ago10 messages
#1Dmitry Dolgov
9erthalion6@gmail.com
1 attachment(s)

Hi,

In the commit [1]https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=126552c85c1cfb6ce6445159b8024cfa5631f33e one side change was to fix mixed up usage of
BlockNumber and Buffer variables. This was a one-off manual effort, and
indeed it hardly seems possible to get compilation warnings for such
code, as long as the underlying types could be converted in a standard
conforming way. As far as I see such conversions are acceptable and
used every now and then in the project, but they may hide some subtle
issues.

One interesting way I found to address this was to benefit from clang
plugin system [2]https://clang.llvm.org/docs/ClangPlugins.html. A clang plugin allows you to run some frontend
actions when compiling code with clang, and this approach is used in
some large open source projects (e.g. I've got my inspiration largely
from libreoffice [3]https://git.libreoffice.org/core/+/refs/heads/master/compilerplugins/clang/). After a little experimenting I could teach such a
plugin to warn about situations similar to the one described above. What
it does is:

* It visits all the function call expressions
* Verify if the function arguments are defined via typedef
* Compare the actual argument with the function declaration
* Consult with the suppression rules to minimize false positives

In the end the plugin catches the error mentioned above, and we get
something like this:

ginget.c:143:41: warning: Typedef check: Expected 'BlockNumber' (aka 'unsigned int'),
got 'Buffer' (aka 'int') in PredicateLockPage PredicateLockPage(btree->index, stack->buffer, snapshot);

Of course the plugin produces more warning, and I haven't checked all of
them yet -- some probably would have to be ignored as well. But in the
long run I'm pretty confident it's possible to fine tune this logic and
suppression rules to get minimum noise.

As I see it, there are advantages of using plugins in such way:

* Helps automatically detect some issues
* Other type of plugins could be useful to support large undertakings,
where a lot of code transformation is involved.

And of course disadvantages as well:

* It has to be fine-tuned to be useful
* It's compiler dependent, not clear how to always exercise it

I would like to get your opinion, folks. Does it sound interesting
enough for the community, is it worth it to pursue the idea?

Some final notes about infrastructure bits. Never mind cmake in there --
it was just for a quick start, I'm going to convert it to something
else. There are some changes needed to tell the compiler to actually
load the plugin, those of course could be done much better as well. I
didn't do anything with meson here, because it turns out meson tends to
enclose the plugin file with '-Wl,--start-group ... -Wl,--end-group' and
it breaks the plugin discovery.

[1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=126552c85c1cfb6ce6445159b8024cfa5631f33e
[2]: https://clang.llvm.org/docs/ClangPlugins.html
[3]: https://git.libreoffice.org/core/+/refs/heads/master/compilerplugins/clang/

Attachments:

0001-Typedef-check-clang-plugin.patchtext/x-diff; charset=us-asciiDownload
From 01e645f456f9476c60943f12d794465567f2d265 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthalion6@gmail.com>
Date: Thu, 3 Aug 2023 15:51:26 +0200
Subject: [PATCH] Typedef-check clang plugin

Add a clang plugin to warn about suspicious type casting for types
defined via typedef, e.g. BlockNumber and Buffer.
---
 configure                                     |  36 ++++
 configure.ac                                  |  12 ++
 src/tools/clang/typedef_check/CMakeLists.txt  |  31 +++
 .../clang/typedef_check/TypedefCheck.cpp      | 183 ++++++++++++++++++
 4 files changed, 262 insertions(+)
 create mode 100644 src/tools/clang/typedef_check/CMakeLists.txt
 create mode 100644 src/tools/clang/typedef_check/TypedefCheck.cpp

diff --git a/configure b/configure
index 2e518c8007d..71da30779de 100755
--- a/configure
+++ b/configure
@@ -767,6 +767,7 @@ enable_coverage
 GENHTML
 LCOV
 GCOV
+enable_typedef_check
 enable_debug
 enable_rpath
 default_port
@@ -835,6 +836,7 @@ enable_rpath
 enable_spinlocks
 enable_atomics
 enable_debug
+enable_typedef_check
 enable_profiling
 enable_coverage
 enable_dtrace
@@ -1528,6 +1530,7 @@ Optional Features:
   --disable-spinlocks     do not use spinlocks
   --disable-atomics       do not use atomic operations
   --enable-debug          build with debugging symbols (-g)
+  --enable-typedef-check  build with typedef-check plugin
   --enable-profiling      build with profiling enabled
   --enable-coverage       build with coverage testing instrumentation
   --enable-dtrace         build with DTrace support
@@ -3344,6 +3347,34 @@ fi
 
 
 
+#
+# --enable-typedef-check adds clang typedef-check plugin
+#
+
+
+# Check whether --enable-typedef-check was given.
+if test "${enable_typedef_check+set}" = set; then :
+  enableval=$enable_typedef_check;
+  case $enableval in
+    yes)
+      :
+      ;;
+    no)
+      :
+      ;;
+    *)
+      as_fn_error $? "no argument expected for --enable-typedef-check option" "$LINENO" 5
+      ;;
+  esac
+
+else
+  enable_typedef_check=no
+
+fi
+
+
+
+
 #
 # --enable-profiling enables gcc profiling
 #
@@ -7828,6 +7859,11 @@ if test "$enable_debug" = yes && test "$ac_cv_prog_cxx_g" = yes; then
   CXXFLAGS="$CXXFLAGS -g"
 fi
 
+if test "$enable_typedef_check" = yes; then
+  abs_path=`readlink -f $srcdir`
+  CFLAGS="$CFLAGS -Xclang -load -Xclang $abs_path/src/tools/clang/typedef_check/build/lib/TypedefCheck.so -Xclang -add-plugin -Xclang typedef-check"
+fi
+
 # enable code coverage if --enable-coverage
 if test "$enable_coverage" = yes; then
   if test "$GCC" = yes; then
diff --git a/configure.ac b/configure.ac
index 3ebe1a796dc..3b90bf9d1c5 100644
--- a/configure.ac
+++ b/configure.ac
@@ -206,6 +206,13 @@ PGAC_ARG_BOOL(enable, debug, no,
               [build with debugging symbols (-g)])
 AC_SUBST(enable_debug)
 
+#
+# --enable-typedef-check adds clang typedef-check plugin
+#
+PGAC_ARG_BOOL(enable, typedef-check, no,
+              [build with typedef-check plugin])
+AC_SUBST(enable_typedef_check)
+
 #
 # --enable-profiling enables gcc profiling
 #
@@ -683,6 +690,11 @@ if test "$enable_debug" = yes && test "$ac_cv_prog_cxx_g" = yes; then
   CXXFLAGS="$CXXFLAGS -g"
 fi
 
+if test "$enable_typedef_check" = yes; then
+  abs_path=`readlink -f $srcdir`
+  CFLAGS="$CFLAGS -Xclang -load -Xclang $abs_path/src/tools/clang/typedef_check/build/lib/TypedefCheck.so -Xclang -add-plugin -Xclang typedef-check"
+fi
+
 # enable code coverage if --enable-coverage
 if test "$enable_coverage" = yes; then
   if test "$GCC" = yes; then
diff --git a/src/tools/clang/typedef_check/CMakeLists.txt b/src/tools/clang/typedef_check/CMakeLists.txt
new file mode 100644
index 00000000000..7508d9906f1
--- /dev/null
+++ b/src/tools/clang/typedef_check/CMakeLists.txt
@@ -0,0 +1,31 @@
+cmake_minimum_required(VERSION 3.20.0)
+project(Typedef)
+
+execute_process(COMMAND llvm-config --cmakedir
+                OUTPUT_VARIABLE LLVM_CMAKEDIR
+                OUTPUT_STRIP_TRAILING_WHITESPACE)
+
+execute_process(COMMAND llvm-config --includedir
+                OUTPUT_VARIABLE LLVM_INCLUDEDIR
+                OUTPUT_STRIP_TRAILING_WHITESPACE)
+
+message(STATUS ${LLVM_CMAKEDIR})
+message(STATUS ${LLVM_INCLUDEDIR})
+
+list(APPEND CMAKE_MODULE_PATH ${LLVM_CMAKEDIR})
+
+message(STATUS ${CMAKE_MODULE_PATH})
+
+set(LLVM_RUNTIME_OUTPUT_INTDIR ${CMAKE_BINARY_DIR}/${CMAKE_CFG_INTDIR}/bin)
+set(LLVM_LIBRARY_OUTPUT_INTDIR ${CMAKE_BINARY_DIR}/${CMAKE_CFG_INTDIR}/lib)
+set(LLVM_ENABLE_PLUGINS true)
+
+include(HandleLLVMOptions) # important: matches compiler flags to LLVM/Clang build
+
+include(AddLLVM)
+
+add_llvm_library(TypedefCheck MODULE TypedefCheck.cpp PLUGIN_TOOL clang)
+
+target_include_directories(TypedefCheck PUBLIC ${LLVM_INCLUDEDIR})
+
+target_compile_features(TypedefCheck PRIVATE cxx_std_17)
diff --git a/src/tools/clang/typedef_check/TypedefCheck.cpp b/src/tools/clang/typedef_check/TypedefCheck.cpp
new file mode 100644
index 00000000000..f41216ee7c3
--- /dev/null
+++ b/src/tools/clang/typedef_check/TypedefCheck.cpp
@@ -0,0 +1,183 @@
+/*
+ * Clang plugin for PostgreSQL to warn about suspicious misuse of types defined
+ * via typedef, e.g. BlockNumber and Buffer. Such validation could not be done
+ * without false positives, thus the plugin has to be tuned via suppression
+ * rules for types, pair of types and functions.
+ */
+
+#include "clang/Frontend/FrontendPluginRegistry.h"
+#include "clang/AST/AST.h"
+#include "clang/AST/ASTConsumer.h"
+#include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/Frontend/CompilerInstance.h"
+#include "llvm/Support/raw_ostream.h"
+
+using namespace clang;
+
+namespace {
+
+class TypedefCheckConsumer : public ASTConsumer {
+  CompilerInstance &Instance;
+
+public:
+  TypedefCheckConsumer(CompilerInstance &Instance) : Instance(Instance) {}
+
+  void HandleTranslationUnit(ASTContext& context) override {
+    struct Visitor : public RecursiveASTVisitor<Visitor> {
+      std::set<StringRef> IgnoredTypes = {"size_t", "uintptr_t",
+                                          "float8", "uint8", "int8",
+                                          "uint16", "int16",
+                                          "uint32", "int32",
+                                          "uint64", "int64",
+                                          "__m128i", "Vector8",
+                                          "__off_t", "__off64_t",
+                                          "__uid_t", "__mode_t",
+                                          "__gid_t", "key_t", "__pid_t",
+                                          "__time_t", "Datum",
+                                          "yy_size_t", "pg_wchar",
+                                          "wchar_t", "fmStringInfo"};
+
+      std::set<std::string> IgnoredFunctions = {"FunctionalCall1Coll", "FunctionalCall2Coll",
+                                                "fmgr_info_cxt", "fmgr_info",
+                                                "check_amproc_signature", "format_procedure",
+                                                "check_amoptsproc_signature",
+                                                "check_hash_func_signature"};
+
+      std::set<std::pair<StringRef, StringRef>> IgnoredPairs = {
+        std::pair<StringRef, StringRef>("RegProcedure", "Oid"),
+        std::pair<StringRef, StringRef>("Oid", "RegProcedure"),
+        std::pair<StringRef, StringRef>("Timestamp", "TimestampTz"),
+        std::pair<StringRef, StringRef>("TimestampTz", "Timestamp"),
+      };
+
+      Visitor(DiagnosticsEngine &Diags) : Diags(Diags) {}
+
+      bool VisitCallExpr(CallExpr *expr) {
+        const auto funcDecl = expr->getDirectCallee();
+        unsigned DiagID = Diags.getCustomDiagID(DiagnosticsEngine::Warning,
+                                            "Typedef check: Expected %0, got %1 in %2");
+
+        if (funcDecl == nullptr)
+            return true;
+
+        for (unsigned int i = 0; i != expr->getNumArgs(); ++i) {
+            /* Declared argument */
+            ParmVarDecl const* argDecl;
+            QualType argDeclT;
+            QualType argDeclTypedefT;
+            TypedefNameDecl const* argDeclTypedef;
+
+            /* Actually provided value */
+            Expr *arg;
+            QualType argT;
+            QualType argTypedefT;
+            TypedefNameDecl const* argTypedef;
+
+            arg = expr->getArg(i)->IgnoreParenImpCasts();
+            argT = arg->getType();
+
+            if (i >= funcDecl->getNumParams())
+                continue;
+
+            argDecl = funcDecl->getParamDecl(i);
+            argDeclT = argDecl->getOriginalType();
+
+            if (argT.isNull())
+                continue;
+
+            if (argDeclT.isNull())
+                continue;
+
+            /* Verify if we deal with a typedef */
+            if (auto const t = argT.getTypePtr()->getAs<TypedefType>()) {
+                argTypedef = t->getDecl();
+                argTypedefT = t->desugar();
+            }
+            else
+                continue;
+
+            if (auto const t = argDeclT.getTypePtr()->getAs<TypedefType>()) {
+                argDeclTypedef = t->getDecl();
+                argDeclTypedefT = t->desugar();
+            }
+            else
+                continue;
+
+            /*
+             * If both types (actual argument type and declared one) are not
+             * builtin and could be desugarized into the same type, ignore it.
+             */
+            if (!argDeclTypedefT->isBuiltinType() &&
+                !argTypedefT->isBuiltinType() &&
+                (argDeclTypedefT.getAsString() == argTypedefT.getAsString()))
+                continue;
+
+            /*
+             * If one desugarized type (either the declared or actual) is the
+             * same as not desugarized another one, ignore it.
+             */
+            if (argDeclTypedefT.getAsString() == argTypedef->getName())
+                continue;
+
+            if (argTypedefT.getAsString() == argDeclTypedef->getName())
+                continue;
+
+            /* If it is one of the suppressed functions, ignore it. */
+            if (IgnoredFunctions.find(funcDecl->getNameInfo().getAsString()) != IgnoredFunctions.end())
+                continue;
+
+            /* If it is one of the suppressed types, ignore it. */
+            if (IgnoredTypes.find(argTypedef->getName()) != IgnoredTypes.end())
+                continue;
+
+            if (IgnoredTypes.find(argDeclTypedef->getName()) != IgnoredTypes.end())
+                continue;
+
+            /* If it is one of the suppressed pair of types, ignore it. */
+            if (IgnoredPairs.find(std::pair<StringRef, StringRef>(argTypedef->getName(), argDeclTypedef->getName())) != IgnoredPairs.end())
+                continue;
+
+            if (argTypedef->getName() != argDeclTypedef->getName())
+            {
+                Diags.Report(arg->getExprLoc(), DiagID)
+                    << argDecl->getOriginalType()
+                    << arg->getType()
+                    << funcDecl->getNameInfo().getAsString();
+
+            }
+        }
+
+        return true;
+      }
+
+      std::set<FunctionDecl*> LateParsedDecls;
+      DiagnosticsEngine &Diags;
+
+    } v(Instance.getDiagnostics());
+    v.TraverseDecl(context.getTranslationUnitDecl());
+  }
+};
+
+class TypedefCheckAction : public PluginASTAction {
+protected:
+  std::unique_ptr<ASTConsumer> CreateASTConsumer(CompilerInstance &CI,
+                                                 llvm::StringRef) override {
+    return std::make_unique<TypedefCheckConsumer>(CI);
+  }
+
+  bool ParseArgs(const CompilerInstance &CI,
+                 const std::vector<std::string> &args) override {
+      return true;
+  }
+
+  void PrintHelp(llvm::raw_ostream& ros) {
+    ros << "TypedefCheck plugin for PostgreSQL warns about suspicious "
+           "misuse of types defined via typedef.\n";
+  }
+
+};
+
+}
+
+static FrontendPluginRegistry::Add<TypedefCheckAction>
+X("typedef-check", "check typedefs");

base-commit: 74a2dfee2255a1bace9b0053d014c4efa2823f4d
-- 
2.32.0

#2Tristan Partin
tristan@neon.tech
In reply to: Dmitry Dolgov (#1)
Re: [RFC] Clang plugin for catching suspicious typedef casting

This is the first I am learning about clang plugins. Interesting
concept. Did you give any thought to using libclang instead of
a compiler plugin? I am kind of doing similar work, but I went with
libclang instead of a plugin.

--
Tristan Partin
Neon (https://neon.tech)

#3Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Tristan Partin (#2)
Re: [RFC] Clang plugin for catching suspicious typedef casting

On Thu, Aug 03, 2023 at 12:23:52PM -0500, Tristan Partin wrote:

This is the first I am learning about clang plugins. Interesting concept.
Did you give any thought to using libclang instead of a compiler plugin? I
am kind of doing similar work, but I went with libclang instead of a plugin.

Nope, never thought about trying libclang. From the clang documentation
it seems a plugin is a suitable interface if one wants to:

special lint-style warnings or errors for your project

Which is what I was trying to achieve. Are there any advantages of
libclang that you see?

#4Peter Eisentraut
peter@eisentraut.org
In reply to: Dmitry Dolgov (#1)
Re: [RFC] Clang plugin for catching suspicious typedef casting

On 03.08.23 18:56, Dmitry Dolgov wrote:

I would like to get your opinion, folks. Does it sound interesting
enough for the community, is it worth it to pursue the idea?

I think it's interesting, and doesn't look too invasive.

Maybe we can come up with three or four interesting use cases and try to
implement them. BlockNumber vs. Buffer type checking is obviously a bit
marginal to get anyone super-excited, but it's a reasonable demo.

Also, how stable is the plugin API? Would we need to keep updating
these plugins for each new clang version?

#5Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Peter Eisentraut (#4)
Re: [RFC] Clang plugin for catching suspicious typedef casting

On Wed, Aug 09, 2023 at 03:23:32PM +0200, Peter Eisentraut wrote:
On 03.08.23 18:56, Dmitry Dolgov wrote:

I would like to get your opinion, folks. Does it sound interesting
enough for the community, is it worth it to pursue the idea?

I think it's interesting, and doesn't look too invasive.

Maybe we can come up with three or four interesting use cases and try to
implement them. BlockNumber vs. Buffer type checking is obviously a bit
marginal to get anyone super-excited, but it's a reasonable demo.

Also, how stable is the plugin API? Would we need to keep updating these
plugins for each new clang version?

From the first glance the API itself seems to be stable -- I didn't find
many breaking changes for plugins in the release notes over the last
five releases. The git history for such definitions as ASTConsumer or
RecursiveASTVisitor also doesn't seem to be very convoluted.

But libreoffice example shows, that some updating would be necessary --
they've got about a dozen of places in the code that depend on the clang
version. From what I see it's usually not directly related to the plugin
API, and looks more like our opaque pointers issue.

#6Xing Guo
higuoxing@gmail.com
In reply to: Dmitry Dolgov (#5)
Re: [RFC] Clang plugin for catching suspicious typedef casting

Hi,

On 8/10/23, Dmitry Dolgov <9erthalion6@gmail.com> wrote:

On Wed, Aug 09, 2023 at 03:23:32PM +0200, Peter Eisentraut wrote:
On 03.08.23 18:56, Dmitry Dolgov wrote:

I would like to get your opinion, folks. Does it sound interesting
enough for the community, is it worth it to pursue the idea?

I think it's interesting, and doesn't look too invasive.

+1.

Maybe we can come up with three or four interesting use cases and try to
implement them. BlockNumber vs. Buffer type checking is obviously a bit
marginal to get anyone super-excited, but it's a reasonable demo.

Several months ago, I implemented a clang plugin[1]https://github.com/higuoxing/clang-plugins/blob/main/lib/ReturnInPgTryBlockChecker.cpp for checking
suspicious return/goto/break/continue in PG_TRY/PG_CATCH blocks and it
found unsafe codes in Postgres[2]/messages/by-id/CACpMh+CMsGMRKFzFMm3bYTzQmMU5nfEEoEDU2apJcc4hid36AQ@mail.gmail.com. It's implemented using clang's AST
matcher API.

I would like to contribute it well if this RFC can be accepted.

[1]: https://github.com/higuoxing/clang-plugins/blob/main/lib/ReturnInPgTryBlockChecker.cpp
[2]: /messages/by-id/CACpMh+CMsGMRKFzFMm3bYTzQmMU5nfEEoEDU2apJcc4hid36AQ@mail.gmail.com

--
Best Regards,
Xing

#7Tristan Partin
tristan@neon.tech
In reply to: Dmitry Dolgov (#3)
Re: [RFC] Clang plugin for catching suspicious typedef casting

On Fri Aug 4, 2023 at 5:47 AM CDT, Dmitry Dolgov wrote:

On Thu, Aug 03, 2023 at 12:23:52PM -0500, Tristan Partin wrote:

This is the first I am learning about clang plugins. Interesting concept.
Did you give any thought to using libclang instead of a compiler plugin? I
am kind of doing similar work, but I went with libclang instead of a plugin.

Nope, never thought about trying libclang. From the clang documentation
it seems a plugin is a suitable interface if one wants to:

special lint-style warnings or errors for your project

Which is what I was trying to achieve. Are there any advantages of
libclang that you see?

Only advantage I see to using libclang is that you can run programs
built with libclang regardless of what your C compiler is. I typically
use GCC.

I think your idea of automating this kind of thing is great no matter
how it is implemented.

--
Tristan Partin
Neon (https://neon.tech)

#8Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Tristan Partin (#7)
Re: [RFC] Clang plugin for catching suspicious typedef casting

On Fri, Dec 01, 2023 at 04:01:05PM -0600, Tristan Partin wrote:
On Fri Aug 4, 2023 at 5:47 AM CDT, Dmitry Dolgov wrote:

On Thu, Aug 03, 2023 at 12:23:52PM -0500, Tristan Partin wrote:

This is the first I am learning about clang plugins. Interesting concept.
Did you give any thought to using libclang instead of a compiler plugin? I
am kind of doing similar work, but I went with libclang instead of a plugin.

Nope, never thought about trying libclang. From the clang documentation
it seems a plugin is a suitable interface if one wants to:

special lint-style warnings or errors for your project

Which is what I was trying to achieve. Are there any advantages of
libclang that you see?

Only advantage I see to using libclang is that you can run programs built
with libclang regardless of what your C compiler is. I typically use GCC.

I think your idea of automating this kind of thing is great no matter how it
is implemented.

Yeah, absolutely.

Sorry, haven't had a chance to follow up on the patch, but I'm planing
to do so. I guess the important part, as Peter mentioned above in the
thread, is to figure out more use cases which could be usefully
augmented with such compile time checks. At the moment I don't have any
other except BlockNumber vs Buffer, so I'm going to search through the
hackers looking for more potential targets. If anyone got a great idea
right away about where compile-time checks could be useful, feel free to
share!

In reply to: Dmitry Dolgov (#8)
Re: [RFC] Clang plugin for catching suspicious typedef casting

On Sun, Dec 3, 2023 at 6:31 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

Only advantage I see to using libclang is that you can run programs built
with libclang regardless of what your C compiler is. I typically use GCC.

I think your idea of automating this kind of thing is great no matter how it
is implemented.

Yeah, absolutely.

What is the difference between a clang plugin (or whatever this is),
and a custom clang-tidy check? Are they the same thing?

I myself use clang-tidy through clangd. It has a huge number of
checks, plus some custom checks that are only used by certain open
source projects.

An example of a check that seems like it would be interesting to
Postgres hackers:

https://releases.llvm.org/17.0.1/tools/clang/tools/extra/docs/clang-tidy/checks/bugprone/signal-handler.html

An example of a custom clang-tidy check, used for the Linux Kernel:

https://releases.llvm.org/17.0.1/tools/clang/tools/extra/docs/clang-tidy/checks/linuxkernel/must-use-errs.html

Your idea of starting with a check that identifies when BlockNumber
and Buffer variables were confused seems like the right general idea
to me. It's just about impossible for this to be a false positive in practice,
which is important. But, I wonder, is there a way of accomplishing the
same thing without any custom code? Seems like the general idea of not
confusing two types like BlockNumber and Buffer might be a generic
one?

Some random ideas in this area:

* A custom check that enforces coding standards within signal handlers
-- the generic one that I linked to might need to be customized, in
whatever way (don't use it myself).

* A custom check that enforces a coding standard that applies within
critical sections.

We already have an assertion that catches memory allocations made
within a critical section. It might make sense to have tooling that
can detect other kinds of definitely-unsafe things in critical
sections. For example, maybe it would be possible to detect when
XLogRegisterBufData() has been passed a pointer to something on the
stack that will go out of scope, in a way that leaves open the
possibility of reading random stuff from the stack once inside
XLogInsert(). AddressSanitizer's use-after-scope can detect that sort
of thing, though not reliably.

Not at all sure about how feasible any of this is...

--
Peter Geoghegan

#10Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Peter Geoghegan (#9)
Re: [RFC] Clang plugin for catching suspicious typedef casting

On Sun, Dec 03, 2023 at 07:02:55PM -0800, Peter Geoghegan wrote:
On Sun, Dec 3, 2023 at 6:31 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

Only advantage I see to using libclang is that you can run programs built
with libclang regardless of what your C compiler is. I typically use GCC.

I think your idea of automating this kind of thing is great no matter how it
is implemented.

Yeah, absolutely.

What is the difference between a clang plugin (or whatever this is),
and a custom clang-tidy check? Are they the same thing?

Good question. Obviously one difference is that a clang plugin is more
tightly integrated into the build process, where a custom clang-tidy
check could be used independently. IIUC they also use different API, AST
Consumer vs AST Matchers, but so far I have no idea what consequences
does it have.

Clang tooling page has this to say about choosing the right interface
[1]: https://clang.llvm.org/docs/Tooling.html

Clang Plugins allow you to run additional actions on the AST as part
of a compilation. Plugins are dynamic libraries that are loaded at
runtime by the compiler, and they’re easy to integrate into your
build environment.

Canonical examples of when to use Clang Plugins:
* special lint-style warnings or errors for your project creating
* additional build artifacts from a single compile step

[...]

LibTooling is a C++ interface aimed at writing standalone tools, as
well as integrating into services that run clang tools. Canonical
examples of when to use LibTooling:

* a simple syntax checker
* refactoring tools

I myself use clang-tidy through clangd. It has a huge number of
checks, plus some custom checks that are only used by certain open
source projects.

An example of a check that seems like it would be interesting to
Postgres hackers:

https://releases.llvm.org/17.0.1/tools/clang/tools/extra/docs/clang-tidy/checks/bugprone/signal-handler.html

An example of a custom clang-tidy check, used for the Linux Kernel:

https://releases.llvm.org/17.0.1/tools/clang/tools/extra/docs/clang-tidy/checks/linuxkernel/must-use-errs.html

Yeah, those look interesting, and might be even worth including
independently of the topic in this thread.

Your idea of starting with a check that identifies when BlockNumber
and Buffer variables were confused seems like the right general idea
to me. It's just about impossible for this to be a false positive in practice,
which is important. But, I wonder, is there a way of accomplishing the
same thing without any custom code? Seems like the general idea of not
confusing two types like BlockNumber and Buffer might be a generic
one?

Agree, the example in this patch could be generalized.

* A custom check that enforces coding standards within signal handlers
-- the generic one that I linked to might need to be customized, in
whatever way (don't use it myself).

* A custom check that enforces a coding standard that applies within
critical sections.

We already have an assertion that catches memory allocations made
within a critical section. It might make sense to have tooling that
can detect other kinds of definitely-unsafe things in critical
sections. For example, maybe it would be possible to detect when
XLogRegisterBufData() has been passed a pointer to something on the
stack that will go out of scope, in a way that leaves open the
possibility of reading random stuff from the stack once inside
XLogInsert(). AddressSanitizer's use-after-scope can detect that sort
of thing, though not reliably.

Interesting ideas, thanks for sharing. I'm going to try implementing
some of those.

[1]: https://clang.llvm.org/docs/Tooling.html