I wrote a small Clang-based tool to analyze C code to find potentially
problematic assign-through pointer expressions.
The assignments I want to find look like this:
PTR(x)->y = z
where PTR is a macro that dereferences some pointer expression.
The order in which the left- and right-hand sides of an assignment are evaluated in C is unspecified, so this can lead to memory errors if the z expression has side effects that affect the pointer dereferenced on the left-hand side.
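To make the hazard concrete, here is a minimal C sketch. Everything in it is a hypothetical stand-in rather than the project's real code: the toy heap, this definition of PTR, and the helpers alloc_obj, make_value and store_safely are assumptions made for illustration. The toy heap moves all objects on every allocation, which is a crude imitation of what a moving garbage collector can do.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { int y; } Obj;

/* Toy stand-in for a moving GC heap: all objects live in one block that is
   replaced (and therefore moved) whenever a new object is allocated. */
static Obj *heap = NULL;
static size_t heap_len = 0;

/* Hypothetical PTR macro: turns a handle (an index) into a raw pointer. */
#define PTR(h) (&heap[(h)])

/* Allocate a new object and return its handle. The old block is freed, so
   every pointer previously obtained from PTR() dangles afterwards. */
static size_t alloc_obj(void) {
    Obj *moved = malloc((heap_len + 1) * sizeof *moved);
    assert(moved != NULL);
    if (heap_len > 0)
        memcpy(moved, heap, heap_len * sizeof *moved);
    free(heap);
    heap = moved;
    heap[heap_len].y = 0;
    return heap_len++;
}

/* A right-hand side with a side effect: it allocates, which moves the heap. */
static int make_value(void) {
    (void)alloc_obj();
    return 42;
}

/* Hazardous form (deliberately not executed here):
       PTR(x)->y = make_value();
   The compiler may compute PTR(x) before calling make_value(); the store
   would then go through a pointer into the block that was just freed. */

/* Safe form: the call finishes in its own statement before PTR(x) is
   evaluated, so the pointer is always computed against the current heap. */
static void store_safely(size_t x) {
    int z = make_value();
    PTR(x)->y = z;
}
```

Calling store_safely(alloc_obj()) stores 42 in the object regardless of how the compiler orders operand evaluation, because the side effect is sequenced before the pointer is computed.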
The code I’m analyzing is a large C project that uses its own garbage collector (GC), where PTR is a macro for dereferencing a GC object. Through-pointer assignments like the one above caused some bugs in our test suite when running the tests on an ARM-based platform (the bugs did not occur in the normal x86-based test environment).
Clang AstMatcher API
Finding these kinds of expressions with Clang is pretty easy. Clang has an API for AST matching that allows us to write matchers such as this:
This matcher finds all expressions as described above where the right-hand side additionally contains a function call.
Running my tool on each continuous integration build would ensure that we don’t encounter this particular type of bug again in the project.
Although my tool was pretty simple, with fewer than 100 lines of code, figuring out how to build it proved to be more challenging than writing the tool itself.
Although it is possible to compile a Clang tool using LibTooling against pre-built LLVM/Clang binaries, such as those available in Debian, this is not really covered in the LLVM documentation.
Most pages in the documentation at llvm.org assume that you’ve built LLVM from source, rather than using pre-built binaries.
I tried building LLVM from source on my laptop at first but quickly gave up when the build ran out of memory. It turns out that you need about 30-40 GB of working memory and about 120 GB of free disk space to build the latest version of LLVM at the time of writing (17.0.6). If you don’t have enough RAM you have to rely on swap space. The source code takes about 2 GB of space if downloaded as a shallow Git clone, and the build directory of a completed build on my machine was 113 GB.
In the end I did manage to build my tool against the pre-built LLVM-15 Debian packages, although finding out how to correctly link the code was tricky. The llvm-config tool can be used to generate compiler flags for compiling/linking a standalone tool, but I did not know where to find the list of LLVM components I needed to link with.
I tried to link against all 90 libclang*.a files; however, that failed with lots of undefined symbol errors from the linker. This
seemed strange as using
nm turned up matching symbols in those libraries. It
turns out that the LLVM libraries have circular dependencies and to solve these
the linker will have to iteratively resolve names among the libraries.
Normally the linker makes only one pass over the listed libraries, so a symbol that is first referenced after its defining library has already been scanned remains undefined. To handle circular references one can use the -Wl,--start-group and -Wl,--end-group flags surrounding the list of libraries that contain circular references.
This solved the linking problems so I was finally able to
link my tool. I narrowed down the list of libraries a bit and ended up with the
following magical incantation:
g++ -g `llvm-config-15 --cxxflags` -o tool.o -c tool.cc
g++ -g `llvm-config-15 --ldflags` -o tool tool.o \
-lclang -lclangSupport -lclangAST -lclangASTMatchers \
-lclangFrontend -lclangDriver -lclangParse -lclangSerialization \
-lclangSema -lclangAnalysis -lclangEdit -lclangAST -lclangLex -lclangBasic \
-lclangTooling -lclangToolingCore -lclangToolingSyntax -lclangToolingASTDiff \
`llvm-config-15 --libs engine`
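The single-pass behaviour described above can be illustrated with a toy model in C. This is only a sketch under simplifying assumptions, not how a real linker is implemented: each archive member defines exactly one symbol and needs at most one, and the library contents (a1, a2, b1) are hypothetical. All type and function names below are made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 16

typedef struct { const char *names[MAX_SYMS]; size_t count; } SymSet;

static bool set_has(const SymSet *s, const char *name) {
    for (size_t i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0) return true;
    return false;
}
static void set_add(SymSet *s, const char *name) {
    if (set_has(s, name)) return;
    assert(s->count < MAX_SYMS);
    s->names[s->count++] = name;
}
static void set_remove(SymSet *s, const char *name) {
    for (size_t i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0) {
            s->names[i] = s->names[--s->count];
            return;
        }
}

/* One object file in a static archive (simplified to one symbol each way). */
typedef struct { const char *defines; const char *needs; bool loaded; } Member;
typedef struct { Member *members; size_t count; } Archive;
typedef struct { SymSet undefined, defined; } LinkState;

/* Scan one archive, loading every member that defines a currently undefined
   symbol. Repeats within the archive until nothing new is loaded. */
static bool scan_archive(LinkState *st, Archive *a) {
    bool loaded_any = false;
    for (bool progress = true; progress; ) {
        progress = false;
        for (size_t i = 0; i < a->count; i++) {
            Member *m = &a->members[i];
            if (m->loaded || !set_has(&st->undefined, m->defines)) continue;
            m->loaded = true;
            set_remove(&st->undefined, m->defines);
            set_add(&st->defined, m->defines);
            if (m->needs && !set_has(&st->defined, m->needs))
                set_add(&st->undefined, m->needs);
            progress = loaded_any = true;
        }
    }
    return loaded_any;
}

/* Link against the archives in order. With use_group set, the whole list is
   rescanned until no archive contributes anything new. Returns the number
   of symbols that are still undefined at the end. */
static size_t link_archives(const char *root, Archive *libs, size_t n,
                            bool use_group) {
    LinkState st = {0};
    set_add(&st.undefined, root);
    bool progress;
    do {
        progress = false;
        for (size_t i = 0; i < n; i++)
            if (scan_archive(&st, &libs[i])) progress = true;
    } while (use_group && progress);
    return st.undefined.count;
}

/* Hypothetical circular pair: libA's a1 needs libB's b1, which needs libA's a2. */
static size_t unresolved_after_link(bool use_group) {
    Member a_members[] = { {"a1", "b1", false}, {"a2", NULL, false} };
    Member b_members[] = { {"b1", "a2", false} };
    Archive libs[] = { {a_members, 2}, {b_members, 1} };
    return link_archives("a1", libs, 2, use_group);
}
```

In this model unresolved_after_link(false) leaves one symbol (a2) unresolved, because libA has already been passed by the time b1 asks for a2, while unresolved_after_link(true) resolves everything. Listing a library a second time on the command line, as -lclangAST is in the command above, gives the linker another look at that one library in a similar way.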
The LibTooling page contains an outdated example code snippet:
The CommonOptionsParser class should not be instantiated directly: its constructor was made protected in an earlier version of LLVM. Instead, the static factory function CommonOptionsParser::create should be called:
Putting it all Together
Here is the full code of an initial version of my tool. The code compiles against LLVM 15.0.7 and may not work with other versions.
I’ve made some improvements to the tool that are not included here, but it should work as a starting point if you are looking to make a similar Clang tool.