Re: Coccinelle for PostgreSQL development [1/N]: coccicheck.py - Mailing list pgsql-hackers

From Mats Kindahl
Subject Re: Coccinelle for PostgreSQL development [1/N]: coccicheck.py
Date
Msg-id CA+144273GOMq5NOc+ULbwxTk+Z1_1rwcpMs0gr8Yr6r9T=OjvQ@mail.gmail.com
Whole thread Raw
In response to Re: Coccinelle for PostgreSQL development [1/N]: coccicheck.py  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
On Sat, Jan 18, 2025 at 8:44 PM Mats Kindahl <mats@timescale.com> wrote:
On Tue, Jan 14, 2025 at 4:19 PM Aleksander Alekseev <aleksander@timescale.com> wrote:
IMO the best solution would be re-submitting all the patches to this
thread. Also please make sure the patchset is registered on the
nearest open CF [1] This will ensure that the patchset is built on our
CI (aka cfbot [2]) and will not be lost.

[1]: https://commitfest.postgresql.org/
[2]: http://cfbot.cputube.org/



Hi all,

Here is a new set of patches rebased on the latest version of the Postgres. I decided to just include the semantic patches in each patch since it is trivial to generate the patch and build using:

ninja coccicheck-patch | patch -d .. -p1 && ninja

I repeat the description from the previous patch set and add comments where things have changed, but I have also added two semantic patches, which are described last.

For those of you that are not aware of it: Coccinelle is a tool for pattern matching and text transformation for C code and can be used for detection of problematic programming patterns and to make complex, tree-wide patches easy. It is aware of the structure of C code and is better suited to make complicated changes than what is possible using normal text substitution tools like Sed and Perl. I've noticed it's been used at a few cases way back to fix issues.[1]

Coccinelle have been successfully been used in the Linux project since 2008 and is now an established tool for Linux development and a large number of semantic patches have been added to the source tree to capture everything from generic issues (like eliminating the redundant A in expressions like "!A || (A && B)") to more Linux-specific problems like adding a missing call to kfree().

Although PostgreSQL is nowhere the size of the Linux kernel, it is nevertheless of a significant size and would benefit from incorporating Coccinelle into the development. I noticed it's been used in a few cases way back (like 10 years back) to fix issues in the PostgreSQL code, but I thought it might be useful to make it part of normal development practice to, among other things:

- Identify and correct bugs in the source code both during development and review.
- Make large-scale changes to the source tree to improve the code based on new insights.
- Encode and enforce APIs by ensuring that function calls are used correctly.
- Use improved coding patterns for more efficient code.
- Allow extensions to automatically update code for later PostgreSQL versions.

To that end, I created a series of patches to show how it could be used in the PostgreSQL tree. It is a lot easier to discuss concrete code and I split it up into separate messages since that makes it easier to discuss each individual patch. The series contains code to make it easy to work with Coccinelle during development and reviews, as well as examples of semantic patches that capture problems, demonstrate how to make large-scale changes, how to enforce APIs, and also improve some coding patterns.

The first three patches contain the coccicheck.py script and the integration with the build system (both Meson and Autoconf).

# Coccicheck Script

It is a re-implementation of the coccicheck script that the Linux kernel uses. We cannot immediately use the coccicheck script since it is quite closely tied to the Linux source code tree and we need to have something that both supports Autoconf and Meson. Since Python seems to be used more and more in the tree, it seems to be the most natural choice. (I have no strong opinion on what language to use, but think it would be good to have something that is as platform-independent as possible.)

The intention is that we should be able to use the Linux semantic patches directly, so it supports the "Requires" and "Options" keywords, which can be used to require a specific version of spatch(1) and add options to the execution of that semantic patch, respectively.

I have added support for using multiple jobs similar to how "make -jN" works. This is also supported by the autoconf and ninja builds
 
# Autoconf support

The changes to Autoconf modifies the configure.ac and related files (in particular Makefile.global.in). At this point, I have deliberately not added support for pgxs so extensions cannot use coccicheck through the PostgreSQL installation. This is something that we can add later though.

The semantic patches are expected to live in cocci/ directory under the root and the patch uses the pattern cocci/**/*.cocci to find all semantic patches. Right now there are no subdirectories for the semantic patches, but this might be something we want to add to create different categories of scripts.

The coccicheck target is used in the same way as for the Linux kernel, that is, to generate and apply all patches suggested by the semantic patches, you type:

    make coccicheck MODE=patch | patch -p1

Linux as support for a few more variables: V to set the verbosity, J to use multiple jobs for processing the semantic patches, M to select a different directory to apply the semantic patches to, and COCCI to use a single specific semantic patch rather than all available. I have not added support for this right now, but if you think this is valuable, it should be straightforward to add.

I used autoconf 2.69, as mentioned in configure.ac, but that generate a bigger diff than I expected. Any advice here is welcome.

Using the parameter "JOBS" allow you to use multiple jobs, e.g.:

 make coccicheck MODE=patch JOBS=4 | patch -p1


# Meson Support

The support for Meson is done by adding three coccicheck targets: one for each mode. To apply all patches suggested by the semantic patches using ninja (as is done in [2]), you type the following in the build directory generated by Meson (e.g., the "build/" subdirectory).

    ninja coccicheck-patch | patch -p1 -d ..

If you want to pass other flags you have to set the SPFLAGS environment variable when calling ninja:

    SPFLAGS=--debug ninja coccicheck-report

If you want to use multiple jobs, you use something like this:

JOBS=4 ninja coccicheck-patch | patch -d .. -p1
 
# Semantic Patch: Wrong type for palloc()

This is the first example of a semantic patch and shows how to capture and fix a common problem.

If you use an palloc() to allocate memory for an object (or an array of objects) and by mistake type something like:

    StringInfoData *info = palloc(sizeof(StringInfoData*));

You will not allocate enough memory for storing the object. This semantic patch catches any cases where you are either allocating an array of objects or a single object that do not have corret types in this sense, more precisely, it captures assignments to a variable of type T* where palloc() uses sizeof(T) either alone or with a single expression (assuming this is an array count).

The semantic patch is overzealous in the sense that using the wrong typedef will suggest a change (this can be seen in the patch). Although the sizes of these are the same, it is probably be better to just follow the convention of always using the type "T*" for any "palloc(sizeof(T))" since the typedef can change at any point and would then introduce a bug. Coccicheck can easily fix this for you, so it is straightforward to enforce this. It also simplifies other automated checking to follow this convention.

We don't really have any real bugs as a result from this, but we have one case where an allocation of "sizeof(LLVMBasicBlockRef*)" is allocated to an "LLVMBasicBlockRef*", which strictly speaking is not correct (it should be "sizeof(LLVMBasicBlockRef)"). However, since they are both pointers, there is no risk of incorrect allocation size. One typedef usage that does not match.

# Semantic Patch: Introduce palloc_array() and palloc_object() where possible

This is an example of a large-scale refactoring to improve the code.

For PostgreSQL 16, Peter extended the palloc()/pg_malloc() interface in commit 2016055a92f to provide more type-safety, but these functions are not widely used. This semantic patch captures and replaces all uses of palloc() where palloc_array() or palloc_object() could be used instead. It deliberately does not touch cases where it is not clear that the replacement can be done.

# Semantic Patch: replace code with pg_cmp_*

This is an example of a large-scale refactoring to improve the code.

In commit 3b42bdb4716 and 6b80394781c overflow-safe comparison functions were introduced, but they are not widely used. This semantic patch identifies some of the more common cases and replaces them with calls to the corresponding pg_cmp_* function.

The patches give a few instructions of improvement in performance when checking with Godbolt. It's not much, but since it is so easy to apply them, it might still be worthwhile.
 
# Semantic Patch: Replace dynamic allocation of StringInfo with StringInfoData

Use improved coding patterns for more efficient code.

This semantic patch replaces uses of StringInfo with StringInfoData where the info is dynamically allocated but (optionally) freed at the end of the block. This will avoid one dynamic allocation that otherwise has to be dealt with.

For example, this code:

    StringInfo info = makeStringInfo();
    ...
    appendStringInfo(info, ...);
    ...
    return do_stuff(..., info->data, ...);

Can be replaced with:

    StringInfoData info;
    initStringInfo(&info);
    ...
    appendStringInfo(&info, ...);
    ...
    return do_stuff(..., info.data, ...);

It does not do a replacement in these cases:
  • If the variable is assigned to an expression. In this case, the pointer can "leak" outside the function either through a global variable or a parameter assignment.
  • If an assignment is done to the expression. This cannot leak the data, but could mean a value-assignment of a structure, so we avoid this case.
  • If the pointer is returned.
The cases that this semantic patch fixed when I uploaded the first version of the other patches seems to have been dealt with, but having it as part of the code base prevents such cases from surfacing again.
 
--
Best wishes,
Mats Kindahl, Timescale
Attachment

pgsql-hackers by date:

Previous
From: Julien Tachoires
Date:
Subject: Re: Allow table AMs to define their own reloptions
Next
From: Magnus Hagander
Date:
Subject: Re: Statistics Import and Export