Re: Potential ABI breakage in upcoming minor releases - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Potential ABI breakage in upcoming minor releases
Date
Msg-id CAEze2WgjQZJFnKwjRX9fUX0syW=qQbU+Y_4mkK9EN-oQB2D9_g@mail.gmail.com
Whole thread Raw
In response to Re: Potential ABI breakage in upcoming minor releases  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Potential ABI breakage in upcoming minor releases
List pgsql-hackers
On Tue, 26 Nov 2024 at 02:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> [ getting back to the document-ABI-breakage-rules-better topic ... ]
>
> I wrote:
> > That text says exactly nothing about what specific code changes to
> > make or not make.  I'm not sure offhand where (or if) we have this
> > documented, but there's an idea that adding fields at the end of
> > a struct is safer ABI-wise than putting them in the middle.  Which
> > is true if you can't squeeze them into padding space.  Here, that
> > could have been done and probably should have.
>
> I remembered where that's documented:
>
> https://wiki.postgresql.org/wiki/Committing_checklist#Maintaining_ABI_compatibility_while_backpatching
>
> The current text there is:
>
>
> * You can only really change the signature of a function with local
>   linkage, perhaps with a few rare exceptions.
>
> * You cannot modify any struct definition in src/include/*. If any new
>   members must be added to a struct, put them at the end in
>   backbranches. It's okay to have a different struct layout in
>   master. Even then, extensions that allocate the struct can break via
>   a dependency on its size.
>
> * Move new enum values to the end.
>
>
> I propose rewriting and expanding that:
>
>
> * Don't change the signature of non-static functions.  One common
>   workaround is to change the existing function into a wrapper that
>   calls a new function with more/different arguments.
>
> * Don't change the contents of globally-visible structs, specifically
>   not the offsets of existing fields.  If you must add a new field,
>   the very best way is to put it into existing alignment padding
>   between fields.  (But consider both 32-bit and 64-bit cases when
>   deciding what is "padding".)  If that's not possible, it may be okay
>   to add the field at the end of the struct; but you have to worry
>   about whether any extensions depend on the size of the struct.
>   This will not work well if extensions allocate new instances of the
>   struct.  One particular gotcha is that it's not okay to change
>   sizeof(ResultRelInfo), as there are executor APIs that require
>   extensions to index into arrays of ResultRelInfos.

I would also add something like the following, to protect against
corruption and deserialization errors of the catalog pg_node_tree
fields, because right now the generated readfuncs.c code ignores field
names, assumes ABI field order, and is generally quite sensitive to
binary changes, which can cause corruption and/or errors during
readback of those nodes:

* If you update the contents of Nodes which are stored on disk as
pg_node_tree, you also have to make sure that the read function for
that node type is able to read both the old and the new serialized
format.

A reference to readfuncs.c/readfuncs.c, and/or the usage of
pg_node_tree for (among others) views, index/default expressions,
constraints, and partition bounds would maybe be useful as well.

> None of this is a substitute for installing some kind of ABI-checking
> infrastructure;

Agreed.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)



pgsql-hackers by date:

Previous
From: Tatsuo Ishii
Date:
Subject: Re: Doc: typo in config.sgml
Next
From: Nazir Bilal Yavuz
Date:
Subject: Re: Use streaming read API in pgstattuple.