Thread: Potential ABI breakage in upcoming minor releases
Hello,
Commit 51ff46de29f67d73549b2858f57e77ada8513369 (backported all the way back to v12) added a new member to `ResultRelInfo` struct. This can potentially cause ABI breakage for the extensions that allocate the struct and pass it down to the PG code. The previously built extensions may allocate a shorter struct, while the new PG code would expect a larger struct, thus overwriting some memory unintentionally.
A better approach may have been what Tom did in 8cd190e13a22dab12e86f7f1b59de6b9b128c784, but I understand it might be too late to change this since the releases are already tagged. Nevertheless, I thought of bringing it up if others have different views.
Thanks,
Pavan
On Thu, Nov 14, 2024 at 04:18:02PM +0530, Pavan Deolasee wrote: > Commit 51ff46de29f67d73549b2858f57e77ada8513369 (backported all the way > back to v12) added a new member to `ResultRelInfo` struct. This can > potentially cause ABI breakage for the extensions that allocate the struct > and pass it down to the PG code. The previously built extensions may > allocate a shorter struct, while the new PG code would expect a larger > struct, thus overwriting some memory unintentionally. > > A better approach may have been what Tom did in > 8cd190e13a22dab12e86f7f1b59de6b9b128c784, but I understand it might be too > late to change this since the releases are already tagged. Nevertheless, I > thought of bringing it up if others have different views. The release is now announced. We could issue new releases (likely next Thursday) if there's a problem that warrants it. Based on a grep of PGXN code, here are some or all of the modules that react to sizeof(ResultRelInfo): $ grepx -r 'lloc.*ResultRelInfo' | tee /tmp/1 | sed 's/-[^:]*/:/'|sort -u apacheage:: resultRelInfo = palloc(sizeof(ResultRelInfo)); pg_pathman:: ResultRelInfo *resultRelInfo = (ResultRelInfo *) palloc(sizeof(ResultRelInfo)); $ grepx -r 'Node.*ResultRelInfo' | tee /tmp/2 | sed 's/-[^:]*/:/'|sort -u apacheage:: cypher_node->resultRelInfo = makeNode(ResultRelInfo); apacheage:: cypher_node->resultRelInfo = makeNode(ResultRelInfo); citus:: resultRelInfo = makeNode(ResultRelInfo); citus:: ResultRelInfo *resultRelInfo = makeNode(ResultRelInfo); pg_bulkload:: checker->resultRelInfo = makeNode(ResultRelInfo); pg_bulkload:: self->relinfo = makeNode(ResultRelInfo); pg_pathman:: child_result_rel_info = makeNode(ResultRelInfo); pg_pathman:: parent_result_rel = makeNode(ResultRelInfo); pg_pathman:: parent_rri = makeNode(ResultRelInfo); pg_pathman:: part_result_rel_info = makeNode(ResultRelInfo); vops:: resultRelInfo = makeNode(ResultRelInfo); I don't know whether we should make a new release, amend the release announcement to call for extension rebuilds, or just stop here. https://wiki.postgresql.org/wiki/Committing_checklist#Maintaining_ABI_compatibility_while_backpatching mentions the problem, but neither it nor the new standard at postgr.es/c/e54a42a say how reticent we'll be to add to the end of a struct on which extensions do sizeof. The postgr.es/c/e54a42a standard would have us stop here. But I'm open to treating the standard as mistaken and changing things.
On Thu, 14 Nov 2024 at 16:35, Noah Misch <noah@leadboat.com> wrote:
Based on a grep of PGXN code, here are some or all of the modules that react
to sizeof(ResultRelInfo):
To add to this list, Christoph Berg confirmed that timescaledb test suite crashes. [1]
Regards,
Ants Aasma
Hi, > To add to this list, Christoph Berg confirmed that timescaledb test suite crashes. [1] Yes changing ResultRelInfo most definetely breaks TimescaleDB. The extension uses makeNode(ResultRelInfo) and this can't end-up well: ``` static inline Node * newNode(size_t size, NodeTag tag) { Node *result; Assert(size >= sizeof(Node)); /* need the tag, at least */ result = (Node *) palloc0(size); result->type = tag; return result; } #define makeNode(_type_) ((_type_ *) newNode(sizeof(_type_),T_##_type_)) ``` -- Best regards, Aleksander Alekseev
On 14.11.24 15:35, Noah Misch wrote: > The postgr.es/c/e54a42a standard would have us stop here. But I'm open to > treating the standard as mistaken and changing things. That text explicitly calls out that adding struct members at the end of a struct is considered okay. But thinking about it now, even adding fields to the end of a node struct that extensions allocate using makeNode() is an ABI break that is liable to cause all affected extensions to break in a crashing way.
Re: Noah Misch > Based on a grep of PGXN code, here are some or all of the modules that react > to sizeof(ResultRelInfo): > > $ grepx -r 'lloc.*ResultRelInfo' | tee /tmp/1 | sed 's/-[^:]*/:/'|sort -u > apacheage:: resultRelInfo = palloc(sizeof(ResultRelInfo)); Confirmed, crashing: AGE for 14..17 (12..13 seem fine) > citus:: resultRelInfo = makeNode(ResultRelInfo); > citus:: ResultRelInfo *resultRelInfo = makeNode(ResultRelInfo); > pg_bulkload:: checker->resultRelInfo = makeNode(ResultRelInfo); > pg_bulkload:: self->relinfo = makeNode(ResultRelInfo); > pg_pathman:: child_result_rel_info = makeNode(ResultRelInfo); > pg_pathman:: parent_result_rel = makeNode(ResultRelInfo); > pg_pathman:: parent_rri = makeNode(ResultRelInfo); > pg_pathman:: part_result_rel_info = makeNode(ResultRelInfo); > vops:: resultRelInfo = makeNode(ResultRelInfo); (These are not on apt.pg.o) I've also tested other packages where ResultRelInfo appears in the source, but they all passed the tests (most have decent test but a few have not): Bad: postgresql-16-age timescaledb Good: hypopg libpg-query pglast pglogical pgpool2 pgsql-ogr-fdw pg-squeeze postgresql-mysql-fdw (impossible to test sanely because mysql/mariadb take turns at being the default, and in some environments just don't start) > I don't know whether we should make a new release, amend the release > announcement to call for extension rebuilds, or just stop here. > https://wiki.postgresql.org/wiki/Committing_checklist#Maintaining_ABI_compatibility_while_backpatching > mentions the problem, but neither it nor the new standard at > postgr.es/c/e54a42a say how reticent we'll be to add to the end of a struct on > which extensions do sizeof. I'd say the ship has sailed, a new release would now break things the other way round. Christoph
Timescale at least has many users, I don't know about the others. But they won't be happy if we let things be. IMO we should rewrap ... especially 12, since there won't be a next one. ISTM that we have spare bytes where we could place that boolean without breaking ABI. In x86_64 there's a couple of 4-byte holes, but those won't be there on x86, so not great candidates. Fortunately there are 3-byte and 7-byte holes also, which we can use safely. We can move the new boolean to those location. The holes are different in each branch unfortunately. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "Use it up, wear it out, make it do, or do without"
On Thu, Nov 14, 2024 at 11:29 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > ISTM that we have spare bytes where we could place that boolean without > breaking ABI. In x86_64 there's a couple of 4-byte holes, but those > won't be there on x86, so not great candidates. Fortunately there are > 3-byte and 7-byte holes also, which we can use safely. We can move the > new boolean to those location. Wasn't this part of the official guidelines? I've been doing this all along (e.g., in commit 3fa81b62e0). > The holes are different in each branch unfortunately. Yeah, that'd make it a bit more complicated, but still doable. It would be necessary to place the same field in seemingly random locations on each backbranch. FWIW recent versions of clangd will show me information about field padding in struct annotations. I don't even have to run pahole. -- Peter Geoghegan
On Thu, Nov 14, 2024 at 05:13:35PM +0100, Peter Eisentraut wrote: > On 14.11.24 15:35, Noah Misch wrote: > > The postgr.es/c/e54a42a standard would have us stop here. But I'm open to > > treating the standard as mistaken and changing things. > > That text explicitly calls out that adding struct members at the end of a > struct is considered okay. But thinking about it now, even adding fields to > the end of a node struct that extensions allocate using makeNode() is an ABI > break Right. makeNode(), palloc(sizeof), and stack allocation have that problem. Allocation wrappers like CreateExecutorState() avoid the problem. More generally, structs allocated in non-extension code are fine.
Hi, > Right. makeNode(), palloc(sizeof), and stack allocation have that problem. > Allocation wrappers like CreateExecutorState() avoid the problem. More > generally, structs allocated in non-extension code are fine. Perhaps we should consider adding an API like makeResultRelInfo() and others, one per node. It's going to be boilerplate for sure, but considering the fact that we already generate some code I don't really see drawbacks. -- Best regards, Aleksander Alekseev
On Thu, Nov 14, 2024 at 9:43 PM Peter Eisentraut <peter@eisentraut.org> wrote:
On 14.11.24 15:35, Noah Misch wrote:
> The postgr.es/c/e54a42a standard would have us stop here. But I'm open to
> treating the standard as mistaken and changing things.
That text explicitly calls out that adding struct members at the end of
a struct is considered okay.
I think that assumption is ok when the struct is allocated by core and passed to the extension. But if the struct is allocated (static or dynamic) and passed to the core, we have to be extra careful.
But thinking about it now, even adding
fields to the end of a node struct that extensions allocate using
makeNode() is an ABI break that is liable to cause all affected
extensions to break in a crashing way.
Memory corruption issues can be quite subtle and hard to diagnose. So if wrapping a new release is an option, I would vote for it. If we do that, doing it for every impacted version would be more prudent. But I don't know what are the policies around making a new release.
Thanks,
Pavan
On Thu, Nov 14, 2024 at 05:29:18PM +0100, Christoph Berg wrote: > Re: Noah Misch > > Based on a grep of PGXN code, here are some or all of the modules that react > > to sizeof(ResultRelInfo): > > > > $ grepx -r 'lloc.*ResultRelInfo' | tee /tmp/1 | sed 's/-[^:]*/:/'|sort -u > > apacheage:: resultRelInfo = palloc(sizeof(ResultRelInfo)); > > Confirmed, crashing: AGE for 14..17 (12..13 seem fine) Can you share more about the crash, perhaps the following? - stack trace - any server messages just before the crash - which architecture (32-bit x86, 64-bit ARM, etc.) - asserts enabled, or not? It's not immediately to clear to me why this would crash in a non-asserts build. palloc issues a 512-byte chunk for sizeof(ResultRelInfo)==368 on v16, so I expect no actual writing past the end of the chunk. I don't see apacheage allocating a ResultRelInfo other than via one palloc per ResultRelInfo (no arrays of them, no stack-allocated ResultRelInfo). I'll also work on installing apacheage to get those answers locally.
On Thu, Nov 14, 2024 at 10:26:58AM -0800, Noah Misch wrote: > On Thu, Nov 14, 2024 at 05:29:18PM +0100, Christoph Berg wrote: > > Re: Noah Misch > > > Based on a grep of PGXN code, here are some or all of the modules that react > > > to sizeof(ResultRelInfo): > > > > > > $ grepx -r 'lloc.*ResultRelInfo' | tee /tmp/1 | sed 's/-[^:]*/:/'|sort -u > > > apacheage:: resultRelInfo = palloc(sizeof(ResultRelInfo)); > > > > Confirmed, crashing: AGE for 14..17 (12..13 seem fine) > > Can you share more about the crash, perhaps the following? > > - stack trace > - any server messages just before the crash > - which architecture (32-bit x86, 64-bit ARM, etc.) > - asserts enabled, or not? > > It's not immediately to clear to me why this would crash in a non-asserts > build. palloc issues a 512-byte chunk for sizeof(ResultRelInfo)==368 on v16, > so I expect no actual writing past the end of the chunk. I don't see > apacheage allocating a ResultRelInfo other than via one palloc per > ResultRelInfo (no arrays of them, no stack-allocated ResultRelInfo). I'll > also work on installing apacheage to get those answers locally. On x86_64, I ran these with and without asserts: install PostgreSQL 16.4 install https://github.com/apache/age/tree/master make -C age installcheck install PostgreSQL 16.5 make -C age installcheck The non-asserts build passed. The asserts build failed with "+WARNING: problem in alloc set ExecutorState: detected write past chunk" throughout the diffs, but there were no crashes. (Note that AGE "make installcheck" creates a temporary installation, unlike PostgreSQL "make installcheck".) What might differ in how you tested?
Noah Misch <noah@leadboat.com> writes: > It's not immediately to clear to me why this would crash in a non-asserts > build. palloc issues a 512-byte chunk for sizeof(ResultRelInfo)==368 on v16, > so I expect no actual writing past the end of the chunk. I'm confused too. The allocation should be big enough. The other hazard would be failing to initialize the field, but if the extension uses InitResultRelInfo then that's taken care of. regards, tom lane
Re: Noah Misch > The non-asserts build passed. The asserts build failed with "+WARNING: > problem in alloc set ExecutorState: detected write past chunk" throughout the > diffs, but there were no crashes. (Note that AGE "make installcheck" creates > a temporary installation, unlike PostgreSQL "make installcheck".) What might > differ in how you tested? The builds for sid (Debian unstable) are cassert-enabled and there the tests threw a lot of these errors. It didn't actually crash, true. 15:25:47 SELECT * FROM cypher('cypher_index', $$ MATCH(n) DETACH DELETE n $$) AS (a agtype); 15:25:47 +WARNING: problem in alloc set ExecutorState: detected write past chunk end in block 0x55993d8ad710, chunk 0x55993d8ae320 15:25:47 +WARNING: problem in alloc set ExecutorState: detected write past chunk end in block 0x55993d8ad710, chunk 0x55993d8aeab0 15:25:47 +WARNING: problem in alloc set ExecutorState: detected write past chunk end in block 0x55993d88b730, chunk 0x55993d88d348 15:25:47 +WARNING: problem in alloc set ExecutorState: detected write past chunk end in block 0x55993d8ad710, chunk 0x55993d8ae320 15:25:47 +WARNING: problem in alloc set ExecutorState: detected write past chunk end in block 0x55993d8ad710, chunk 0x55993d8aeab0 15:25:47 +WARNING: problem in alloc set ExecutorState: detected write past chunk end in block 0x55993d88b730, chunk 0x55993d88d348 15:25:47 a 15:25:47 --- 15:25:47 (0 rows) https://pgdgbuild.dus.dg-i.net/job/postgresql-16-age-autopkgtest/17/architecture=amd64,distribution=sid/consoleFull I did not test other distribution releases because the test run stopped after this touchstone test. Christoph
Christoph Berg <myon@debian.org> writes: > Re: Noah Misch >> The non-asserts build passed. The asserts build failed with "+WARNING: >> problem in alloc set ExecutorState: detected write past chunk" throughout the >> diffs, but there were no crashes. (Note that AGE "make installcheck" creates >> a temporary installation, unlike PostgreSQL "make installcheck".) What might >> differ in how you tested? > The builds for sid (Debian unstable) are cassert-enabled and there the > tests threw a lot of these errors. > It didn't actually crash, true. Ah. So perhaps this is a non-issue for production (non-assert) builds? That would change the urgency materially, IMO. regards, tom lane
On 2024-Nov-14, Christoph Berg wrote: > It didn't actually crash, true. But timescale did crash: https://pgdgbuild.dus.dg-i.net/job/timescaledb-autopkgtest/9/architecture=amd64,distribution=sid/console 11:59:51 @@ -34,6 +40,7 @@ 11:59:51 CREATE INDEX ON _hyper_1_3_chunk(temp2); 11:59:51 -- INSERT only will not trigger vacuum on indexes for PG13.3+ 11:59:51 UPDATE vacuum_test SET time = time + '1s'::interval, temp1 = random(), temp2 = random(); 11:59:51 --- we should see two parallel workers for each chunk 11:59:51 -VACUUM (PARALLEL 3) vacuum_test; 11:59:51 -DROP TABLE vacuum_test; 11:59:51 +server closed the connection unexpectedly 11:59:51 + This probably means the server terminated abnormally 11:59:51 + before or while processing the request. 11:59:51 +connection to server was lost -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "El Maquinismo fue proscrito so pena de cosquilleo hasta la muerte" (Ijon Tichy en Viajes, Stanislaw Lem)
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > On 2024-Nov-14, Christoph Berg wrote: >> It didn't actually crash, true. > But timescale did crash: So what is timescale doing differently? regards, tom lane
Hi, > The allocation should be big enough. The other > hazard would be failing to initialize the field, but if the extension > uses InitResultRelInfo then that's taken care of. > So what is timescale doing differently? I see 3 usages of makeNode(ResultRelInfo) in Timescale: - src/nodes/chunk_dispatch/chunk_insert_state.c - src/copy.c - src/ts_catalog/catalog.c In the first case it's followed by InitResultRelInfo(). In the second - by ExecInitResultRelation() in its turn calls InitResultRelInfo(). The third case is the following: ``` extern TSDLLEXPORT ResultRelInfo * ts_catalog_open_indexes(Relation heapRel) { ResultRelInfo *resultRelInfo; resultRelInfo = makeNode(ResultRelInfo); resultRelInfo->ri_RangeTableIndex = 0; /* dummy */ resultRelInfo->ri_RelationDesc = heapRel; resultRelInfo->ri_TrigDesc = NULL; /* we don't fire triggers */ ExecOpenIndices(resultRelInfo, false); return resultRelInfo; } ``` Where it's used from there is hard to track but it looks like this is the reason why the new field ri_needLockTagTuple is not initialized. I'll pass this piece of information to my colleagues. -- Best regards, Aleksander Alekseev
On 2024-Nov-14, Tom Lane wrote: > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > On 2024-Nov-14, Christoph Berg wrote: > >> It didn't actually crash, true. > > > But timescale did crash: > > So what is timescale doing differently? No idea. Maybe a stacktrace from a core would tell us more; but the jenkins task doesn't seem to have server logs or anything else to gives us more clues. There are three makeNode(ResultRelInfo), but they don't look anything out of the ordinary: C perhan: timescaledb 0$ git grep -C5 makeNode.ResultRelInfo src/copy.c- * src/copy.c- * WARNING. The dummy rangetable index is decremented by 1 (unchecked) src/copy.c- * inside `ExecConstraints` so unless you want to have a overflow, keep it src/copy.c- * above zero. See `rt_fetch` in parsetree.h. src/copy.c- */ src/copy.c: resultRelInfo = makeNode(ResultRelInfo); src/copy.c- src/copy.c-#if PG16_LT src/copy.c- ExecInitRangeTable(estate, pstate->p_rtable); src/copy.c-#else src/copy.c- Assert(pstate->p_rteperminfos != NULL); -- src/nodes/chunk_dispatch/chunk_insert_state.c-create_chunk_result_relation_info(const ChunkDispatch *dispatch, Relation rel) src/nodes/chunk_dispatch/chunk_insert_state.c-{ src/nodes/chunk_dispatch/chunk_insert_state.c- ResultRelInfo *rri; src/nodes/chunk_dispatch/chunk_insert_state.c- ResultRelInfo *rri_orig = dispatch->hypertable_result_rel_info; src/nodes/chunk_dispatch/chunk_insert_state.c- Index hyper_rti = rri_orig->ri_RangeTableIndex; src/nodes/chunk_dispatch/chunk_insert_state.c: rri = makeNode(ResultRelInfo); src/nodes/chunk_dispatch/chunk_insert_state.c- src/nodes/chunk_dispatch/chunk_insert_state.c- InitResultRelInfo(rri, rel, hyper_rti, NULL, dispatch->estate->es_instrument); src/nodes/chunk_dispatch/chunk_insert_state.c- src/nodes/chunk_dispatch/chunk_insert_state.c- /* Copy options from the main table's (hypertable's) result relation info*/ src/nodes/chunk_dispatch/chunk_insert_state.c- rri->ri_WithCheckOptions = rri_orig->ri_WithCheckOptions; -- src/ts_catalog/catalog.c-extern TSDLLEXPORT ResultRelInfo * src/ts_catalog/catalog.c-ts_catalog_open_indexes(Relation heapRel) src/ts_catalog/catalog.c-{ src/ts_catalog/catalog.c- ResultRelInfo *resultRelInfo; src/ts_catalog/catalog.c- src/ts_catalog/catalog.c: resultRelInfo = makeNode(ResultRelInfo); src/ts_catalog/catalog.c- resultRelInfo->ri_RangeTableIndex = 0; /* dummy */ src/ts_catalog/catalog.c- resultRelInfo->ri_RelationDesc = heapRel; src/ts_catalog/catalog.c- resultRelInfo->ri_TrigDesc = NULL; /* we don't fire triggers */ src/ts_catalog/catalog.c- src/ts_catalog/catalog.c- ExecOpenIndices(resultRelInfo, false); I didn't catch any other allocations that would depend on the size of ResultRelInfo in obvious ways. Oh, hmm, there's this bit which uses struct assignment into a stack element: tsl/src/compression/compression.c-1559- if (decompressor->indexstate->ri_NumIndices > 0) tsl/src/compression/compression.c-1560- { tsl/src/compression/compression.c:1561: ResultRelInfo indexstate_copy = *decompressor->indexstate; tsl/src/compression/compression.c-1562- Relation single_index_relation; I find this rather suspect. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "La espina, desde que nace, ya pincha" (Proverbio africano)
Hi Alvaro, > There are three makeNode(ResultRelInfo), but they don't look anything > out of the ordinary: > [...] > src/ts_catalog/catalog.c-extern TSDLLEXPORT ResultRelInfo * > src/ts_catalog/catalog.c-ts_catalog_open_indexes(Relation heapRel) > src/ts_catalog/catalog.c-{ > src/ts_catalog/catalog.c- ResultRelInfo *resultRelInfo; > src/ts_catalog/catalog.c- > src/ts_catalog/catalog.c: resultRelInfo = makeNode(ResultRelInfo); > src/ts_catalog/catalog.c- resultRelInfo->ri_RangeTableIndex = 0; /* dummy */ > src/ts_catalog/catalog.c- resultRelInfo->ri_RelationDesc = heapRel; > src/ts_catalog/catalog.c- resultRelInfo->ri_TrigDesc = NULL; /* we don't fire triggers */ > src/ts_catalog/catalog.c- > src/ts_catalog/catalog.c- ExecOpenIndices(resultRelInfo, false); ExecOpenIndices() will not fill the ri_needLockTagTuple field for the caller and it will contain garbage. Since you assumed this is the correct usage, maybe it should? -- Best regards, Aleksander Alekseev
Aleksander Alekseev <aleksander@timescale.com> writes: > The third case is the following: > ``` > extern TSDLLEXPORT ResultRelInfo * > ts_catalog_open_indexes(Relation heapRel) > { > ResultRelInfo *resultRelInfo; > resultRelInfo = makeNode(ResultRelInfo); > resultRelInfo->ri_RangeTableIndex = 0; /* dummy */ > resultRelInfo->ri_RelationDesc = heapRel; > resultRelInfo->ri_TrigDesc = NULL; /* we don't fire triggers */ > ExecOpenIndices(resultRelInfo, false); > return resultRelInfo; > } > ``` I'd call that a bug in timescale. It has no business assuming that it can skip InitResultRelInfo. regards, tom lane
On Thu, Nov 14, 2024 at 09:33:32PM +0100, Alvaro Herrera wrote: > On 2024-Nov-14, Tom Lane wrote: > > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > > But timescale did crash: > > > > So what is timescale doing differently? > src/ts_catalog/catalog.c-extern TSDLLEXPORT ResultRelInfo * > src/ts_catalog/catalog.c-ts_catalog_open_indexes(Relation heapRel) > src/ts_catalog/catalog.c-{ > src/ts_catalog/catalog.c- ResultRelInfo *resultRelInfo; > src/ts_catalog/catalog.c- > src/ts_catalog/catalog.c: resultRelInfo = makeNode(ResultRelInfo); > src/ts_catalog/catalog.c- resultRelInfo->ri_RangeTableIndex = 0; /* dummy */ > src/ts_catalog/catalog.c- resultRelInfo->ri_RelationDesc = heapRel; > src/ts_catalog/catalog.c- resultRelInfo->ri_TrigDesc = NULL; /* we don't fire triggers */ > src/ts_catalog/catalog.c- > src/ts_catalog/catalog.c- ExecOpenIndices(resultRelInfo, false); This is a copy of PostgreSQL's CatalogOpenIndexes(). Assuming timescaledb uses it like PostgreSQL uses CatalogOpenIndexes(), it's fine. Specifically, CatalogOpenIndexes() is fine since PostgreSQL does not pass the ResultRelInfo to ModifyTable. > Oh, hmm, there's this bit which uses struct assignment into a stack > element: Yes, stack allocation of a ResultRelInfo sounds relevant to the crash. It qualifies as a "very unusual code pattern" in the postgr.es/c/e54a42a sense. I'm hearing the only confirmed impact on non-assert builds is the need to recompile timescaledb. (It's unknown whether recompiling will suffice for timescaledb. For assert builds, six PGXN extensions need recompilation.) I don't see us issuing another set of back branch releases for the purpose of making a v16.4-built timescaledb avoid a rebuild. The target audience would be someone who can get a new PostgreSQL build but can't get a new timescaledb build. That feels like a small audience. What's missing in that analysis? Going forward, for struct field additions, I plan to make my own patches more ABI-breakage-averse than the postgr.es/c/e54a42a standard.
On Thu, Nov 14, 2024 at 5:41 PM Noah Misch <noah@leadboat.com> wrote:
I'm hearing the only confirmed impact on non-assert builds is the need to
recompile timescaledb. (It's unknown whether recompiling will suffice for
timescaledb. For assert builds, six PGXN extensions need recompilation.)
That matches what our build and test teams are seeing.
We dug into the two lines of impacted Citus code, they are just touching columnar metadata. We dragged Marco into a late night session to double check that with the Citus columnar regression tests and look for red flags in the code. In an assert Citus built against 16.4 running against PostgreSQL 16.5, he hit the assert warnings, but the tests pass and there's no signs or suspicion of a functional impact:
CREATE TABLE columnar_table_1 (a int) USING columnar;
INSERT INTO columnar_table_1 VALUES (1);
+WARNING: problem in alloc set Stripe Write Memory Context: detected write past chunk end in block 0x563ee43a4f10, chunk 0x563ee43a6240
+WARNING: problem in alloc set Stripe Write Memory Context: detected write past chunk end in block 0x563ee4369bb0, chunk 0x563ee436acb0
+WARNING: problem in alloc set Stripe Write Memory Context: detected write past chunk end in block 0x563ee4369bb0, chunk 0x563ee436b3c8
CREATE TABLE columnar_table_1 (a int) USING columnar;
INSERT INTO columnar_table_1 VALUES (1);
+WARNING: problem in alloc set Stripe Write Memory Context: detected write past chunk end in block 0x563ee43a4f10, chunk 0x563ee43a6240
+WARNING: problem in alloc set Stripe Write Memory Context: detected write past chunk end in block 0x563ee4369bb0, chunk 0x563ee436acb0
+WARNING: problem in alloc set Stripe Write Memory Context: detected write past chunk end in block 0x563ee4369bb0, chunk 0x563ee436b3c8
Thanks to everyone who's jumped in to investigate here. With the PL/Perl CVE at an 8.8, sorting out how to get that fix to everyone and navigate the breakage is very important.
--
Greg Smith, Crunchy Data
Director of Open Source Strategy
On Fri, Nov 15, 2024 at 1:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Noah Misch <noah@leadboat.com> writes:
> It's not immediately to clear to me why this would crash in a non-asserts
> build. palloc issues a 512-byte chunk for sizeof(ResultRelInfo)==368 on v16,
> so I expect no actual writing past the end of the chunk.
I'm confused too. The allocation should be big enough. The other
hazard would be failing to initialize the field, but if the extension
uses InitResultRelInfo then that's taken care of.
I should have mentioned in my original post that our limited PGD tests are passing too. But I wasn't sure if the problem may hit us in the field, given the subtleness of the memory corruption. But it's quite comforting to read Noah's analysis about why this could be a non-issue for non-assert builds.
Thanks,
Pavan
Hi, > > src/ts_catalog/catalog.c-extern TSDLLEXPORT ResultRelInfo * > > src/ts_catalog/catalog.c-ts_catalog_open_indexes(Relation heapRel) > > src/ts_catalog/catalog.c-{ > > src/ts_catalog/catalog.c- ResultRelInfo *resultRelInfo; > > src/ts_catalog/catalog.c- > > src/ts_catalog/catalog.c: resultRelInfo = makeNode(ResultRelInfo); > > src/ts_catalog/catalog.c- resultRelInfo->ri_RangeTableIndex = 0; /* dummy */ > > src/ts_catalog/catalog.c- resultRelInfo->ri_RelationDesc = heapRel; > > src/ts_catalog/catalog.c- resultRelInfo->ri_TrigDesc = NULL; /* we don't fire triggers */ > > src/ts_catalog/catalog.c- > > src/ts_catalog/catalog.c- ExecOpenIndices(resultRelInfo, false); > > This is a copy of PostgreSQL's CatalogOpenIndexes(). Assuming timescaledb > uses it like PostgreSQL uses CatalogOpenIndexes(), it's fine. Specifically, > CatalogOpenIndexes() is fine since PostgreSQL does not pass the ResultRelInfo > to ModifyTable. Yes. Initially I thought this was done for compatibility reasons with different PG versions, but this doesn't seem to be the case since CatalogOpenIndexes() was in the core for a long time. > > Oh, hmm, there's this bit which uses struct assignment into a stack > > element: > > Yes, stack allocation of a ResultRelInfo sounds relevant to the crash. It > qualifies as a "very unusual code pattern" in the postgr.es/c/e54a42a sense. Yes, this is another issue. I'm working with the TimescaleDB dev team to fix these issues on the TimescaleDB side. -- Best regards, Aleksander Alekseev
Hi, On Thu, Nov 14, 2024 at 11:35:25AM -0500, Peter Geoghegan wrote: > On Thu, Nov 14, 2024 at 11:29 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > ISTM that we have spare bytes where we could place that boolean without > > breaking ABI. In x86_64 there's a couple of 4-byte holes, but those > > won't be there on x86, so not great candidates. Fortunately there are > > 3-byte and 7-byte holes also, which we can use safely. We can move the > > new boolean to those location. > > Wasn't this part of the official guidelines? Do you mean in [1] (Maintaining ABI compatibility while backpatching section)? If so, it looks like it's not currently mentioned. What about? " You cannot modify any struct definition in src/include/*. If any new members must be added to a struct, put them in the padding holes (if any) in backbranches (or at the end of the struct if there is no padding holes available). " Also we may need a decision table to ensure 32-bit portability based on the hole size on 64-bit, something like? 64-bit hole size | use on 32-bit? -----------------|--------------- <=3 bytes | safe to use 4 bytes | don't use 5-7 bytes | use first (hole_size - 4) bytes only Does that make sense? Regards, [1]: https://wiki.postgresql.org/wiki/Committing_checklist -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 2024-Nov-14, Noah Misch wrote: > I'm hearing the only confirmed impact on non-assert builds is the need to > recompile timescaledb. (It's unknown whether recompiling will suffice for > timescaledb. For assert builds, six PGXN extensions need recompilation.) I > don't see us issuing another set of back branch releases for the purpose of > making a v16.4-built timescaledb avoid a rebuild. The target audience would > be someone who can get a new PostgreSQL build but can't get a new timescaledb > build. That feels like a small audience. What's missing in that analysis? I agree with your conclusion that no rewrap is needed. I previously said otherwise, based on claims that there were multiple extensions causing crashes. If the one crash we know about is because timescaledb is using an unusual coding pattern, they can fix that more easily than we can. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Los dioses no protegen a los insensatos. Éstos reciben protección de otros insensatos mejor dotados" (Luis Wu, Mundo Anillo)
On Fri, Nov 15, 2024 at 9:57 AM Aleksander Alekseev <aleksander@timescale.com> wrote: > I'm working with the TimescaleDB dev team to fix these issues on the > TimescaleDB side. I looked a bit at this out of interest. I see an assert failure in the lines below when running tests with TimescaleDB compiled against 17.0 with 17.1 installed. Without the assertion it would anyway segfault below. /* * Usually, mt_lastResultIndex matches the target rel. If it happens not * to, we can get the index the hard way with an integer division. */ whichrel = mtstate->mt_lastResultIndex; if (resultRelInfo != mtstate->resultRelInfo + whichrel) { whichrel = resultRelInfo - mtstate->resultRelInfo; Assert(whichrel >= 0 && whichrel < mtstate->mt_nrels); } updateColnos = (List *) list_nth(node->updateColnosLists, whichrel); The problem here is that because TimescaleDB compiled against 17.0 assumes a struct size of 376 (on my laptop) while PostgreSQL allocated the array with a struct size of 384, so the pointer math no longer holds and the whichrel value becomes nonsense. (1736263376 for whatever reason) cheers, Marco
Hi Macro, > I looked a bit at this out of interest. I see an assert failure in the > lines below when running tests with TimescaleDB compiled against 17.0 > with 17.1 installed. Without the assertion it would anyway segfault > below. > > /* > * Usually, mt_lastResultIndex matches the target rel. If it happens not > * to, we can get the index the hard way with an integer division. > */ > whichrel = mtstate->mt_lastResultIndex; > if (resultRelInfo != mtstate->resultRelInfo + whichrel) > { > whichrel = resultRelInfo - mtstate->resultRelInfo; > Assert(whichrel >= 0 && whichrel < mtstate->mt_nrels); > } > > updateColnos = (List *) list_nth(node->updateColnosLists, whichrel); > > The problem here is that because TimescaleDB compiled against 17.0 > assumes a struct size of 376 (on my laptop) while PostgreSQL allocated > the array with a struct size of 384, so the pointer math no longer > holds and the whichrel value becomes nonsense. (1736263376 for > whatever reason) Thanks for reporting. Yes, the code assumed fixed sizeof(ResultRelInfo) within a given PG major release branch which turned out not to be the case. We will investigate whether it can be easily fixed on TimescaleDB side. -- Best regards, Aleksander Alekseev
On Thu, Nov 14, 2024 at 5:13 PM Peter Eisentraut <peter@eisentraut.org> wrote:
On 14.11.24 15:35, Noah Misch wrote:
> The postgr.es/c/e54a42a standard would have us stop here. But I'm open to
> treating the standard as mistaken and changing things.
That text explicitly calls out that adding struct members at the end of
a struct is considered okay. But thinking about it now, even adding
fields to the end of a node struct that extensions allocate using
makeNode() is an ABI break that is liable to cause all affected
extensions to break in a crashing way.
I think it was mentioned elsewhere, but this wouldn't be a problem if makeNode was not a macro.
Best wishes,
Mats Kindahl, Timescale
On Thu, Nov 14, 2024 at 9:33 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2024-Nov-14, Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > On 2024-Nov-14, Christoph Berg wrote:
> >> It didn't actually crash, true.
>
> > But timescale did crash:
>
> So what is timescale doing differently?
No idea. Maybe a stacktrace from a core would tell us more; but the
jenkins task doesn't seem to have server logs or anything else to gives
us more clues.
Here is one stack trace from my failure. It's a little inconclusive, but maybe better brains than mine might realize something:
(gdb) bt#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007ab10c44526e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007ab10c4288ff in __GI_abort () at ./stdlib/abort.c:79
#5 0x000056b2b0439b70 in ExceptionalCondition (conditionName=conditionName@entry=0x7ab10cad0280 "whichrel >= 0 && whichrel < mtstate->mt_nrels",
fileName=fileName@entry=0x7ab10cad0170 "/home/mats/work/timescale/timescaledb+tam/src/nodes/hypertable_modify.c", lineNumber=lineNumber@entry=1249) at assert.c:66
#6 0x00007ab10caa0c90 in ExecInitUpdateProjection (mtstate=mtstate@entry=0x56b2b0bbf8f0, resultRelInfo=resultRelInfo@entry=0x56b2b0bbfc80) at /home/mats/work/timescale/timescaledb+tam/src/nodes/hypertable_modify.c:1249
#7 0x00007ab10caa385e in ExecModifyTable (cs_node=cs_node@entry=0x56b2b0bbf2e0, pstate=0x56b2b0bbf8f0) at /home/mats/work/timescale/timescaledb+tam/src/nodes/hypertable_modify.c:940
#8 0x00007ab10caa3a83 in hypertable_modify_exec (node=0x56b2b0bbf2e0) at /home/mats/work/timescale/timescaledb+tam/src/nodes/hypertable_modify.c:174
#9 0x000056b2b011bcba in ExecCustomScan (pstate=<optimized out>) at nodeCustom.c:121
#10 0x000056b2b0107147 in ExecProcNodeFirst (node=0x56b2b0bbf2e0) at execProcnode.c:464
#11 0x000056b2b00fed07 in ExecProcNode (node=node@entry=0x56b2b0bbf2e0) at ../../../src/include/executor/executor.h:274
#12 0x000056b2b00fed9b in ExecutePlan (estate=estate@entry=0x56b2b0bbf030, planstate=0x56b2b0bbf2e0, use_parallel_mode=<optimized out>, operation=operation@entry=CMD_UPDATE, sendTuples=sendTuples@entry=false,
numberTuples=numberTuples@entry=0, direction=ForwardScanDirection, dest=0x56b2b0ba6a68, execute_once=true) at execMain.c:1654
#13 0x000056b2b00ffb6e in standard_ExecutorRun (queryDesc=0x56b2b0b96780, direction=ForwardScanDirection, count=0, execute_once=execute_once@entry=true) at execMain.c:365
#14 0x000056b2b00ffcb2 in ExecutorRun (queryDesc=queryDesc@entry=0x56b2b0b96780, direction=direction@entry=ForwardScanDirection, count=count@entry=0, execute_once=execute_once@entry=true) at execMain.c:306
#15 0x000056b2b02f108a in ProcessQuery (plan=plan@entry=0x56b2b0ba6958, sourceText=<optimized out>, params=0x0, queryEnv=0x0, dest=dest@entry=0x56b2b0ba6a68, qc=qc@entry=0x7ffc452f04e0) at pquery.c:160
#16 0x000056b2b02f1bf2 in PortalRunMulti (portal=portal@entry=0x56b2b0af24d0, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x56b2b0ba6a68, altdest=altdest@entry=0x56b2b0ba6a68,
qc=qc@entry=0x7ffc452f04e0) at pquery.c:1277
#17 0x000056b2b02f210f in PortalRun (portal=portal@entry=0x56b2b0af24d0, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x56b2b0ba6a68,
altdest=altdest@entry=0x56b2b0ba6a68, qc=0x7ffc452f04e0) at pquery.c:791
#18 0x000056b2b02ee17d in exec_simple_query (query_string=query_string@entry=0x56b2b0a5c9e0 "update ts_continuous_test SET val = 5 where time > 15 and time < 25;") at postgres.c:1278
#19 0x000056b2b02f01a7 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4767
#20 0x000056b2b02e959e in BackendMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at backend_startup.c:105
#21 0x000056b2b023f955 in postmaster_child_launch (child_type=child_type@entry=B_BACKEND, startup_data=startup_data@entry=0x7ffc452f0714 "", startup_data_len=startup_data_len@entry=4, client_sock=client_sock@entry=0x7ffc452f0750)
at launch_backend.c:277
#22 0x000056b2b024411e in BackendStartup (client_sock=client_sock@entry=0x7ffc452f0750) at postmaster.c:3593
#23 0x000056b2b02443a1 in ServerLoop () at postmaster.c:1674
#24 0x000056b2b0245a15 in PostmasterMain (argc=argc@entry=8, argv=argv@entry=0x56b2b0a16860) at postmaster.c:1372
#25 0x000056b2b0161207 in main (argc=8, argv=0x56b2b0a16860) at main.c:197
There are three makeNode(ResultRelInfo), but they don't look anything
out of the ordinary:
C perhan: timescaledb 0$ git grep -C5 makeNode.ResultRelInfo
src/copy.c- *
src/copy.c- * WARNING. The dummy rangetable index is decremented by 1 (unchecked)
src/copy.c- * inside `ExecConstraints` so unless you want to have a overflow, keep it
src/copy.c- * above zero. See `rt_fetch` in parsetree.h.
src/copy.c- */
src/copy.c: resultRelInfo = makeNode(ResultRelInfo);
src/copy.c-
src/copy.c-#if PG16_LT
src/copy.c- ExecInitRangeTable(estate, pstate->p_rtable);
src/copy.c-#else
src/copy.c- Assert(pstate->p_rteperminfos != NULL);
--
src/nodes/chunk_dispatch/chunk_insert_state.c-create_chunk_result_relation_info(const ChunkDispatch *dispatch, Relation rel)
src/nodes/chunk_dispatch/chunk_insert_state.c-{
src/nodes/chunk_dispatch/chunk_insert_state.c- ResultRelInfo *rri;
src/nodes/chunk_dispatch/chunk_insert_state.c- ResultRelInfo *rri_orig = dispatch->hypertable_result_rel_info;
src/nodes/chunk_dispatch/chunk_insert_state.c- Index hyper_rti = rri_orig->ri_RangeTableIndex;
src/nodes/chunk_dispatch/chunk_insert_state.c: rri = makeNode(ResultRelInfo);
src/nodes/chunk_dispatch/chunk_insert_state.c-
src/nodes/chunk_dispatch/chunk_insert_state.c- InitResultRelInfo(rri, rel, hyper_rti, NULL, dispatch->estate->es_instrument);
src/nodes/chunk_dispatch/chunk_insert_state.c-
src/nodes/chunk_dispatch/chunk_insert_state.c- /* Copy options from the main table's (hypertable's) result relation info */
src/nodes/chunk_dispatch/chunk_insert_state.c- rri->ri_WithCheckOptions = rri_orig->ri_WithCheckOptions;
--
src/ts_catalog/catalog.c-extern TSDLLEXPORT ResultRelInfo *
src/ts_catalog/catalog.c-ts_catalog_open_indexes(Relation heapRel)
src/ts_catalog/catalog.c-{
src/ts_catalog/catalog.c- ResultRelInfo *resultRelInfo;
src/ts_catalog/catalog.c-
src/ts_catalog/catalog.c: resultRelInfo = makeNode(ResultRelInfo);
src/ts_catalog/catalog.c- resultRelInfo->ri_RangeTableIndex = 0; /* dummy */
src/ts_catalog/catalog.c- resultRelInfo->ri_RelationDesc = heapRel;
src/ts_catalog/catalog.c- resultRelInfo->ri_TrigDesc = NULL; /* we don't fire triggers */
src/ts_catalog/catalog.c-
src/ts_catalog/catalog.c- ExecOpenIndices(resultRelInfo, false);
I didn't catch any other allocations that would depend on the size of
ResultRelInfo in obvious ways.
Oh, hmm, there's this bit which uses struct assignment into a stack
element:
tsl/src/compression/compression.c-1559- if (decompressor->indexstate->ri_NumIndices > 0)
tsl/src/compression/compression.c-1560- {
tsl/src/compression/compression.c:1561: ResultRelInfo indexstate_copy = *decompressor->indexstate;
tsl/src/compression/compression.c-1562- Relation single_index_relation;
I find this rather suspect.
Yes, I agree, trying to figure out other ways to deal with this piece of code.
It is not possible to make a copy of the ResultRelInfo node here since copyObject() does not support T_ResultRelInfo. Trying to use CatalogOpenIndexes() causes another warning.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"La espina, desde que nace, ya pincha" (Proverbio africano)
Best wishes,
Mats Kindahl, Timescale
On Thu, Nov 14, 2024 at 11:41 PM Noah Misch <noah@leadboat.com> wrote:
On Thu, Nov 14, 2024 at 09:33:32PM +0100, Alvaro Herrera wrote:
> On 2024-Nov-14, Tom Lane wrote:
> > Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > > But timescale did crash:
> >
> > So what is timescale doing differently?
> src/ts_catalog/catalog.c-extern TSDLLEXPORT ResultRelInfo *
> src/ts_catalog/catalog.c-ts_catalog_open_indexes(Relation heapRel)
> src/ts_catalog/catalog.c-{
> src/ts_catalog/catalog.c- ResultRelInfo *resultRelInfo;
> src/ts_catalog/catalog.c-
> src/ts_catalog/catalog.c: resultRelInfo = makeNode(ResultRelInfo);
> src/ts_catalog/catalog.c- resultRelInfo->ri_RangeTableIndex = 0; /* dummy */
> src/ts_catalog/catalog.c- resultRelInfo->ri_RelationDesc = heapRel;
> src/ts_catalog/catalog.c- resultRelInfo->ri_TrigDesc = NULL; /* we don't fire triggers */
> src/ts_catalog/catalog.c-
> src/ts_catalog/catalog.c- ExecOpenIndices(resultRelInfo, false);
This is a copy of PostgreSQL's CatalogOpenIndexes(). Assuming timescaledb
uses it like PostgreSQL uses CatalogOpenIndexes(), it's fine. Specifically,
CatalogOpenIndexes() is fine since PostgreSQL does not pass the ResultRelInfo
to ModifyTable.
This seems to be old code, so I have replaced it with CatalogOpenIndexes (and friends) and it seems to work fine.
> Oh, hmm, there's this bit which uses struct assignment into a stack
> element:
Yes, stack allocation of a ResultRelInfo sounds relevant to the crash. It
qualifies as a "very unusual code pattern" in the postgr.es/c/e54a42a sense.
This is a lesson for us to make sure that we have better checks for these kinds of situations. We should be able to deal with the extension by allocating, e.g., twice the needed memory for the struct (and make sure to zero it out). Then there is no risk of stepping on other object's data.
I'm hearing the only confirmed impact on non-assert builds is the need to
recompile timescaledb. (It's unknown whether recompiling will suffice for
timescaledb. For assert builds, six PGXN extensions need recompilation.)
I re-compiled and ran with the same version on the server and extension and that seems to work fine. Nothing that stands out. (We have a few other things that behave strange, so might have to take that back.)
I don't see us issuing another set of back branch releases for the purpose of
making a v16.4-built timescaledb avoid a rebuild. The target audience would
be someone who can get a new PostgreSQL build but can't get a new timescaledb
build. That feels like a small audience. What's missing in that analysis?
Going forward, for struct field additions, I plan to make my own patches more
ABI-breakage-averse than the postgr.es/c/e54a42a standard.
Best wishes,
Mats Kindahl, Timescale
On Fri, Nov 15, 2024 at 12:51 PM Marco Slot <marco.slot@gmail.com> wrote:
On Fri, Nov 15, 2024 at 9:57 AM Aleksander Alekseev
<aleksander@timescale.com> wrote:
> I'm working with the TimescaleDB dev team to fix these issues on the
> TimescaleDB side.
I looked a bit at this out of interest. I see an assert failure in the
lines below when running tests with TimescaleDB compiled against 17.0
with 17.1 installed. Without the assertion it would anyway segfault
below.
/*
* Usually, mt_lastResultIndex matches the target rel. If it happens not
* to, we can get the index the hard way with an integer division.
*/
whichrel = mtstate->mt_lastResultIndex;
if (resultRelInfo != mtstate->resultRelInfo + whichrel)
{
whichrel = resultRelInfo - mtstate->resultRelInfo;
Assert(whichrel >= 0 && whichrel < mtstate->mt_nrels);
}
updateColnos = (List *) list_nth(node->updateColnosLists, whichrel);
The problem here is that because TimescaleDB compiled against 17.0
assumes a struct size of 376 (on my laptop) while PostgreSQL allocated
the array with a struct size of 384, so the pointer math no longer
holds and the whichrel value becomes nonsense. (1736263376 for
whatever reason)
That was one of the failures that we had, but we also have a copy of the ExecModifyTable() function and this line would in that case be a problem:
607
608 /* Preload local variables */
609 resultRelInfo = node->resultRelInfo + node->mt_lastResultIndex;
610 subplanstate = outerPlanState(node);
611
--
Best wishes,
Mats Kindahl, Timescale
Re: Noah Misch > I'm hearing the only confirmed impact on non-assert builds is the need to > recompile timescaledb. (It's unknown whether recompiling will suffice for > timescaledb. For assert builds, six PGXN extensions need recompilation.) Which 6 extensions are these? I re-ran the tests on all the extensions on apt.pg.o and didn't see any failures (timescale and age are already rebuilt). Christoph
On Fri, Nov 15, 2024 at 02:45:42PM +0100, Christoph Berg wrote: > Re: Noah Misch > > I'm hearing the only confirmed impact on non-assert builds is the need to > > recompile timescaledb. (It's unknown whether recompiling will suffice for > > timescaledb. For assert builds, six PGXN extensions need recompilation.) > > Which 6 extensions are these? > > I re-ran the tests on all the extensions on apt.pg.o and didn't see > any failures (timescale and age are already rebuilt). I counted wrong, it's the five here: https://postgr.es/m/20241114143524.91.nmisch@google.com They are AGE and the four others you found not to be part of apt.pg.o. (timescaledb is not part of PGXN, so it's not one of the five.)
Aleksander Alekseev <aleksander@timescale.com> writes: > Hi Macro, >> The problem here is that because TimescaleDB compiled against 17.0 >> assumes a struct size of 376 (on my laptop) while PostgreSQL allocated >> the array with a struct size of 384, so the pointer math no longer >> holds and the whichrel value becomes nonsense. (1736263376 for >> whatever reason) > Thanks for reporting. Yes, the code assumed fixed > sizeof(ResultRelInfo) within a given PG major release branch which > turned out not to be the case. We will investigate whether it can be > easily fixed on TimescaleDB side. Yeah, the array-stride problem seems extremely hard to work around, because whichever size it is, you can't get code compiled with the other size to work. I believe ResultRelInfo is the only node type we use arrays of, so this was a particularly unfortunate place to break ABI, but there it is. I'm starting to lean to the opinion that we need a re-wrap. Given that padding holes exist, the code changes shouldn't be big. regards, tom lane
On Fri, Nov 15, 2024 at 10:09:54AM -0500, Tom Lane wrote: > Aleksander Alekseev <aleksander@timescale.com> writes: > > Hi Macro, > >> The problem here is that because TimescaleDB compiled against 17.0 > >> assumes a struct size of 376 (on my laptop) while PostgreSQL allocated > >> the array with a struct size of 384, so the pointer math no longer > >> holds and the whichrel value becomes nonsense. (1736263376 for > >> whatever reason) > > > Thanks for reporting. Yes, the code assumed fixed > > sizeof(ResultRelInfo) within a given PG major release branch which > > turned out not to be the case. We will investigate whether it can be > > easily fixed on TimescaleDB side. > > Yeah, the array-stride problem seems extremely hard to work around, > because whichever size it is, you can't get code compiled with the > other size to work. I believe ResultRelInfo is the only node type > we use arrays of, so this was a particularly unfortunate place > to break ABI, but there it is. I see ModifyTableState.resultRelInfo; are there other known ResultRelInfo arrays? That does add firebird_fdw to the list of PGXN modules requiring rebuilds: $ grep -rE 'resultRelInfo(\[|.* \+ )' | tee /tmp/3 | sed 's/-[^:]*/:/'|sort -u firebird_fdw:: resultRelation = mtstate->resultRelInfo[0].ri_RangeTableIndex; firebird_fdw:: resultRelInfo > mtstate->resultRelInfo + mtstate->mt_whichplan) > I'm starting to lean to the opinion that we need a re-wrap. Perhaps. Even if we do rewrap for some reason, it's not a given that restoring the old struct size is net beneficial. If we restore the old struct size in v16.6, those who rebuild for v16.5 would need to rebuild again. Hearing about other ResultRelInfo arrays will help clarify that decision. Either way, users of timescaledb should rebuild timescaledb for every future PostgreSQL minor release.
On Fri, Nov 15, 2024 at 11:22 PM Noah Misch <noah@leadboat.com> wrote:
> I'm starting to lean to the opinion that we need a re-wrap.
Perhaps. Even if we do rewrap for some reason, it's not a given that
restoring the old struct size is net beneficial. If we restore the old struct
size in v16.6, those who rebuild for v16.5 would need to rebuild again.
Hearing about other ResultRelInfo arrays will help clarify that decision.
Looking more carefully at the usage of `ResultRelInfo` in the PGD code, I think we might also be impacted by it. At one place, we loop through the `es_result_relations` array and a size mismatch there will cause problems. Interestingly, in v14 and above, we read from `es_opened_result_relations`, which is a List, so it should be safe. I will try some tests on v13 to see if they result in crashes. But it seems quite likely by reading the code.
Thanks,
Pavan
On Nov 15, 2024, at 12:52, Noah Misch <noah@leadboat.com> wrote: > Either way, users of timescaledb should rebuild timescaledb for every future > PostgreSQL minor release. Ugh, was really hoping to get to a place where we could avoid requiring rebuilds of any extension for every minor release.:-( We even added ABI guidance to the docs[1] about this. Best, David [1]: https://commitfest.postgresql.org/48/5080/
On Fri, Nov 15, 2024 at 11:39:24PM +0530, Pavan Deolasee wrote: > On Fri, Nov 15, 2024 at 11:22 PM Noah Misch <noah@leadboat.com> wrote: > > > I'm starting to lean to the opinion that we need a re-wrap. > > > > Perhaps. Even if we do rewrap for some reason, it's not a given that > > restoring the old struct size is net beneficial. If we restore the old > > struct > > size in v16.6, those who rebuild for v16.5 would need to rebuild again. > > Hearing about other ResultRelInfo arrays will help clarify that decision. > > Looking more carefully at the usage of `ResultRelInfo` in the PGD code, I > think we might also be impacted by it. At one place, we loop through the > `es_result_relations` array and a size mismatch there will cause problems. > Interestingly, in v14 and above, we read from `es_opened_result_relations`, > which is a List, so it should be safe. I will try some tests on v13 to see > if they result in crashes. But it seems quite likely by reading the code. I agree. I'm investigating the scope of damage from es_result_relations.
On Fri, Nov 15, 2024 at 11:39 PM Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
Looking more carefully at the usage of `ResultRelInfo` in the PGD code, I think we might also be impacted by it. At one place, we loop through the `es_result_relations` array and a size mismatch there will cause problems. Interestingly, in v14 and above, we read from `es_opened_result_relations`, which is a List, so it should be safe. I will try some tests on v13 to see if they result in crashes. But it seems quite likely by reading the code.
Ah, the addition of a member to `ResultRelInfo` did not happen in v12 and v13, even though the commit was backpatched all the way to v12. Maybe we (PGD) got twice lucky :-) There could be other extensions which might be looping through `es_result_relations` though and get impacted.
Thanks,
Pavan
On Sat, Nov 16, 2024 at 12:10:06AM +0530, Pavan Deolasee wrote: > On Fri, Nov 15, 2024 at 11:39 PM Pavan Deolasee <pavan.deolasee@gmail.com> wrote: > > Looking more carefully at the usage of `ResultRelInfo` in the PGD code, I > > think we might also be impacted by it. At one place, we loop through the > > `es_result_relations` array and a size mismatch there will cause problems. > > Interestingly, in v14 and above, we read from `es_opened_result_relations`, > > which is a List, so it should be safe. I will try some tests on v13 to see > > if they result in crashes. But it seems quite likely by reading the code. > > > Ah, the addition of a member to `ResultRelInfo` did not happen in v12 and > v13, even though the commit was backpatched all the way to v12. Maybe we True. > (PGD) got twice lucky :-) There could be other extensions which might be > looping through `es_result_relations` though and get impacted. Like you say, trouble with es_result_relations would be a v12/v13 phenomenon, and v12/v13 ABI didn't change. If the v12/v13 ABI had changed here, that would have moved pg_pathman from the "rebuild if using asserts" category to the "rebuild unconditionally" category, due to this code: pg_pathman:: estate->es_num_result_relations * sizeof(ResultRelInfo)); pg_pathman:: estate->es_result_relations[estate->es_num_result_relations] = *rri; pg_pathman:: if (result_rels_allocated <= estate->es_num_result_relations) pg_pathman:: return estate->es_num_result_relations++; Since the v12/v13 ABI didn't change, pg_pathman remains in the "rebuild if using asserts" category.
Noah Misch <noah@leadboat.com> writes: > On Fri, Nov 15, 2024 at 10:09:54AM -0500, Tom Lane wrote: >> I'm starting to lean to the opinion that we need a re-wrap. > Perhaps. Even if we do rewrap for some reason, it's not a given that > restoring the old struct size is net beneficial. If we restore the old struct > size in v16.6, those who rebuild for v16.5 would need to rebuild again. I think what we should say is "sorry, 16.5 is broken for use with these extensions, use another minor version". If we don't undo the struct size change then 16.5 is effectively a major version update for affected extensions: they cannot build a binary release that works with both older and newer minor releases. That's a packaging disaster, especially if it impacts more than timescale. The more so if more than one release branch is affected. > Either way, users of timescaledb should rebuild timescaledb for every future > PostgreSQL minor release. We really don't want that either. I recall that somebody (Peter E?) had been looking into tools for automatically checking ABI compatibility in stable branches. My takeaway from this mess is that we need to move the priority of that project way up. regards, tom lane
After some quick checking with "git diff", I can confirm that there is no ABI break for ResultRelInfo in v12 or v13. To fix it in v14-v17, we could do this: diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h index 418c81f4be..17b0ec5138 100644 --- a/src/include/nodes/execnodes.h +++ b/src/include/nodes/execnodes.h @@ -484,6 +484,9 @@ typedef struct ResultRelInfo /* Have the projection and the slots above been initialized? */ bool ri_projectNewInfoValid; + /* updates do LockTuple() before oldtup read; see README.tuplock */ + bool ri_needLockTagTuple; + /* triggers to be fired, if any */ TriggerDesc *ri_TrigDesc; @@ -592,9 +595,6 @@ typedef struct ResultRelInfo * one of its ancestors; see ExecCrossPartitionUpdateForeignKey(). */ List *ri_ancestorResultRels; - - /* updates do LockTuple() before oldtup read; see README.tuplock */ - bool ri_needLockTagTuple; } ResultRelInfo; /* ---------------- (The next-to-last field varies across branches, but this general idea will work.) In other words, put ri_needLockTagTuple in the same place it is in HEAD. In other words, our current guidelines for preserving ABI compatibility actually *created* this disaster, because the HEAD change was fine from an ABI standpoint but what was done in back branches was not. So we do need to rethink how that's worded. regards, tom lane
On Fri, Nov 15, 2024 at 03:29:21PM -0500, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > On Fri, Nov 15, 2024 at 10:09:54AM -0500, Tom Lane wrote: > >> I'm starting to lean to the opinion that we need a re-wrap. > > > Perhaps. Even if we do rewrap for some reason, it's not a given that > > restoring the old struct size is net beneficial. If we restore the old struct > > size in v16.6, those who rebuild for v16.5 would need to rebuild again. > > I think what we should say is "sorry, 16.5 is broken for use with > these extensions, use another minor version". If we don't undo the > struct size change then 16.5 is effectively a major version update for > affected extensions: they cannot build a binary release that works > with both older and newer minor releases. That's a packaging > disaster, especially if it impacts more than timescale. The more > so if more than one release branch is affected. Currently, we have Christoph Berg writing "I'd say the ship has sailed, a new release would now break things the other way round." and you writing in favor of undoing. It think it boils down to whether you want N people to recompile twice or M>N people to recompile once, where we don't know N or M except that M > N. Fortunately, the N are probably fairly well represented in this thread. So to all: please speak up by 2024-11-16T17:00+0000 if you think it's the wrong choice to bring back the v16.4 ABI and tell people to rebuild extensions built against v16.5 (likewise for corresponding versions of v14-v17). Currently, the plan of record is to do that.
Noah Misch <noah@leadboat.com> writes: > Currently, we have Christoph Berg writing "I'd say the ship has sailed, a new > release would now break things the other way round." and you writing in favor > of undoing. It think it boils down to whether you want N people to recompile > twice or M>N people to recompile once, where we don't know N or M except that > M > N. Fortunately, the N are probably fairly well represented in this > thread. So to all: please speak up by 2024-11-16T17:00+0000 if you think it's > the wrong choice to bring back the v16.4 ABI and tell people to rebuild > extensions built against v16.5 (likewise for corresponding versions of > v14-v17). Currently, the plan of record is to do that. Well, no, what I propose is for some number of people to not recompile extensions at all. Only the sort of early adopters who install a new PG release on day one will have run into this, so I anticipate that that number will be much larger than the number who have already done a rebuild. regards, tom lane
On Fri, Nov 15, 2024 at 05:37:01PM -0500, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > Currently, we have Christoph Berg writing "I'd say the ship has sailed, a new > > release would now break things the other way round." and you writing in favor > > of undoing. It think it boils down to whether you want N people to recompile > > twice or M>N people to recompile once, where we don't know N or M except that > > M > N. Fortunately, the N are probably fairly well represented in this > > thread. So to all: please speak up by 2024-11-16T17:00+0000 if you think it's > > the wrong choice to bring back the v16.4 ABI and tell people to rebuild > > extensions built against v16.5 (likewise for corresponding versions of > > v14-v17). Currently, the plan of record is to do that. > > Well, no, what I propose is for some number of people to not recompile > extensions at all. Only the sort of early adopters who install a new > PG release on day one will have run into this, so I anticipate that > that number will be much larger than the number who have already done > a rebuild. If you're saying that undoing the ABI break would allow the M builders to rebuild zero times, I agree. In other words, here are the number of rebuilds by builder group: If we undo, group N: 2 rebuilds If we undo, group M: 0 rebuilds If we don't undo, group N: 1 rebuild If we don't undo, group M: 1 rebuild I think you're predicting that M is much larger than N. That may be so. A builder will be in group N if they publish or consume 2024-11-14 builds for any amount of time. A builder is in group M if they skip the 2024-11-14 release entirely. Note that people getting binaries from someone else don't increase the size of N or M; builders of binaries (both packagers and self-packaging users) are the populations to count here. In this thread, Christoph Berg and Gregory Smith have asserted membership in that group N. We can indeed advise builders to wait in group M instead of joining group N, but N is already nonempty. Also, some builders and users of their builds can't afford to wait. That's why I described it the way I did. I'm no longer arguing against the undo, but I'm trying to explain the consequences rigorously.
On Nov 15, 2024, at 16:13, Tom Lane <tgl@sss.pgh.pa.us> wrote: > In other words, our current guidelines > for preserving ABI compatibility actually *created* this disaster, > because the HEAD change was fine from an ABI standpoint but what > was done in back branches was not. So we do need to rethink how > that's worded. What bit is mis-worded? The guidance Peter committed[1] says that “PostgreSQL makes an effort to avoid server ABI breaks in minor releases.” It sounds to me like that effort wasn’t held up in back-branches, the sources for minor releases. But maybe you had some other guidance in mind? D [1]: https://github.com/postgres/postgres/commit/e54a42a
"David E. Wheeler" <david@justatheory.com> writes: > On Nov 15, 2024, at 16:13, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> In other words, our current guidelines >> for preserving ABI compatibility actually *created* this disaster, >> because the HEAD change was fine from an ABI standpoint but what >> was done in back branches was not. So we do need to rethink how >> that's worded. > What bit is mis-worded? The guidance Peter committed[1] says that “PostgreSQL makes an effort to avoid server > ABI breaks in minor releases.” That text says exactly nothing about what specific code changes to make or not make. I'm not sure offhand where (or if) we have this documented, but there's an idea that adding fields at the end of a struct is safer ABI-wise than putting them in the middle. Which is true if you can't squeeze them into padding space. Here, that could have been done and probably should have. The other bit of documentation we probably need is some annotation in struct ResultRelInfo saying "do not change the sizeof() this struct in back branches, period". regards, tom lane
Re: Noah Misch > I'm no longer arguing against the undo, but I'm trying to explain the > consequences rigorously. I'll just have to explain what happened here to the Debian security team who just released the 16.5 update. We can do 16.6 on top of that. Christoph
On Nov 15, 2024, at 19:30, Tom Lane <tgl@sss.pgh.pa.us> wrote: > That text says exactly nothing about what specific code changes to > make or not make. I'm not sure offhand where (or if) we have this > documented, but there's an idea that adding fields at the end of > a struct is safer ABI-wise than putting them in the middle. Which > is true if you can't squeeze them into padding space. Here, that > could have been done and probably should have. > > The other bit of documentation we probably need is some annotation in > struct ResultRelInfo saying "do not change the sizeof() this struct > in back branches, period”. This sounds like complementary documentation for committers; totally agree the guidance should be written down somewhere. D
Christoph Berg <myon@debian.org> writes: > Re: Noah Misch >> I'm no longer arguing against the undo, but I'm trying to explain the >> consequences rigorously. > I'll just have to explain what happened here to the Debian security > team who just released the 16.5 update. We can do 16.6 on top of that. Fixes are pushed, and we'll wrap new releases next week on the usual release schedule. regards, tom lane
[ getting back to the document-ABI-breakage-rules-better topic ... ] I wrote: > That text says exactly nothing about what specific code changes to > make or not make. I'm not sure offhand where (or if) we have this > documented, but there's an idea that adding fields at the end of > a struct is safer ABI-wise than putting them in the middle. Which > is true if you can't squeeze them into padding space. Here, that > could have been done and probably should have. I remembered where that's documented: https://wiki.postgresql.org/wiki/Committing_checklist#Maintaining_ABI_compatibility_while_backpatching The current text there is: * You can only really change the signature of a function with local linkage, perhaps with a few rare exceptions. * You cannot modify any struct definition in src/include/*. If any new members must be added to a struct, put them at the end in backbranches. It's okay to have a different struct layout in master. Even then, extensions that allocate the struct can break via a dependency on its size. * Move new enum values to the end. I propose rewriting and expanding that: * Don't change the signature of non-static functions. One common workaround is to change the existing function into a wrapper that calls a new function with more/different arguments. * Don't change the contents of globally-visible structs, specifically not the offsets of existing fields. If you must add a new field, the very best way is to put it into existing alignment padding between fields. (But consider both 32-bit and 64-bit cases when deciding what is "padding".) If that's not possible, it may be okay to add the field at the end of the struct; but you have to worry about whether any extensions depend on the size of the struct. This will not work well if extensions allocate new instances of the struct. One particular gotcha is that it's not okay to change sizeof(ResultRelInfo), as there are executor APIs that require extensions to index into arrays of ResultRelInfos. * Add new enum entries at the end of the enum's list, so that existing entries don't change value. * These rules only apply to released branches. In master, or in a new branch that has not yet reached RC status, it's better to place new fields and enum values in natural positions. Changing function signatures is fine too, unless there are so many callers that a compatibility wrapper seems advisable. > The other bit of documentation we probably need is some annotation in > struct ResultRelInfo saying "do not change the sizeof() this struct > in back branches, period". Something like /* * DO NOT change sizeof(ResultRelInfo) in a released branch. This * generally makes it unsafe to add fields here in a released branch. */ at the end of struct ResultRelInfo's definition. None of this is a substitute for installing some kind of ABI-checking infrastructure; but that project doesn't seem to be moving fast. regards, tom lane
Hi, On Mon, Nov 25, 2024 at 08:56:50PM -0500, Tom Lane wrote: > [ getting back to the document-ABI-breakage-rules-better topic ... ] > > I wrote: > > That text says exactly nothing about what specific code changes to > > make or not make. I'm not sure offhand where (or if) we have this > > documented, but there's an idea that adding fields at the end of > > a struct is safer ABI-wise than putting them in the middle. Which > > is true if you can't squeeze them into padding space. Here, that > > could have been done and probably should have. > > I remembered where that's documented: > > https://wiki.postgresql.org/wiki/Committing_checklist#Maintaining_ABI_compatibility_while_backpatching > > I propose rewriting and expanding that: > > * Don't change the contents of globally-visible structs, specifically > not the offsets of existing fields. If you must add a new field, > the very best way is to put it into existing alignment padding > between fields. (But consider both 32-bit and 64-bit cases when > deciding what is "padding".) What about providing a decision table to help considering for 32-bit, something like (proposed in [1])? 64-bit hole size | use on 32-bit? -----------------|--------------- <=3 bytes | safe to use 4 bytes | don't use 5-7 bytes | use first (hole_size - 4) bytes only [1]: https://www.postgresql.org/message-id/flat/ZzcR%2BoQmUOIm6RVF%40ip-10-97-1-34.eu-west-3.compute.internal#08182ae6a6719632acf52fe4d90e9778 Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Tue, 26 Nov 2024 at 02:57, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > [ getting back to the document-ABI-breakage-rules-better topic ... ] > > I wrote: > > That text says exactly nothing about what specific code changes to > > make or not make. I'm not sure offhand where (or if) we have this > > documented, but there's an idea that adding fields at the end of > > a struct is safer ABI-wise than putting them in the middle. Which > > is true if you can't squeeze them into padding space. Here, that > > could have been done and probably should have. > > I remembered where that's documented: > > https://wiki.postgresql.org/wiki/Committing_checklist#Maintaining_ABI_compatibility_while_backpatching > > The current text there is: > > > * You can only really change the signature of a function with local > linkage, perhaps with a few rare exceptions. > > * You cannot modify any struct definition in src/include/*. If any new > members must be added to a struct, put them at the end in > backbranches. It's okay to have a different struct layout in > master. Even then, extensions that allocate the struct can break via > a dependency on its size. > > * Move new enum values to the end. > > > I propose rewriting and expanding that: > > > * Don't change the signature of non-static functions. One common > workaround is to change the existing function into a wrapper that > calls a new function with more/different arguments. > > * Don't change the contents of globally-visible structs, specifically > not the offsets of existing fields. If you must add a new field, > the very best way is to put it into existing alignment padding > between fields. (But consider both 32-bit and 64-bit cases when > deciding what is "padding".) If that's not possible, it may be okay > to add the field at the end of the struct; but you have to worry > about whether any extensions depend on the size of the struct. > This will not work well if extensions allocate new instances of the > struct. One particular gotcha is that it's not okay to change > sizeof(ResultRelInfo), as there are executor APIs that require > extensions to index into arrays of ResultRelInfos. I would also add something like the following, to protect against corruption and deserialization errors of the catalog pg_node_tree fields, because right now the generated readfuncs.c code ignores field names, assumes ABI field order, and is generally quite sensitive to binary changes, which can cause corruption and/or errors during readback of those nodes: * If you update the contents of Nodes which are stored on disk as pg_node_tree, you also have to make sure that the read function for that node type is able to read both the old and the new serialized format. A reference to readfuncs.c/readfuncs.c, and/or the usage of pg_node_tree for (among others) views, index/default expressions, constraints, and partition bounds would maybe be useful as well. > None of this is a substitute for installing some kind of ABI-checking > infrastructure; Agreed. Kind regards, Matthias van de Meent Neon (https://neon.tech)
On Nov 25, 2024, at 20:56, Tom Lane <tgl@sss.pgh.pa.us> wrote: > None of this is a substitute for installing some kind of ABI-checking > infrastructure; but that project doesn't seem to be moving fast. These sound great, very useful. I wonder if the guideline docs Peter added should link to these items (and back). D
Bertrand Drouvot <bertranddrouvot.pg@gmail.com> writes: > On Mon, Nov 25, 2024 at 08:56:50PM -0500, Tom Lane wrote: >> ... (But consider both 32-bit and 64-bit cases when >> deciding what is "padding".) > What about providing a decision table to help considering for 32-bit, something > like (proposed in [1])? > 64-bit hole size | use on 32-bit? > -----------------|--------------- > <=3 bytes | safe to use > 4 bytes | don't use > 5-7 bytes | use first (hole_size - 4) bytes only Mumble ... that seems too simplistic to me. Admittedly, I can't offhand think of an inter-platform variation that would move field offsets by something other than a multiple of 4 bytes, so maybe it's correct. In any case, I think the purpose of these notes is just to remind people of issues to think about, not to provide complete solution recipes, so I'd rather not go into that much detail. regards, tom lane
Matthias van de Meent <boekewurm+postgres@gmail.com> writes: > I would also add something like the following, to protect against > corruption and deserialization errors of the catalog pg_node_tree > fields, because right now the generated readfuncs.c code ignores field > names, assumes ABI field order, and is generally quite sensitive to > binary changes, which can cause corruption and/or errors during > readback of those nodes: > * If you update the contents of Nodes which are stored on disk as > pg_node_tree, you also have to make sure that the read function for > that node type is able to read both the old and the new serialized > format. No, actually that policy is "You cannot change the contents of Node structs at all if they could appear in stored views, rules, default expressions, etc". We don't do catversion bumps in back branches, which means that catalog contents have to be backwards as well as forwards compatible across minor releases. That's not really something that belongs under the heading of ABI breaks, I think. It's about "the on-disk representation of catalogs is frozen within a release series", which extends to more things than Nodes. Still, maybe a parenthetical remark wouldn't hurt. Perhaps: (Keep in mind also that Node structs can't be changed at all if they might appear in stored views, rules, etc.) regards, tom lane
On Mon, Nov 25, 2024 at 08:56:50PM -0500, Tom Lane wrote: > [ getting back to the document-ABI-breakage-rules-better topic ... ] > > I wrote: > > That text says exactly nothing about what specific code changes to > > make or not make. I'm not sure offhand where (or if) we have this > > documented, but there's an idea that adding fields at the end of > > a struct is safer ABI-wise than putting them in the middle. Which > > is true if you can't squeeze them into padding space. Here, that > > could have been done and probably should have. > > I remembered where that's documented: > > https://wiki.postgresql.org/wiki/Committing_checklist#Maintaining_ABI_compatibility_while_backpatching I'd say the source tree's <sect2 id="xfunc-api-abi-stability-guidance">, which David Wheeler mentioned, is authoritative. It currently matches postgr.es/c/e54a42a. Let's update both or change that wiki heading to point to the authoritative doc section. > I propose rewriting and expanding that: > > > * Don't change the signature of non-static functions. One common > workaround is to change the existing function into a wrapper that > calls a new function with more/different arguments. Sounds fine. > * Don't change the contents of globally-visible structs, specifically > not the offsets of existing fields. If you must add a new field, > the very best way is to put it into existing alignment padding > between fields. (But consider both 32-bit and 64-bit cases when > deciding what is "padding".) If that's not possible, it may be okay > to add the field at the end of the struct; but you have to worry > about whether any extensions depend on the size of the struct. > This will not work well if extensions allocate new instances of the > struct. One particular gotcha is that it's not okay to change > sizeof(ResultRelInfo), as there are executor APIs that require > extensions to index into arrays of ResultRelInfos. We should be able to avoid ResultRelInfo length changes, since there are other executor structs to choose from. I think this thread implies an open question about assert builds. Whenever we add to the end of a struct that extensions heap-allocate (makeNode() or otherwise), assert builds will get "write past chunk end" warnings (postgr.es/m/ZzZVgkqveG0sVGww@msg.df7cb.de). Should we say that (a) assert builds need to rebuild with every minor release, that (b) PostgreSQL will treat this like it treats non-assert rebuild requirements, or something else? If (a), we committers also need to confirm that each lengthening of an extension-allocated struct doesn't change the resulting palloc chunk size. Corresponding doc paragraphs: <para> When a change <emphasis>is</emphasis> required, <productname>PostgreSQL</productname> will choose the least invasive change possible, for example by squeezing a new field into padding space or appending it to the end of a struct. These sorts of changes should not impact extensions unless they use very unusual code patterns. </para> <para> In rare cases, however, even such non-invasive changes may be impractical or impossible. In such an event, the change will be carefully managed, taking the requirements of extensions into account. Such changes will also be documented in the release notes (<xref linkend="release"/>). </para> I think those paragraphs aren't practical for packager needs, or they leave too much undefined. There's no available algorithm for classifying extensions as "unusual". timescaledb is objectively unusual, being the one extension known to break in response to a ResultRelInfo size change in v17. That observation requires knowing what will change, so it wouldn't have helped a packager classify in advance. For non-unusual extensions, the second paragraph implies packagers should check release notes on Monday, being ready to rebuild by Thursday. Assuming nobody wants to create a rebuild procedure in 3 days, the packager will have a procedure already. If they have a rebuild procedure, they're better off rebuilding every time. That way, they don't need to scrutinize the release notes under time pressure, and they're protected even if hackers overlook the incompatibility. Those are the incentives our docs create today, right or wrong. I gather we want packagers to feel comfortable not rebuilding every time. One way to achieve that could be text like: If a change is suspected to require extension rebuilds in well-known extensions, a release at least 3 months before the breaking release will include "<specific phrase>" in its release notes. Then packagers who are taking the risk of not rebuilding every time will have 3 months to prepare, not the 3 days we're currently giving. The point about "well-known extensions" is based on my practice of grepping PGXN. That would not have found timescaledb. Should we name PGXN explicitly, or should we be vague like that draft? I'd be comfortable naming any number of repositories that make it equally easy to bulk-download the whole repository. How else might we make it safe for packagers to skip rebuilding extensions for the majority of minor releases? > * Add new enum entries at the end of the enum's list, so that existing > entries don't change value. > > * These rules only apply to released branches. In master, or in a new > branch that has not yet reached RC status, it's better to place new > fields and enum values in natural positions. Changing function > signatures is fine too, unless there are so many callers that > a compatibility wrapper seems advisable. Those two sound fine. > > The other bit of documentation we probably need is some annotation in > > struct ResultRelInfo saying "do not change the sizeof() this struct > > in back branches, period". > > Something like > > /* > * DO NOT change sizeof(ResultRelInfo) in a released branch. This > * generally makes it unsafe to add fields here in a released branch. > */ > > at the end of struct ResultRelInfo's definition. I agree with adding a comment to the end of the struct. Thanks for writing all this. I'd also have that comment refer to ModifyTableState.resultRelInfo specifically. That way, if the ModifyTable implementation changes to remove this hazard, it's easier to trace the ability to remove this comment. > None of this is a substitute for installing some kind of ABI-checking > infrastructure; but that project doesn't seem to be moving fast. Right, they're complementary. The documentation is about what ABI checker complaints we choose to ignore. An extension buildfarm is another complementary idea that comes up occasionally.
On Nov 29, 2024, at 15:39, Noah Misch <noah@leadboat.com> wrote: > Then packagers who are taking the risk of not rebuilding every time will have > 3 months to prepare, not the 3 days we're currently giving. The point about > "well-known extensions" is based on my practice of grepping PGXN. That would > not have found timescaledb. Should we name PGXN explicitly, or should we be > vague like that draft? I'd be comfortable naming any number of repositories > that make it equally easy to bulk-download the whole repository. I’m wondering how we can help more projects make releases on PGXN, if for now other reason than to enable these kinds ofchecks without much additional effort. PGXN releases don’t require a ton of work, requiring just a META.json file in azip file, and there are GitHub workflows to automate such releases. Anyone interested please hit me up directly for details.I even make pull requests like this one for pg_partman[1]. Best, David [1]: https://github.com/pgpartman/pg_partman/pull/671/files
On 29.11.24 21:39, Noah Misch wrote: >> I remembered where that's documented: >> >> https://wiki.postgresql.org/wiki/ >> Committing_checklist#Maintaining_ABI_compatibility_while_backpatching > I'd say the source tree's <sect2 id="xfunc-api-abi-stability-guidance">, which > David Wheeler mentioned, is authoritative. It currently matches > postgr.es/c/e54a42a. Let's update both or change that wiki heading to point > to the authoritative doc section. I don't know if this proposing to merge the text in the wiki into the docs? The text in the docs is describing to extension authors what they can expect. The text in the wiki is advice for server developers how to attempt to satisfy those expectations. So I think those can be separate pieces of text.
Peter Eisentraut <peter@eisentraut.org> writes: > On 29.11.24 21:39, Noah Misch wrote: >> I'd say the source tree's <sect2 id="xfunc-api-abi-stability-guidance">, which >> David Wheeler mentioned, is authoritative. It currently matches >> postgr.es/c/e54a42a. Let's update both or change that wiki heading to point >> to the authoritative doc section. > I don't know if this proposing to merge the text in the wiki into the docs? > The text in the docs is describing to extension authors what they can > expect. The text in the wiki is advice for server developers how to > attempt to satisfy those expectations. So I think those can be separate > pieces of text. Yes, that was my feeling as well. The text on the wiki contemplates that we'd merge it into the main docs whenever it seems stable enough, but I'm not sure we're there yet (this discussion suggests not ;-)). In any case it feels like it should be a different chunk of text than xfunc-api-abi-stability-guidance, which is aimed at extension developers not core developers. regards, tom lane
On Wed, Dec 04, 2024 at 08:29:57PM +0100, Peter Eisentraut wrote: > On 29.11.24 21:39, Noah Misch wrote: > > > I remembered where that's documented: > > > > > > https://wiki.postgresql.org/wiki/Committing_checklist#Maintaining_ABI_compatibility_while_backpatching > > I'd say the source tree's <sect2 id="xfunc-api-abi-stability-guidance">, which > > David Wheeler mentioned, is authoritative. It currently matches > > postgr.es/c/e54a42a. Let's update both or change that wiki heading to point > > to the authoritative doc section. I hereby revoke the last sentence and replace it with "Let's update both in some way." The revoked sentence and its replacement are probably the least important thing I wrote in that mail. > I don't know if this proposing to merge the text in the wiki into the docs? The revoked proposal wasn't that specific. 1576843.1732586210@sss.pgh.pa.us offered some wiki changes mostly about struct fields. Doc section xfunc-api-abi-stability-guidance also writes about struct fields. Specifically, it discusses "squeezing a new field into padding space or appending it to the end of a struct". If we change the wiki about struct fields, we should not leave that quote unchanged. I no longer wish to affect how we allocate text to the wiki vs. xfunc-api-abi-stability-guidance.