Potential ABI breakage in upcoming minor releases
Hello,
Commit 51ff46de29f67d73549b2858f57e77ada8513369 (backported all the way
back to v12) added a new member to `ResultRelInfo` struct. This can
potentially cause ABI breakage for the extensions that allocate the struct
and pass it down to the PG code. The previously built extensions may
allocate a shorter struct, while the new PG code would expect a larger
struct, thus overwriting some memory unintentionally.
A better approach may have been what Tom did in
8cd190e13a22dab12e86f7f1b59de6b9b128c784, but I understand it might be too
late to change this since the releases are already tagged. Nevertheless, I
thought of bringing it up if others have different views.
Thanks,
Pavan
On Thu, Nov 14, 2024 at 04:18:02PM +0530, Pavan Deolasee wrote:
Commit 51ff46de29f67d73549b2858f57e77ada8513369 (backported all the way
back to v12) added a new member to `ResultRelInfo` struct. This can
potentially cause ABI breakage for the extensions that allocate the struct
and pass it down to the PG code. The previously built extensions may
allocate a shorter struct, while the new PG code would expect a larger
struct, thus overwriting some memory unintentionally.A better approach may have been what Tom did in
8cd190e13a22dab12e86f7f1b59de6b9b128c784, but I understand it might be too
late to change this since the releases are already tagged. Nevertheless, I
thought of bringing it up if others have different views.
The release is now announced. We could issue new releases (likely next
Thursday) if there's a problem that warrants it.
Based on a grep of PGXN code, here are some or all of the modules that react
to sizeof(ResultRelInfo):
$ grepx -r 'lloc.*ResultRelInfo' | tee /tmp/1 | sed 's/-[^:]*/:/'|sort -u
apacheage:: resultRelInfo = palloc(sizeof(ResultRelInfo));
pg_pathman:: ResultRelInfo *resultRelInfo = (ResultRelInfo *) palloc(sizeof(ResultRelInfo));
$ grepx -r 'Node.*ResultRelInfo' | tee /tmp/2 | sed 's/-[^:]*/:/'|sort -u
apacheage:: cypher_node->resultRelInfo = makeNode(ResultRelInfo);
apacheage:: cypher_node->resultRelInfo = makeNode(ResultRelInfo);
citus:: resultRelInfo = makeNode(ResultRelInfo);
citus:: ResultRelInfo *resultRelInfo = makeNode(ResultRelInfo);
pg_bulkload:: checker->resultRelInfo = makeNode(ResultRelInfo);
pg_bulkload:: self->relinfo = makeNode(ResultRelInfo);
pg_pathman:: child_result_rel_info = makeNode(ResultRelInfo);
pg_pathman:: parent_result_rel = makeNode(ResultRelInfo);
pg_pathman:: parent_rri = makeNode(ResultRelInfo);
pg_pathman:: part_result_rel_info = makeNode(ResultRelInfo);
vops:: resultRelInfo = makeNode(ResultRelInfo);
I don't know whether we should make a new release, amend the release
announcement to call for extension rebuilds, or just stop here.
https://wiki.postgresql.org/wiki/Committing_checklist#Maintaining_ABI_compatibility_while_backpatching
mentions the problem, but neither it nor the new standard at
postgr.es/c/e54a42a say how reticent we'll be to add to the end of a struct on
which extensions do sizeof.
The postgr.es/c/e54a42a standard would have us stop here. But I'm open to
treating the standard as mistaken and changing things.
On Thu, 14 Nov 2024 at 16:35, Noah Misch <noah@leadboat.com> wrote:
Based on a grep of PGXN code, here are some or all of the modules that
react
to sizeof(ResultRelInfo):
To add to this list, Christoph Berg confirmed that timescaledb test suite
crashes. [1]https://pgdgbuild.dus.dg-i.net/job/timescaledb-autopkgtest/9/architecture=amd64,distribution=sid/console
Regards,
Ants Aasma
[1]: https://pgdgbuild.dus.dg-i.net/job/timescaledb-autopkgtest/9/architecture=amd64,distribution=sid/console
https://pgdgbuild.dus.dg-i.net/job/timescaledb-autopkgtest/9/architecture=amd64,distribution=sid/console
Hi,
To add to this list, Christoph Berg confirmed that timescaledb test suite crashes. [1]
Yes changing ResultRelInfo most definetely breaks TimescaleDB. The
extension uses makeNode(ResultRelInfo) and this can't end-up well:
```
static inline Node *
newNode(size_t size, NodeTag tag)
{
Node *result;
Assert(size >= sizeof(Node)); /* need the tag, at least */
result = (Node *) palloc0(size);
result->type = tag;
return result;
}
#define makeNode(_type_) ((_type_ *) newNode(sizeof(_type_),T_##_type_))
```
--
Best regards,
Aleksander Alekseev
On 14.11.24 15:35, Noah Misch wrote:
The postgr.es/c/e54a42a standard would have us stop here. But I'm open to
treating the standard as mistaken and changing things.
That text explicitly calls out that adding struct members at the end of
a struct is considered okay. But thinking about it now, even adding
fields to the end of a node struct that extensions allocate using
makeNode() is an ABI break that is liable to cause all affected
extensions to break in a crashing way.
Re: Noah Misch
Based on a grep of PGXN code, here are some or all of the modules that react
to sizeof(ResultRelInfo):$ grepx -r 'lloc.*ResultRelInfo' | tee /tmp/1 | sed 's/-[^:]*/:/'|sort -u
apacheage:: resultRelInfo = palloc(sizeof(ResultRelInfo));
Confirmed, crashing: AGE for 14..17 (12..13 seem fine)
citus:: resultRelInfo = makeNode(ResultRelInfo);
citus:: ResultRelInfo *resultRelInfo = makeNode(ResultRelInfo);
pg_bulkload:: checker->resultRelInfo = makeNode(ResultRelInfo);
pg_bulkload:: self->relinfo = makeNode(ResultRelInfo);
pg_pathman:: child_result_rel_info = makeNode(ResultRelInfo);
pg_pathman:: parent_result_rel = makeNode(ResultRelInfo);
pg_pathman:: parent_rri = makeNode(ResultRelInfo);
pg_pathman:: part_result_rel_info = makeNode(ResultRelInfo);
vops:: resultRelInfo = makeNode(ResultRelInfo);
(These are not on apt.pg.o)
I've also tested other packages where ResultRelInfo appears in the
source, but they all passed the tests (most have decent test but a few
have not):
Bad:
postgresql-16-age
timescaledb
Good:
hypopg
libpg-query
pglast
pglogical
pgpool2
pgsql-ogr-fdw
pg-squeeze
postgresql-mysql-fdw (impossible to test sanely because mysql/mariadb
take turns at being the default, and in some environments just don't
start)
I don't know whether we should make a new release, amend the release
announcement to call for extension rebuilds, or just stop here.
https://wiki.postgresql.org/wiki/Committing_checklist#Maintaining_ABI_compatibility_while_backpatching
mentions the problem, but neither it nor the new standard at
postgr.es/c/e54a42a say how reticent we'll be to add to the end of a struct on
which extensions do sizeof.
I'd say the ship has sailed, a new release would now break things the
other way round.
Christoph
Timescale at least has many users, I don't know about the others. But
they won't be happy if we let things be. IMO we should rewrap ...
especially 12, since there won't be a next one.
ISTM that we have spare bytes where we could place that boolean without
breaking ABI. In x86_64 there's a couple of 4-byte holes, but those
won't be there on x86, so not great candidates. Fortunately there are
3-byte and 7-byte holes also, which we can use safely. We can move the
new boolean to those location.
The holes are different in each branch unfortunately.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Use it up, wear it out, make it do, or do without"
On Thu, Nov 14, 2024 at 11:29 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
ISTM that we have spare bytes where we could place that boolean without
breaking ABI. In x86_64 there's a couple of 4-byte holes, but those
won't be there on x86, so not great candidates. Fortunately there are
3-byte and 7-byte holes also, which we can use safely. We can move the
new boolean to those location.
Wasn't this part of the official guidelines? I've been doing this all
along (e.g., in commit 3fa81b62e0).
The holes are different in each branch unfortunately.
Yeah, that'd make it a bit more complicated, but still doable. It
would be necessary to place the same field in seemingly random
locations on each backbranch.
FWIW recent versions of clangd will show me information about field
padding in struct annotations. I don't even have to run pahole.
--
Peter Geoghegan
On Thu, Nov 14, 2024 at 05:13:35PM +0100, Peter Eisentraut wrote:
On 14.11.24 15:35, Noah Misch wrote:
The postgr.es/c/e54a42a standard would have us stop here. But I'm open to
treating the standard as mistaken and changing things.That text explicitly calls out that adding struct members at the end of a
struct is considered okay. But thinking about it now, even adding fields to
the end of a node struct that extensions allocate using makeNode() is an ABI
break
Right. makeNode(), palloc(sizeof), and stack allocation have that problem.
Allocation wrappers like CreateExecutorState() avoid the problem. More
generally, structs allocated in non-extension code are fine.
Hi,
Right. makeNode(), palloc(sizeof), and stack allocation have that problem.
Allocation wrappers like CreateExecutorState() avoid the problem. More
generally, structs allocated in non-extension code are fine.
Perhaps we should consider adding an API like makeResultRelInfo() and
others, one per node. It's going to be boilerplate for sure, but
considering the fact that we already generate some code I don't really
see drawbacks.
--
Best regards,
Aleksander Alekseev
On Thu, Nov 14, 2024 at 9:43 PM Peter Eisentraut <peter@eisentraut.org>
wrote:
On 14.11.24 15:35, Noah Misch wrote:
The postgr.es/c/e54a42a standard would have us stop here. But I'm open
to
treating the standard as mistaken and changing things.
That text explicitly calls out that adding struct members at the end of
a struct is considered okay.
I think that assumption is ok when the struct is allocated by core and
passed to the extension. But if the struct is allocated (static or dynamic)
and passed to the core, we have to be extra careful.
But thinking about it now, even adding
fields to the end of a node struct that extensions allocate using
makeNode() is an ABI break that is liable to cause all affected
extensions to break in a crashing way.
Memory corruption issues can be quite subtle and hard to diagnose. So if
wrapping a new release is an option, I would vote for it. If we do that,
doing it for every impacted version would be more prudent. But I don't know
what are the policies around making a new release.
Thanks,
Pavan
On Thu, Nov 14, 2024 at 05:29:18PM +0100, Christoph Berg wrote:
Re: Noah Misch
Based on a grep of PGXN code, here are some or all of the modules that react
to sizeof(ResultRelInfo):$ grepx -r 'lloc.*ResultRelInfo' | tee /tmp/1 | sed 's/-[^:]*/:/'|sort -u
apacheage:: resultRelInfo = palloc(sizeof(ResultRelInfo));Confirmed, crashing: AGE for 14..17 (12..13 seem fine)
Can you share more about the crash, perhaps the following?
- stack trace
- any server messages just before the crash
- which architecture (32-bit x86, 64-bit ARM, etc.)
- asserts enabled, or not?
It's not immediately to clear to me why this would crash in a non-asserts
build. palloc issues a 512-byte chunk for sizeof(ResultRelInfo)==368 on v16,
so I expect no actual writing past the end of the chunk. I don't see
apacheage allocating a ResultRelInfo other than via one palloc per
ResultRelInfo (no arrays of them, no stack-allocated ResultRelInfo). I'll
also work on installing apacheage to get those answers locally.
On Thu, Nov 14, 2024 at 10:26:58AM -0800, Noah Misch wrote:
On Thu, Nov 14, 2024 at 05:29:18PM +0100, Christoph Berg wrote:
Re: Noah Misch
Based on a grep of PGXN code, here are some or all of the modules that react
to sizeof(ResultRelInfo):$ grepx -r 'lloc.*ResultRelInfo' | tee /tmp/1 | sed 's/-[^:]*/:/'|sort -u
apacheage:: resultRelInfo = palloc(sizeof(ResultRelInfo));Confirmed, crashing: AGE for 14..17 (12..13 seem fine)
Can you share more about the crash, perhaps the following?
- stack trace
- any server messages just before the crash
- which architecture (32-bit x86, 64-bit ARM, etc.)
- asserts enabled, or not?It's not immediately to clear to me why this would crash in a non-asserts
build. palloc issues a 512-byte chunk for sizeof(ResultRelInfo)==368 on v16,
so I expect no actual writing past the end of the chunk. I don't see
apacheage allocating a ResultRelInfo other than via one palloc per
ResultRelInfo (no arrays of them, no stack-allocated ResultRelInfo). I'll
also work on installing apacheage to get those answers locally.
On x86_64, I ran these with and without asserts:
install PostgreSQL 16.4
install https://github.com/apache/age/tree/master
make -C age installcheck
install PostgreSQL 16.5
make -C age installcheck
The non-asserts build passed. The asserts build failed with "+WARNING:
problem in alloc set ExecutorState: detected write past chunk" throughout the
diffs, but there were no crashes. (Note that AGE "make installcheck" creates
a temporary installation, unlike PostgreSQL "make installcheck".) What might
differ in how you tested?
Noah Misch <noah@leadboat.com> writes:
It's not immediately to clear to me why this would crash in a non-asserts
build. palloc issues a 512-byte chunk for sizeof(ResultRelInfo)==368 on v16,
so I expect no actual writing past the end of the chunk.
I'm confused too. The allocation should be big enough. The other
hazard would be failing to initialize the field, but if the extension
uses InitResultRelInfo then that's taken care of.
regards, tom lane
Re: Noah Misch
The non-asserts build passed. The asserts build failed with "+WARNING:
problem in alloc set ExecutorState: detected write past chunk" throughout the
diffs, but there were no crashes. (Note that AGE "make installcheck" creates
a temporary installation, unlike PostgreSQL "make installcheck".) What might
differ in how you tested?
The builds for sid (Debian unstable) are cassert-enabled and there the
tests threw a lot of these errors.
It didn't actually crash, true.
15:25:47 SELECT * FROM cypher('cypher_index', $$ MATCH(n) DETACH DELETE n $$) AS (a agtype);
15:25:47 +WARNING: problem in alloc set ExecutorState: detected write past chunk end in block 0x55993d8ad710, chunk 0x55993d8ae320
15:25:47 +WARNING: problem in alloc set ExecutorState: detected write past chunk end in block 0x55993d8ad710, chunk 0x55993d8aeab0
15:25:47 +WARNING: problem in alloc set ExecutorState: detected write past chunk end in block 0x55993d88b730, chunk 0x55993d88d348
15:25:47 +WARNING: problem in alloc set ExecutorState: detected write past chunk end in block 0x55993d8ad710, chunk 0x55993d8ae320
15:25:47 +WARNING: problem in alloc set ExecutorState: detected write past chunk end in block 0x55993d8ad710, chunk 0x55993d8aeab0
15:25:47 +WARNING: problem in alloc set ExecutorState: detected write past chunk end in block 0x55993d88b730, chunk 0x55993d88d348
15:25:47 a
15:25:47 ---
15:25:47 (0 rows)
I did not test other distribution releases because the test run
stopped after this touchstone test.
Christoph
Christoph Berg <myon@debian.org> writes:
Re: Noah Misch
The non-asserts build passed. The asserts build failed with "+WARNING:
problem in alloc set ExecutorState: detected write past chunk" throughout the
diffs, but there were no crashes. (Note that AGE "make installcheck" creates
a temporary installation, unlike PostgreSQL "make installcheck".) What might
differ in how you tested?
The builds for sid (Debian unstable) are cassert-enabled and there the
tests threw a lot of these errors.
It didn't actually crash, true.
Ah. So perhaps this is a non-issue for production (non-assert)
builds? That would change the urgency materially, IMO.
regards, tom lane
On 2024-Nov-14, Christoph Berg wrote:
It didn't actually crash, true.
But timescale did crash:
11:59:51 @@ -34,6 +40,7 @@
11:59:51 CREATE INDEX ON _hyper_1_3_chunk(temp2);
11:59:51 -- INSERT only will not trigger vacuum on indexes for PG13.3+
11:59:51 UPDATE vacuum_test SET time = time + '1s'::interval, temp1 = random(), temp2 = random();
11:59:51 --- we should see two parallel workers for each chunk
11:59:51 -VACUUM (PARALLEL 3) vacuum_test;
11:59:51 -DROP TABLE vacuum_test;
11:59:51 +server closed the connection unexpectedly
11:59:51 + This probably means the server terminated abnormally
11:59:51 + before or while processing the request.
11:59:51 +connection to server was lost
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"El Maquinismo fue proscrito so pena de cosquilleo hasta la muerte"
(Ijon Tichy en Viajes, Stanislaw Lem)
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
On 2024-Nov-14, Christoph Berg wrote:
It didn't actually crash, true.
But timescale did crash:
So what is timescale doing differently?
regards, tom lane
Hi,
The allocation should be big enough. The other
hazard would be failing to initialize the field, but if the extension
uses InitResultRelInfo then that's taken care of.
So what is timescale doing differently?
I see 3 usages of makeNode(ResultRelInfo) in Timescale:
- src/nodes/chunk_dispatch/chunk_insert_state.c
- src/copy.c
- src/ts_catalog/catalog.c
In the first case it's followed by InitResultRelInfo(). In the second
- by ExecInitResultRelation() in its turn calls InitResultRelInfo().
The third case is the following:
```
extern TSDLLEXPORT ResultRelInfo *
ts_catalog_open_indexes(Relation heapRel)
{
ResultRelInfo *resultRelInfo;
resultRelInfo = makeNode(ResultRelInfo);
resultRelInfo->ri_RangeTableIndex = 0; /* dummy */
resultRelInfo->ri_RelationDesc = heapRel;
resultRelInfo->ri_TrigDesc = NULL; /* we don't fire triggers */
ExecOpenIndices(resultRelInfo, false);
return resultRelInfo;
}
```
Where it's used from there is hard to track but it looks like this is
the reason why the new field ri_needLockTagTuple is not initialized.
I'll pass this piece of information to my colleagues.
--
Best regards,
Aleksander Alekseev
On 2024-Nov-14, Tom Lane wrote:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
On 2024-Nov-14, Christoph Berg wrote:
It didn't actually crash, true.
But timescale did crash:
So what is timescale doing differently?
No idea. Maybe a stacktrace from a core would tell us more; but the
jenkins task doesn't seem to have server logs or anything else to gives
us more clues.
There are three makeNode(ResultRelInfo), but they don't look anything
out of the ordinary:
C perhan: timescaledb 0$ git grep -C5 makeNode.ResultRelInfo
src/copy.c- *
src/copy.c- * WARNING. The dummy rangetable index is decremented by 1 (unchecked)
src/copy.c- * inside `ExecConstraints` so unless you want to have a overflow, keep it
src/copy.c- * above zero. See `rt_fetch` in parsetree.h.
src/copy.c- */
src/copy.c: resultRelInfo = makeNode(ResultRelInfo);
src/copy.c-
src/copy.c-#if PG16_LT
src/copy.c- ExecInitRangeTable(estate, pstate->p_rtable);
src/copy.c-#else
src/copy.c- Assert(pstate->p_rteperminfos != NULL);
--
src/nodes/chunk_dispatch/chunk_insert_state.c-create_chunk_result_relation_info(const ChunkDispatch *dispatch, Relation rel)
src/nodes/chunk_dispatch/chunk_insert_state.c-{
src/nodes/chunk_dispatch/chunk_insert_state.c- ResultRelInfo *rri;
src/nodes/chunk_dispatch/chunk_insert_state.c- ResultRelInfo *rri_orig = dispatch->hypertable_result_rel_info;
src/nodes/chunk_dispatch/chunk_insert_state.c- Index hyper_rti = rri_orig->ri_RangeTableIndex;
src/nodes/chunk_dispatch/chunk_insert_state.c: rri = makeNode(ResultRelInfo);
src/nodes/chunk_dispatch/chunk_insert_state.c-
src/nodes/chunk_dispatch/chunk_insert_state.c- InitResultRelInfo(rri, rel, hyper_rti, NULL, dispatch->estate->es_instrument);
src/nodes/chunk_dispatch/chunk_insert_state.c-
src/nodes/chunk_dispatch/chunk_insert_state.c- /* Copy options from the main table's (hypertable's) result relation info */
src/nodes/chunk_dispatch/chunk_insert_state.c- rri->ri_WithCheckOptions = rri_orig->ri_WithCheckOptions;
--
src/ts_catalog/catalog.c-extern TSDLLEXPORT ResultRelInfo *
src/ts_catalog/catalog.c-ts_catalog_open_indexes(Relation heapRel)
src/ts_catalog/catalog.c-{
src/ts_catalog/catalog.c- ResultRelInfo *resultRelInfo;
src/ts_catalog/catalog.c-
src/ts_catalog/catalog.c: resultRelInfo = makeNode(ResultRelInfo);
src/ts_catalog/catalog.c- resultRelInfo->ri_RangeTableIndex = 0; /* dummy */
src/ts_catalog/catalog.c- resultRelInfo->ri_RelationDesc = heapRel;
src/ts_catalog/catalog.c- resultRelInfo->ri_TrigDesc = NULL; /* we don't fire triggers */
src/ts_catalog/catalog.c-
src/ts_catalog/catalog.c- ExecOpenIndices(resultRelInfo, false);
I didn't catch any other allocations that would depend on the size of
ResultRelInfo in obvious ways.
Oh, hmm, there's this bit which uses struct assignment into a stack
element:
tsl/src/compression/compression.c-1559- if (decompressor->indexstate->ri_NumIndices > 0)
tsl/src/compression/compression.c-1560- {
tsl/src/compression/compression.c:1561: ResultRelInfo indexstate_copy = *decompressor->indexstate;
tsl/src/compression/compression.c-1562- Relation single_index_relation;
I find this rather suspect.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"La espina, desde que nace, ya pincha" (Proverbio africano)