LLVM 22

Started by Thomas Munro, 5 months ago, 21 messages, hackers

thomas.munro@gmail.com

5 months ago

Hi,

Ideally we should have all changes for LLVM 22 in our February minor
releases. I have written up some notes on release synchronisation on
the wiki[1] to show the scheduling problem if we don't. The second
patch here still needs some validation.

1. We won't need our local llvm::backport::SectionMemoryManager for
LLVM 22, so it will be nice to draw a line under that messy business.
See commit message for details.

You can review the differences between our in-tree copy and the code
that was finally committed and will shortly ship in LLVM 22 like this:

LLVM_BRANCH=main
LLVM_URL=https://raw.githubusercontent.com/llvm/llvm-project/refs/heads

curl -s \
$LLVM_URL/$LLVM_BRANCH/llvm/include/llvm/ExecutionEngine/SectionMemoryManager.h | \
diff -u - src/include/jit/SectionMemoryManager.h

curl -s \
$LLVM_URL/$LLVM_BRANCH/llvm/lib/ExecutionEngine/SectionMemoryManager.cpp | \
diff -u - src/backend/jit/llvm/SectionMemoryManager.cpp

In a week or two, LLVM_BRANCH=release/22.x should work too. I've
attached the output, which shows the expected changes in our copy,
namely:

* top-of-file comments
* namespace change
* tweaks for older LLVM versions
* tree-wide spellchecks and #include "" -> <> changes

They haven't made any changes on their side, except for some
LLVM_ABI macros added in LLVM 20 that we missed. See commit message
for why we don't want those.

The place in llvmjit_backport.h that does:

-#if defined(__aarch64__)
+#if defined(__aarch64__) && LLVM_VERSION_MAJOR < 22
 #define USE_LLVM_BACKPORT_SECTION_MEMORY_MANAGER

... would be like this in REL_17_STABLE and earlier:

+#if defined(__aarch64__) && LLVM_VERSION_MAJOR > 11 && LLVM_VERSION_MAJOR < 22

That's because we never made the backport work with LLVM < 12, and I
have heard no complaints about that so at this point it looks like we
got away with it.
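To make the two version windows concrete, here is a minimal C sketch that restates the guards above as ordinary functions. The helper names are hypothetical and exist only for illustration; the real code uses the preprocessor conditions quoted above.

```c
#include <stdbool.h>

/*
 * Hypothetical helpers that mirror the quoted #if conditions as runtime
 * logic, so the version ranges are easy to check by hand.
 */

/* master and REL_18_STABLE style guard: aarch64 and LLVM older than 22 */
static bool
use_backport_newer_branches(bool is_aarch64, int llvm_major)
{
	return is_aarch64 && llvm_major < 22;
}

/* REL_17_STABLE and earlier: additionally requires LLVM 12 or newer */
static bool
use_backport_older_branches(bool is_aarch64, int llvm_major)
{
	return is_aarch64 && llvm_major > 11 && llvm_major < 22;
}
```

Under these assumptions, LLVM 21 on aarch64 still selects the backport on every branch, LLVM 22 never does, and LLVM 11 selects it only where the lower bound is absent.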

2. LLVM 22 changed the semantics of the "lifetime.end" instruction.
See commit message for references. Without this change, LLVM main/22
assertions fail in the regression tests with messages like this in
postmaster.log:

Intrinsic has incorrect argument type!
ptr @llvm.lifetime.end.p0
Intrinsic has incorrect argument type!
ptr @llvm.lifetime.end.p0
2026-01-02 17:28:31.394 NZDT client backend[42798] pg_regress/boolean
FATAL: fatal llvm error: Broken module found, compilation aborted!

I haven't seen anything bad happen in non-assertion builds.

Here's a potential minimal fix. I haven't yet proven that the
optimisation is still working as expected. Probably need to compile
an expression that calls an inlined function and then a non-inlined
function with jit_dump_bitcode=true, then find the right XXX.bc file
under pgdata, llvm-dis XXX.bc, llc XXX.ll, then visually inspect XXX.s
with enough caffeine to confirm that it's not spilling something (ie
store instructions) where previously it didn't, but I wanted to post
what I had so far to see if anyone has a better idea or an easy way to
test it...

[1]: https://wiki.postgresql.org/wiki/LLVM#Cadence

Thomas Munro

thomas.munro@gmail.com

5 months ago

In reply to: Thomas Munro (#1)

Re: LLVM 22

On Sat, Jan 3, 2026 at 3:02 PM Thomas Munro <thomas.munro@gmail.com> wrote:

1. We won't need our local llvm::backport::SectionMemoryManager for
LLVM 22, so it will be nice to draw a line under that messy business.
See commit message for details.

While that's true, there is a problem with the patch I posted:
"ReserveAlloc" is not enabled when called from C. I can't actually
reproduce the issue locally due to lack of RAM connected to an ARM
CPU, or I'd have noticed that... I'll attempt to do something about
that upstream[1], let's see... if not, we can still use the new
in-tree SectionMemoryManager, but we'll still need some C++ glue code.

[1]: https://github.com/llvm/llvm-project/issues/174305

Matheus Alcantara

matheusssilv97@gmail.com

5 months ago

In reply to: Thomas Munro (#1)

Re: LLVM 22

Hi,

On Fri Jan 2, 2026 at 11:02 PM -03, Thomas Munro wrote:

2. LLVM 22 changed the semantics of the "lifetime.end" instruction.
See commit message for references. Without this change, LLVM main/22
assertions fail in the regression tests with messages like this in
postmaster.log:

Intrinsic has incorrect argument type!
ptr @llvm.lifetime.end.p0
Intrinsic has incorrect argument type!
ptr @llvm.lifetime.end.p0
2026-01-02 17:28:31.394 NZDT client backend[42798] pg_regress/boolean
FATAL: fatal llvm error: Broken module found, compilation aborted!

I've managed to reproduce this using LLVM 22.

Here's a potential minimal fix. I haven't yet proven that the
optimisation is still working as expected. Probably need to compile
an expression that calls an inlined function and then a non-inlined
function with jit_dump_bitcode=true, then find the right XXX.bc file
under pgdata, llvm-dis XXX.bc, llc XXX.ll, then visually inspect XXX.s
with enough caffeine to confirm that it's not spilling something (ie
store instructions) where previously it didn't, but I wanted to post
what I had so far to see if anyone has a better idea or an easy way to
test it...

I'm not super familiar with reading assembly code but I tried my best to
inspect the LLVM 22 and LLVM 21 outputs and if I understood correctly I
think that 0002 is working as expected.

I've noticed a reduction in some instructions when using LLVM 22 with
the 0002 patch compared with LLVM 21. For example, here we needed fewer
instructions to set up the registers:

LLVM 22:
LBB2_8: ; %b.op.1.start
mov x20, #40824 ; =0x9f78
movk x20, #19456, lsl #16
movk x20, #1, lsl #32
ldr x8, [x23]
ldrb w9, [x24]
str x8, [x20, #152]
strb w9, [x20, #160]

LLVM 21:
LBB2_8: ; %b.op.1.start
mov x25, #25352 ; =0x6308
movk x25, #2946, lsl #16
movk x25, #1, lsl #32
mov x20, #23533 ; =0x5bed
movk x20, #2946, lsl #16
movk x20, #1, lsl #32
ldr x8, [x23]
ldrb w9, [x24]
stur x8, [x25, #-248]
sturb w9, [x25, #-240]

I've also noticed that the generated assembly code for LLVM 22 uses the
str and strb instructions instead of stur and sturb in some cases, which
according to AI is an improvement, but unfortunately I did not find any
reference to prove this, sorry.

To test this I did the following steps:
set jit_above_cost = 0;
set jit_inline_above_cost = 0;
set jit_optimize_above_cost = 0;
set jit_dump_bitcode = true;

explain(analyze) select i % 2 = 0 OR i % 3 = 0 from generate_series(1, 100) i;

I'm attaching the .s files for the llvm 22 and for the llvm 21 outputs
that I used to inspect.

--
Matheus Alcantara
EDB: https://www.enterprisedb.com

Thomas Munro

thomas.munro@gmail.com

5 months ago

In reply to: Matheus Alcantara (#3)

Re: LLVM 22

On Tue, Jan 6, 2026 at 10:56 AM Matheus Alcantara
<matheusssilv97@gmail.com> wrote:

On Fri Jan 2, 2026 at 11:02 PM -03, Thomas Munro wrote:

Intrinsic has incorrect argument type!
ptr @llvm.lifetime.end.p0
Intrinsic has incorrect argument type!
ptr @llvm.lifetime.end.p0
2026-01-02 17:28:31.394 NZDT client backend[42798] pg_regress/boolean
FATAL: fatal llvm error: Broken module found, compilation aborted!

I've managed to reproduce this using LLVM 22.

Thanks for testing!

Here's a potential minimal fix. I haven't yet proven that the
optimisation is still working as expected. Probably need to compile
an expression that calls an inlined function and then a non-inlined
function with jit_dump_bitcode=true, then find the right XXX.bc file
under pgdata, llvm-dis XXX.bc, llc XXX.ll, then visually inspect XXX.s
with enough caffeine to confirm that it's not spilling something (ie
store instructions) where previously it didn't, but I wanted to post
what I had so far to see if anyone has a better idea or an easy way to
test it...

I'm not super familiar with reading assembly code but I tried my best to
inspect the LLVM 22 and LLVM 21 outputs and if I understood correctly I
think that 0002 is working as expected.

Cool. And as another sanity test, if you comment out the new poison
code so that we don't try to prevent unwanted spills/stores, can you
see any?

I've noticed a reduction in some instructions when using LLVM 22 with
the 0002 patch compared with LLVM 21. For example, here we needed fewer
instructions to set up the registers:

LLVM 22:
LBB2_8: ; %b.op.1.start
mov x20, #40824 ; =0x9f78
movk x20, #19456, lsl #16
movk x20, #1, lsl #32
ldr x8, [x23]
ldrb w9, [x24]
str x8, [x20, #152]
strb w9, [x20, #160]

LLVM 21:
LBB2_8: ; %b.op.1.start
mov x25, #25352 ; =0x6308
movk x25, #2946, lsl #16
movk x25, #1, lsl #32
mov x20, #23533 ; =0x5bed
movk x20, #2946, lsl #16
movk x20, #1, lsl #32
ldr x8, [x23]
ldrb w9, [x24]
stur x8, [x25, #-248]
sturb w9, [x25, #-240]

I've also noticed that the generated assembly code for LLVM 22 uses the
str and strb instructions instead of stur and sturb in some cases, which
according to AI is an improvement, but unfortunately I did not find any
reference to prove this, sorry.

Interesting.

Matheus Alcantara

matheusssilv97@gmail.com

5 months ago

In reply to: Thomas Munro (#4)

Re: LLVM 22

On Mon Jan 5, 2026 at 8:50 PM -03, Thomas Munro wrote:

Here's a potential minimal fix. I haven't yet proven that the
optimisation is still working as expected. Probably need to compile
an expression that calls an inlined function and then a non-inlined
function with jit_dump_bitcode=true, then find the right XXX.bc file
under pgdata, llvm-dis XXX.bc, llc XXX.ll, then visually inspect XXX.s
with enough caffeine to confirm that it's not spilling something (ie
store instructions) where previously it didn't, but I wanted to post
what I had so far to see if anyone has a better idea or an easy way to
test it...

I'm not super familiar with reading assembly code but I tried my best to
inspect the LLVM 22 and LLVM 21 outputs and if I understood correctly I
think that 0002 is working as expected.

Cool. And as another sanity test, if you comment out the new poison
code so that we don't try to prevent unwanted spills/stores, can you
see any?

Yes, I've commented the poison code block introduced on 0002 and the
generated assembly code seems more bloated, for example:

LLVM 22 with 0002 and the poison code block commented:
LBB2_8:
mov x25, #15624 ; =0x3d08
movk x25, #7427, lsl #16
movk x25, #1, lsl #32
mov x20, #13805 ; =0x35ed
movk x20, #7427, lsl #16
movk x20, #1, lsl #32
ldr x8, [x23]
ldrb w9, [x24]
stur x8, [x25, #-248]
sturb w9, [x25, #-240]
mov w26, #1 ; =0x1
strb w26, [x20, #1419]
ldurb w8, [x25, #-240]
cmp w8, #1
b.eq LBB2_11

LLVM 22 with 0002:
LBB2_8:
mov x20, #40824 ; =0x9f78
movk x20, #19456, lsl #16
movk x20, #1, lsl #32
ldr x8, [x23]
ldrb w9, [x24]
str x8, [x20, #152]
strb w9, [x20, #160]
mov w26, #1 ; =0x1
strb w26, [x20]
cmp w9, #1
b.eq LBB2_11

IIUC with the commented code the LLVM compiler added an extra
load ldurb followed by cmp w8, #1. With the patch it performs a
comparison cmp w9, #1 directly using a register it already has.

--
Matheus Alcantara
EDB: https://www.enterprisedb.com

Thomas Munro

thomas.munro@gmail.com

5 months ago

In reply to: Thomas Munro (#2)

Re: LLVM 22

On Sun, Jan 4, 2026 at 6:02 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Sat, Jan 3, 2026 at 3:02 PM Thomas Munro <thomas.munro@gmail.com> wrote:

1. We won't need our local llvm::backport::SectionMemoryManager for
LLVM 22, so it will be nice to draw a line under that messy business.
See commit message for details.

While that's true, there is a problem with the patch I posted:
"ReserveAlloc" is not enabled when called from C. I can't actually
reproduce the issue locally due to lack of RAM connected to an ARM
CPU, or I'd have noticed that... I'll attempt to do something about
that upstream[1], let's see... if not, we can still use the new
in-tree SectionMemoryManager, but we'll still need some C++ glue code.

That was successful, so here is an update.

A new unrelated assertion started firing in LLVM main/22 a few days ago:

v_nullbytemask = l_int8_const(lc, 1 << ((attnum) & 0x07));
Assertion failed: (llvm::isUIntN(BitWidth, val) && "Value is not
an N-bit unsigned value")

Here is a fix for that.
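One plausible reading of that assertion, sketched in C. This assumes that l_int8_const() takes a signed 8-bit argument and that the value is then widened to 64 bits before LLVM checks it; neither detail is confirmed in this thread. fits_uint_n() is a hand-written stand-in for llvm::isUIntN(), and the widen_* helpers are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for llvm::isUIntN(): does val fit in n unsigned bits? */
static bool
fits_uint_n(unsigned n, uint64_t val)
{
	return n >= 64 || val < (UINT64_C(1) << n);
}

/*
 * Hypothetical: the 64-bit value that results when the mask first passes
 * through a signed 8-bit parameter. 1 << 7 is 128, which becomes -128 as
 * int8_t on common compilers and then sign-extends to 0xFFFFFFFFFFFFFF80.
 */
static uint64_t
widen_via_int8(int value)
{
	int8_t		narrowed = (int8_t) value;

	return (uint64_t) (int64_t) narrowed;
}

/* Hypothetical alternative: go through an unsigned 8-bit value instead. */
static uint64_t
widen_via_uint8(int value)
{
	return (uint64_t) (uint8_t) value;
}
```

With these assumptions, only the highest bit of the null byte mask (attnum & 0x07 equal to 7) would trip the check, since every smaller mask value is non-negative as a signed byte.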

Anthonin Bonnefoy

anthonin.bonnefoy@datadoghq.com

5 months ago

In reply to: Thomas Munro (#6)

Re: LLVM 22

Hi,

I've tried to generate multiple bitcode for a simple 'select aid % 2
FROM pgbench_accounts limit 10;' query. To keep bitcode simple, I've
modified the passes to use "default<O0>,mem2reg,inline" when we have
JIT inline without optimization (as described in [0]). I've tried the
following
- LLVM21: With lifetime
- LLVM21: Without lifetime
- LLVM22: With Poison
- LLVM22: Without Poison

In the 4 scenarios, the generated bc were the same with the exact same
instructions. Removing the lifetime end or the poison value doesn't
seem to change anything at this level of optimisation.

I'm not sure how to interpret this. Maybe the test is incorrect and a
different function needs to be called to possibly trigger the issue?
Or the poison/lifetime is only useful when going through the O3
optimisation pass?

[0]: /messages/by-id/CAO6_XqrNjJnbn15ctPv7o4yEAT9fWa-dK15RSyun6QNw9YDtKg@mail.gmail.com

Thomas Munro

thomas.munro@gmail.com

4 months ago

In reply to: Thomas Munro (#6)

Re: LLVM 22

On Sun, Jan 11, 2026 at 8:09 PM Thomas Munro <thomas.munro@gmail.com> wrote:

A new unrelated assertion started firing in LLVM main/22 a few days ago:

v_nullbytemask = l_int8_const(lc, 1 << ((attnum) & 0x07));
Assertion failed: (llvm::isUIntN(BitWidth, val) && "Value is not
an N-bit unsigned value")

Here is a fix for that.

22 was branched and RC1 is out, but that particular change was
reverted from 22[1]. It had already been through a commit/revert
cycle before and at a wild guess, it probably caused too much work
elsewhere with not enough notice. It's still present in main, so
consider the v2-0003 patch booted out of here and into the
not-yet-created LLVM 23 thread...

[1]: https://github.com/llvm/llvm-project/commit/16bf1c5d6b7f8fda16da5df5a2b195a6b10d08ed

Andres Freund

andres@anarazel.de

4 months ago

In reply to: Anthonin Bonnefoy (#7)

Re: LLVM 22

Hi,

On 2026-01-14 17:12:45 +0100, Anthonin Bonnefoy wrote:

I've tried to generate multiple bitcode for a simple 'select aid % 2
FROM pgbench_accounts limit 10;' query. To keep bitcode simple, I've
modified the passes to use "default<O0>,mem2reg,inline" when we have
JIT inline without optimization (as described in [0]). I've tried the
following
- LLVM21: With lifetime
- LLVM21: Without lifetime
- LLVM22: With Poison
- LLVM22: Without Poison

In the 4 scenarios, the generated bc were the same with the exact same
instructions. Removing the lifetime end or the poison value doesn't
seem to change anything at this level of optimisation.

I'm not sure how to interpret this. Maybe the test is incorrect and a
different function needs to be called to possibly trigger the issue?
Or the poison/lifetime is only useful when going through the O3
optimisation pass?

I think it's the latter - at -O0 there's nothing that could use the
information.

The goal of the lifetime annotations was to allow llvm to remove stores and
loads of FunctionCallInfo->{args,isnull}. After we stored e.g. fcinfo->isnull
before a function call and then checked it after the function call, we don't
need it anymore. I think that can only matter when the called function is
actually inlined, otherwise there's no way that LLVM can see the store is
unnecessary.
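The store/call/load pattern described above can be pictured with a small C sketch. MiniCallInfo, mini_int_mod() and eval_mod() are hypothetical, heavily simplified stand-ins and not PostgreSQL's real FunctionCallInfoData or any real function; they only show which memory traffic becomes redundant once the callee is visible to the optimizer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, much-simplified stand-in for FunctionCallInfoData. */
typedef struct MiniCallInfo
{
	int64_t		args[2];
	bool		isnull;			/* result nullness, set by the callee */
} MiniCallInfo;

/* Callee: reads its arguments from memory and reports NULL through memory. */
static int64_t
mini_int_mod(MiniCallInfo *fcinfo)
{
	if (fcinfo->args[1] == 0)
	{
		fcinfo->isnull = true;	/* illustrative NULL result path */
		return 0;
	}
	return fcinfo->args[0] % fcinfo->args[1];
}

/* Caller, in the style of JIT-emitted expression code. */
static int64_t
eval_mod(MiniCallInfo *fcinfo, int64_t a, int64_t b, bool *resnull)
{
	int64_t		result;

	fcinfo->args[0] = a;		/* stores before the call */
	fcinfo->args[1] = b;
	fcinfo->isnull = false;
	result = mini_int_mod(fcinfo);
	*resnull = fcinfo->isnull;	/* load after the call */

	/*
	 * Nothing reads *fcinfo for this call after this point. If the callee
	 * is inlined, the stores and the load above could stay in registers,
	 * provided the optimizer knows the memory is dead afterwards.
	 */
	return result;
}
```

Because fcinfo points at long-lived heap memory, the optimizer cannot conclude on its own that the stores are dead, which is the information the lifetime annotation was meant to supply.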

Unfortunately there's an issue with modern LLVM, regardless of lifetime or
poison. Generally it's able to eliminate stores that are followed by a
poison, but if there's a load in between, it fails. The odd part is that it
*is* able to eliminate the load (by forwarding the stored value).

It seems to be an ordering issue - instcombine is required to remove the load,
but also removes the poison, which in turn is required for dead store
elimination. Gngng.

I've attached a reproducer.

I'm not sure the llvm folks will be all that interested - there's no real C
correspondence to this. And, as it turns out, if I feed the memory to
something like free(), the analysis actually *does* figure out that it's not
needed anymore.

I think if / once we move most of this to a stack allocation, the problem
would also vanish.

Greetings,

Andres Freund

#10

Anthonin Bonnefoy

anthonin.bonnefoy@datadoghq.com

4 months ago

In reply to: Andres Freund (#9)

Re: LLVM 22

On Thu, Jan 29, 2026 at 2:27 AM Andres Freund <andres@anarazel.de> wrote:

The goal of the lifetime annotations was to allow llvm to remove stores and
loads of FunctionCallInfo->{args,isnull}. After we stored e.g. fcinfo->isnull
before a function call and then checked it after the function call, we don't
need it anymore. I think that can only matter when the called function is
actually inlined, otherwise there's no way that LLVM can see the store is
unnecessary.

Thanks for the context, that makes things easier to understand.

I've run another test using:
- "select pg_last_xact_replay_timestamp();" for the query, which, unlike
int4mod, has a reachable PG_RETURN_NULL.
- run with "options='-cjit_inline_above_cost=0
-cjit_optimize_above_cost=100000 -cjit_above_cost=0
-cjit_dump_bitcode=true'" to force inlining while only going through
O0 pass.
- Then manually ran the optimisation pass with "opt-21
jit_initial_dump.ll --passes='default<O3>' -S"

The initial dump is using lifetime.end, but it can be used to check
what happens with poisoned values by manually replacing it.

Using lifetime_end, the store to isnull:
28:
store i8 1, ptr inttoptr (i64 200635374787156 to ptr), align 4
br label %pg_last_xact_replay_timestamp.exit
is indeed removed.

Removing the lifetime_end calls, the store call is still present (I
wanted to make sure it wasn't removed by another optimization)
Replacing the lifetime_end calls with poison stores generates the same
IR as if there was no lifetime_end, and the store call is still
present. Tested with opt-21 and opt-22.

So it looks like using a poison value doesn't replicate
lifetime_end behaviour (at least, for the jit dump I've tested).

#11

Devrim GÜNDÜZ

devrim@gunduz.org

3 months ago

In reply to: Thomas Munro (#6)

Re: LLVM 22

Hi,

On Sun, 2026-01-11 at 20:09 +1300, Thomas Munro wrote:

A new unrelated assertion started firing in LLVM main/22 a few days
ago:

v_nullbytemask = l_int8_const(lc, 1 << ((attnum) & 0x07));
Assertion failed: (llvm::isUIntN(BitWidth, val) && "Value is not
an N-bit unsigned value")

Here is a fix for that.

Fedora pushed 22.1.0 to both Fedora 44 beta and rawhide repos, so I
tested these patches. Builds are fine and all regression tests pass.
Anything else I should check?

Regards,
--
Devrim Gündüz
Open Source Solution Architect, PostgreSQL Major Contributor
BlueSky: @devrim.gunduz.org , @gunduz.org

#12

Tom Lane

tgl@sss.pgh.pa.us

about 2 months ago

In reply to: Devrim GÜNDÜZ (#11)

Re: LLVM 22

Devrim Gündüz <devrim@gunduz.org> writes:

Fedora pushed 22.1.0 to both Fedora 44 beta and rawhide repos, so I
tested these patches. Builds are fine and all regression tests pass.
Anything else I should check?

Where are we on getting these patches pushed? I think the reason that
BF member midge has been failing of late is that it's running LLVM 22
(if not indeed something even newer --- configure doesn't report
the clang version, sadly). Also, I've reproduced this symptom:

Intrinsic has incorrect argument type!
ptr @llvm.lifetime.end.p0
Intrinsic has incorrect argument type!
ptr @llvm.lifetime.end.p0
2026-03-31 18:58:20.218 EDT [28486] FATAL: fatal llvm error: Broken module found, compilation aborted!

on a fresh Fedora 44/x86_64 installation with llvm 22.1.1. So this is
going to be a production compiler RSN. (F44 is still labeled beta,
but not for much longer.)

regards, tom lane

#13

Thomas Munro

thomas.munro@gmail.com

about 2 months ago

In reply to: Tom Lane (#12)

Re: LLVM 22

On Wed, Apr 1, 2026 at 12:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Devrim Gündüz <devrim@gunduz.org> writes:

Fedora pushed 22.1.0 to both Fedora 44 beta and rawhide repos, so I
tested these patches. Builds are fine and all regression tests pass.
Anything else I should check?

Where are we on getting these patches pushed? I think the reason that
BF member midge has been failing of late is that it's running LLVM 22
(if not indeed something even newer --- configure doesn't report
the clang version, sadly). Also, I've reproduced this symptom:

Working on this, more shortly... I'm trying to figure out if Anthonin
and Andres's feedback means the poison approach does nothing useful
and we might as well just #ifdef out the lifetime.end stuff for LLVM >= 22
to fix the breakage today.

Either way it looks like we need a patch to use alloca instead, which
I'll also look into...

Intrinsic has incorrect argument type!
ptr @llvm.lifetime.end.p0
Intrinsic has incorrect argument type!
ptr @llvm.lifetime.end.p0
2026-03-31 18:58:20.218 EDT [28486] FATAL: fatal llvm error: Broken module found, compilation aborted!

on a fresh Fedora 44/x86_64 installation with llvm 22.1.1. So this is
going to be a production compiler RSN. (F44 is still labeled beta,
but not for much longer.)

Yep, that's the issue alright.

#14

Thomas Munro

thomas.munro@gmail.com

about 2 months ago

In reply to: Thomas Munro (#13)

Re: LLVM 22

On Wed, Apr 1, 2026 at 4:25 PM Thomas Munro <thomas.munro@gmail.com> wrote:

Working on this, more shortly... I'm trying to figure out if Anthonin
and Andres's feedback means the poison approach does nothing useful
and we might as well just #ifdef out the lifetime.end stuff for LLVM >= 22
to fix the breakage today.

Done. Hopefully midge and Devrim will now turn green :-)

Either way it looks like we need a patch to use alloca instead, which
I'll also look into...

I see a few options, but I need to hack on them for a while to figure
out the tradeoffs, or what I'm missing... after the freeze.

#15

Tom Lane

tgl@sss.pgh.pa.us

about 2 months ago

In reply to: Thomas Munro (#14)

Re: LLVM 22

Thomas Munro <thomas.munro@gmail.com> writes:

On Wed, Apr 1, 2026 at 4:25 PM Thomas Munro <thomas.munro@gmail.com> wrote:

Working on this, more shortly... I'm trying to figure out if Anthonin
and Andres's feedback means the poison approach does nothing useful
and we might as well just #ifdef out the lifetime.end stuff for LLVM >= 22
to fix the breakage today.

Done. Hopefully midge and Devrim will now turn green :-)

Just out of curiosity: I see you back-patched that all the way,
but midge had only been failing on v18 and HEAD. Were you just
being defensive, or is there something deeper there?

regards, tom lane

#16

Thomas Munro

thomas.munro@gmail.com

about 2 months ago

In reply to: Tom Lane (#15)

Re: LLVM 22

On Thu, Apr 2, 2026 at 4:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

On Wed, Apr 1, 2026 at 4:25 PM Thomas Munro <thomas.munro@gmail.com> wrote:

Working on this, more shortly... I'm trying to figure out if Anthonin
and Andres's feedback means the poison approach does nothing useful
and we might as well just #ifdef out the lifetime.end stuff for LLVM >= 22
to fix the breakage today.

Done. Hopefully midge and Devrim will now turn green :-)

Just out of curiosity: I see you back-patched that all the way,
but midge had only been failing on v18 and HEAD. Were you just
being defensive, or is there something deeper there?

It was failing locally for me on all branches.

I don't know why midge wasn't failing on 14-17. Could jit be disabled
somewhere secret? Aarch64 vs amd64, but this issue doesn't seem to be
architecture related, it's IR-level.

#17

Tom Lane

tgl@sss.pgh.pa.us

about 2 months ago

In reply to: Thomas Munro (#16)

Re: LLVM 22

Thomas Munro <thomas.munro@gmail.com> writes:

On Thu, Apr 2, 2026 at 4:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Just out of curiosity: I see you back-patched that all the way,
but midge had only been failing on v18 and HEAD. Were you just
being defensive, or is there something deeper there?

It was failing locally for me on all branches.

Ah, thanks for that detail.

I don't know why midge wasn't failing on 14-17. Could jit be disabled
somewhere secret? Aarch64 vs amd64, but this issue doesn't seem to be
architecture related, it's IR-level.

Definitely not arch-specific, because I reproduced it on x86_64.
midge's lack of failure is odd then, but I'm not sure it's worth
expending a lot of brain cells on.

regards, tom lane

#18

Devrim GÜNDÜZ

devrim@gunduz.org

about 2 months ago

In reply to: Thomas Munro (#14)

Re: LLVM 22

Hi,

On Thu, 2026-04-02 at 16:20 +1300, Thomas Munro wrote:

On Wed, Apr 1, 2026 at 4:25 PM Thomas Munro <thomas.munro@gmail.com>
wrote:

Working on this, more shortly... I'm trying to figure out if
Anthonin
and Andres's feedback means the poison approach does nothing useful
and we might as well just #ifdef out the lifetime.end stuff for LLVM >= 22
to fix the breakage today.

Done. Hopefully midge and Devrim will now turn green :-)

Thanks a lot! I built all supported releases on Fedora 44.

Regards,
--
Devrim Gündüz
Open Source Solution Architect, PostgreSQL Major Contributor
BlueSky: @devrim.gunduz.org , @gunduz.org

#19

Andres Freund

andres@anarazel.de

about 2 months ago

In reply to: Thomas Munro (#14)

Re: LLVM 22

Hi,

On 2026-04-02 16:20:41 +1300, Thomas Munro wrote:

On Wed, Apr 1, 2026 at 4:25 PM Thomas Munro <thomas.munro@gmail.com> wrote:

Working on this, more shortly... I'm trying to figure out if Anthonin
and Andres's feedback means the poison approach does nothing useful
and we might as well just #ifdef out the lifetime.end stuff for LLVM >= 22
to fix the breakage today.

Done. Hopefully midge and Devrim will now turn green :-)

Thanks!

Either way it looks like we need a patch to use alloca instead, which
I'll also look into...

I see a few options, but I need to hack on them for a while to figure
out the tradeoffs, or what I'm missing... after the freeze.

I've experimented a bunch with this, it seems we need the larger changes done
as part of the patchset for removing pointers from the expressions to actually
allow recent-ish LLVM to optimize this. I did verify that what we did didn't
have an effect with any other recent LLVM either.

The real fix here might be to have a separate calling convention for the very
common case of a scalar stable function with 1-3 arguments. We lose a fair
bit of efficiency even in interpreted execution due to ferrying arguments,
their nullness, and the nullness of the return value through memory.
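As a rough illustration of the idea only, and not a proposed design, a register-friendly convention could pass each argument and its nullness by value and return the result the same way. NullableInt and direct_add() below are hypothetical names invented for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical value/nullness pair, small enough to be passed by value. */
typedef struct NullableInt
{
	int64_t		value;
	bool		isnull;
} NullableInt;

/*
 * Hypothetical "direct" signature for a strict two-argument scalar
 * function: arguments, their nullness and the result nullness are all
 * passed by value, with no shared call-info struct in memory.
 */
static NullableInt
direct_add(NullableInt a, NullableInt b)
{
	NullableInt result = {0, true};

	if (a.isnull || b.isnull)
		return result;			/* strict: NULL in, NULL out */

	result.value = a.value + b.value;	/* overflow checks omitted */
	result.isnull = false;
	return result;
}
```

Whether such values actually travel in registers depends on the platform ABI; the point of the sketch is only that nothing here forces a round trip through heap memory.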

Greetings,

Andres Freund

#20

Thomas Munro

thomas.munro@gmail.com

about 2 months ago

In reply to: Andres Freund (#19)

Re: LLVM 22

On Fri, Apr 3, 2026 at 3:31 AM Andres Freund <andres@anarazel.de> wrote:

I see a few options, but I need to hack on them for a while to figure
out the tradeoffs, or what I'm missing... after the freeze.

I've experimented a bunch with this, it seems we need the larger changes done
as part of the patchset for removing pointers from the expressions to actually
allow recent-ish LLVM to optimize this. I did verify that what we did didn't
have an effect with any other recent LLVM either.

Yeah, I noticed this connection as well, coming at it from a keyhole
how-do-I-fix-THIS-problem angle. It seemed to me that where
ExecInitFunc() builds the code to compute argument values to push into
&fcinfo->args[argno].value (a palloc'd FunctionCallInfoData object),
it might first alloca the space and store the collid etc (and after
return, it could lifetime.end it, or maybe the eventual ret in the
caller is enough but I don't see any reason not to lifetime.end it
ASAP), and then the destination would become a pointer into that, and
the most natural thing would be a stack pointer-relative one, and then
you'd have removed a major source of non-cacheability of compiled
expressions. It took me a while to grok the function argument layout,
which is ... this might be a stretch... a bit like Fortran, neither a
linear stack nor a spaghetti stack, but just a bag of variables ready
to be used as function arguments, with recursion not permitted. And
also to grok the quirks of our V1 calls that compelled you to do it
like that. But I'm still learning the secrets of this code and I may
be way off base in these musings, I haven't actually tried anything
and it sounds like I should keep out of your way...

The real fix here might be to have a separate calling convention for the very
common case of a scalar stable function with 1-3 arguments. We lose a fair
bit of efficiency even in interpreted execution due to ferrying arguments,
their nullness, and the nullness of the return value through memory.

Yeah. I understand much better why you say that now.
FunctionCallInfoData holds data with two different lifetimes, some of
which might not be needed.

#21

Thomas Munro

thomas.munro@gmail.com

about 2 months ago

In reply to: Thomas Munro (#20)

LLVM 22

Attachments:
