Failed assertion with jit enabled

Started by Bertrand Drouvot, 11 months ago, 7 messages
#1 Bertrand Drouvot
bertranddrouvot.pg@gmail.com

Hi hackers,

I was doing some tests and managed to trigger a failed assertion with jit
enabled.

The test can be simplified to:

postgres=# select count(*) from generate_series(1,10000000);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

The failed assertion is:

TRAP: failed Assert("false"), File: "llvmjit_expr.c", Line: 2833, PID: 3060333

"git bisect" tells me it has been introduced with 80feb727c86.

I did not look at the exact details, just reporting the issue.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bertrand Drouvot (#1)
Re: Failed assertion with jit enabled

Bertrand Drouvot <bertranddrouvot.pg@gmail.com> writes:

I was doing some tests and managed to trigger a failed assertion with jit
enabled.

The test can be simplified to:

postgres=# select count(*) from generate_series(1,10000000);
server closed the connection unexpectedly

Hmm, works for me on today's HEAD.

The failed assertion is:
TRAP: failed Assert("false"), File: "llvmjit_expr.c", Line: 2833, PID: 3060333

There are two Assert(false) in that file, neither of them particularly
close to line 2833 as of today, so I'm not sure what version you are
using or which one you're hitting. Nonetheless, I'll go out on a
limb and guess that this will go away if you do "git clean -dfxq"
and a full rebuild. It smells like different files getting out of
sync about the representation of ExprEvalOp or ExprEvalStep.

regards, tom lane

#3 Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Tom Lane (#2)
Re: Failed assertion with jit enabled

Hi,

On Wed, Feb 05, 2025 at 10:51:17AM -0500, Tom Lane wrote:

Bertrand Drouvot <bertranddrouvot.pg@gmail.com> writes:

I was doing some tests and managed to trigger a failed assertion with jit
enabled.

The test can be simplified to:

postgres=# select count(*) from generate_series(1,10000000);
server closed the connection unexpectedly

Hmm, works for me on today's HEAD.

Thanks for looking at it!

The failed assertion is:
TRAP: failed Assert("false"), File: "llvmjit_expr.c", Line: 2833, PID: 3060333

There are two Assert(false) in that file, neither of them particularly
close to line 2833 as of today, so I'm not sure what version you are
using or which one you're hitting. Nonetheless, I'll go out on a
limb and guess that this will go away if you do "git clean -dfxq"
and a full rebuild.

Yeah, that's my default way of doing it, and that is how it was done here.

It smells like different files getting out of
sync about the representation of ExprEvalOp or ExprEvalStep.

I did look more closely (knowing that it works for you) and the issue is linked
to not using --with-llvm. Indeed, I used to use --with-llvm but removed it some
time ago for testing.

So the failed build was not using --with-llvm and was relying on an old version
of llvmjit.so (built the last time I used --with-llvm)...
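
A minimal sketch of how this situation could be detected, assuming a
make-based build on a platform where the JIT provider is installed as
llvmjit.so in the package library directory. The function name
check_stale_jit is hypothetical; it only compares the presence of the file
against the configure arguments and does not inspect the library itself.

```shell
#!/bin/sh
# Hypothetical helper: flag an installed llvmjit.so that the current
# build could not have produced.
#   $1 = library directory (e.g. output of `pg_config --pkglibdir`)
#   $2 = configure arguments (e.g. output of `pg_config --configure`)
check_stale_jit() {
    libdir=$1
    configure_args=$2

    # Nothing installed, so nothing can be stale.
    if [ ! -e "$libdir/llvmjit.so" ]; then
        echo "ok: no llvmjit.so installed"
        return 0
    fi

    case $configure_args in
        *--with-llvm*)
            echo "ok: llvmjit.so present and build uses --with-llvm"
            return 0 ;;
        *)
            echo "stale: $libdir/llvmjit.so exists but build lacks --with-llvm"
            return 1 ;;
    esac
}

# Typical use (requires pg_config from the installation under test on PATH):
#   check_stale_jit "$(pg_config --pkglibdir)" "$(pg_config --configure)"
```

An "ok" result does not prove the library matches the server binary; it only
rules out the specific mismatch described above.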

As a default I also always use "maintainer-clean" but it looks like it does not
remove llvmjit.so from the installation directory: is that a miss?

Regards,

--
Bertrand Drouvot

#4 Álvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Bertrand Drouvot (#3)
Re: Failed assertion with jit enabled

On 2025-Feb-05, Bertrand Drouvot wrote:

As a default I also always use "maintainer-clean" but it looks like it does not
remove llvmjit.so from the installation directory: is that a miss?

Hmm, "make uninstall" is supposed to remove things from the install
directory, but maintainer-clean is not. But if you first configure with
different options and then run uninstall, then it won't remove the
library you had installed previously. Maybe what you really wanted was
to zap the entire install directory.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"El Maquinismo fue proscrito so pena de cosquilleo hasta la muerte"
(Ijon Tichy en Viajes, Stanislaw Lem)

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bertrand Drouvot (#3)
Re: Failed assertion with jit enabled

Bertrand Drouvot <bertranddrouvot.pg@gmail.com> writes:

I did look more closely (knowing that it works for you) and the issue is linked
to not using --with-llvm. Indeed, I used to use --with-llvm but removed it some
time ago for testing.
So the failed build was not using --with-llvm and was relying on an old version
of llvmjit.so (built the last time I used --with-llvm)...

Hmm ... I don't understand why a non-JIT build of the core would
attempt to load llvmjit.so. Seems like trouble waiting to happen,
given how closely coupled the core and JIT are. (The .bc files are
pretty much guaranteed to be out of sync in such a case.)

As a default I also always use "maintainer-clean" but it looks like it does not
remove llvmjit.so from the installation directory: is that a miss?

Our "make install" doesn't attempt to remove any files, and I don't
believe that's customary for anyone else either. It'd be pretty
dangerous when installing into a shared system directory.

My personal development cycle usually includes "rm -rf $INSTALLDIR"
before "make install", if I've done any not-localized changes.
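
A minimal sketch of that cycle, assuming INSTALLDIR holds the prefix that was
passed to configure. The wrapper name wipe_install_dir and its guard against
empty or root paths are illustrative additions, not something described in
this thread.

```shell
#!/bin/sh
# Hypothetical helper: remove an install prefix, refusing values that
# would make "rm -rf" obviously dangerous.
wipe_install_dir() {
    dir=$1
    case $dir in
        ""|/|.|..)
            echo "refusing to remove '$dir'" >&2
            return 1 ;;
    esac
    rm -rf -- "$dir"
}

# Typical cycle:
#   wipe_install_dir "$INSTALLDIR" && make install
```

Starting from an empty directory means files left over from a previous
configuration, such as an old llvmjit.so, cannot survive into the new install.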

regards, tom lane

#6 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#5)
Re: Failed assertion with jit enabled

Hi,

On 2025-02-05 13:07:58 -0500, Tom Lane wrote:

Bertrand Drouvot <bertranddrouvot.pg@gmail.com> writes:

I did look more closely (knowing that it works for you) and the issue is linked
to not using --with-llvm. Indeed, I used to use --with-llvm but removed it some
time ago for testing.
So the failed build was not using --with-llvm and was relying on an old version
of llvmjit.so (built the last time I used --with-llvm)...

Hmm ... I don't understand why a non-JIT build of the core would
attempt to load llvmjit.so. Seems like trouble waiting to happen,
given how closely coupled the core and JIT are. (The .bc files are
pretty much guaranteed to be out of sync in such a case.)

To support a) packaging postgres with split-out JIT support b) pluggability of
JIT backends, the only way we detect if JIT is supported is by trying to load
the configured JIT backend (jit_provider GUC).
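
A minimal sketch of how to see what the server reports, assuming a running
server reachable with default psql connection settings. The wrapper name
jit_status and the PSQL override variable are hypothetical; jit_provider and
pg_jit_available() are the existing GUC and SQL function.

```shell
#!/bin/sh
# Hypothetical helper: print the configured JIT provider, then whether the
# server considers JIT available (provider loadable and jit = on).
# PSQL may be set to use a specific psql binary; defaults to psql on PATH.
jit_status() {
    "${PSQL:-psql}" -X -A -t \
        -c "SHOW jit_provider" \
        -c "SELECT pg_jit_available()"
}

# Typical use:
#   jit_status
```

If this reports JIT as available on a server built without --with-llvm, the
provider library was presumably picked up from an earlier installation.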

Greetings,

Andres Freund

#7 Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Álvaro Herrera (#4)
Re: Failed assertion with jit enabled

Hi,

On Wed, Feb 05, 2025 at 06:58:54PM +0100, Álvaro Herrera wrote:

On 2025-Feb-05, Bertrand Drouvot wrote:

As a default I also always use "maintainer-clean" but it looks like it does not
remove llvmjit.so from the installation directory: is that a miss?

Hmm, "make uninstall" is supposed to remove things from the install
directory, but maintainer-clean is not.

Yeah, I can see that maintainer-clean does remove some stuff in the install
directory but is not exhaustive (so not a valid option).

But if you first configure with
different options and then run uninstall, then it won't remove the
library you had installed previously. Maybe what you really wanted was
to zap the entire install directory.

Yep, will do that now (as also suggested by Tom).

Regards,

--
Bertrand Drouvot