fairywren exiting in ecpg
Hi,
Looks like fairywren is possibly seeing something I saw before and spent many
days looking into:
/messages/by-id/20220909235836.lz3igxtkcjb5w7zb@awork3.anarazel.de
which led me to add the following to .cirrus.yml:
# Cirrus defaults to SetErrorMode(SEM_NOGPFAULTERRORBOX | ...). That
# prevents crash reporting from working unless binaries do SetErrorMode()
# themselves. Furthermore, it appears that either python or, more likely,
# the C runtime has a bug where SEM_NOGPFAULTERRORBOX can very
# occasionally *trigger* a crash on process exit - which is hard to debug,
# given that it explicitly prevents crash dumps from working...
# 0x8001 is SEM_FAILCRITICALERRORS | SEM_NOOPENFILEERRORBOX
CIRRUS_WINDOWS_ERROR_MODE: 0x8001
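
To make the bit arithmetic in that comment concrete, here is a minimal sketch that decodes an error-mode value into its SEM_* flag names. The numeric constants are the documented Windows SetErrorMode() values; the helper name describe_error_mode is hypothetical and exists only for illustration.

```python
# Windows SetErrorMode() flag values (from the Windows API headers).
SEM_FLAGS = {
    "SEM_FAILCRITICALERRORS": 0x0001,
    "SEM_NOGPFAULTERRORBOX": 0x0002,
    "SEM_NOALIGNMENTFAULTEXCEPT": 0x0004,
    "SEM_NOOPENFILEERRORBOX": 0x8000,
}


def describe_error_mode(value):
    """Return the names of the SEM_* flags set in an error-mode value.

    Hypothetical helper, not part of any PostgreSQL or Cirrus tooling.
    """
    return [name for name, bit in SEM_FLAGS.items() if value & bit]


if __name__ == "__main__":
    # The value used for CIRRUS_WINDOWS_ERROR_MODE in .cirrus.yml.
    print(describe_error_mode(0x8001))
```

The point of 0x8001 is that it leaves out bit 0x0002 (SEM_NOGPFAULTERRORBOX), so crash reporting is not suppressed.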
The mingw folks also spent a lot of time looking into this
(https://github.com/msys2/MINGW-packages/issues/11864), without a lot of
success.
It sure looks like it might be a Windows C runtime issue - none of the
stacktrace handling Python has gets invoked. I could not find any relevant
behavioural differences in Python's code that depend on SEM_NOGPFAULTERRORBOX
being set.
It'd be interesting to see if fairywren's occasional failures go away if you
set MSYS=winjitdebug (which prevents msys from adding SEM_NOGPFAULTERRORBOX to
ErrorMode).
Greetings,
Andres Freund
On 2023-04-03 Mo 21:15, Andres Freund wrote:
It'd be interesting to see if fairywren's occasional failures go away if you
set MSYS=winjitdebug (which prevents msys from adding SEM_NOGPFAULTERRORBOX to
ErrorMode).
trying now. Since this happened every build or so it shouldn't take long
for us to see.
(I didn't see anything in the MSYS2 docs that specified the possible
values for MSYS :-( )
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On 2023-04-04 Tu 08:22, Andrew Dunstan wrote:
trying now. Since this happened every build or so it shouldn't take long
for us to see.
The error hasn't been seen since I set this about a week ago.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Hi,
On 2023-04-11 07:10:20 -0400, Andrew Dunstan wrote:
The error hasn't been seen since I set this about a week ago.
This issue really bothers me, but I am at my wits' end as to how to debug it,
given that we get a segfault only if we *disable* getting crash reports / core
dumps in some form. There's no debug printout or anything, Python just exits
with an error code indicating an access violation.
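
For readers trying to spot this failure in build logs, a minimal sketch of how such an exit code can be recognised. It assumes the code in question is the NTSTATUS value STATUS_ACCESS_VIOLATION (0xC0000005), which the thread does not state explicitly; the helper names are hypothetical. Depending on the tool, the same code may be printed as an unsigned or a signed 32-bit number.

```python
# NTSTATUS value Windows uses for an access violation (assumed to be the
# exit code referred to above; the thread does not quote the number).
STATUS_ACCESS_VIOLATION = 0xC0000005


def as_signed32(code):
    """Reinterpret an exit code as a signed 32-bit integer."""
    code &= 0xFFFFFFFF
    return code - 0x100000000 if code & 0x80000000 else code


def is_access_violation(returncode):
    """True if returncode is STATUS_ACCESS_VIOLATION in either signed form."""
    return (returncode & 0xFFFFFFFF) == STATUS_ACCESS_VIOLATION


if __name__ == "__main__":
    print(STATUS_ACCESS_VIOLATION, as_signed32(STATUS_ACCESS_VIOLATION))
```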
Greetings,
Andres Freund