BUG #17636: terminating connection because of crash of another server process

Started by PG Bug reporting form over 3 years ago | 4 messages | bugs
#1 PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 17636
Logged by: Sujeet Swaminath
Email address: sujeet.chaurasia@lisec.com
PostgreSQL version: 12.11
Operating system: Windows
Description:

We are facing this issue with PostgreSQL 12.11 on Windows only; on Linux it
works fine.

If the executable that is using the database session crashes, the entire
database goes into recovery mode and restarts. In the PostgreSQL log, we can
find the messages below.

"2022-10-06 17:44:09.210 CEST [8860] LOG: server process (PID 9980) exited
with exit code 0
2022-10-06 17:44:09.210 CEST [8860] LOG: terminating any other active
server processes

2022-10-06 17:44:09.211 CEST [9992] WARNING: terminating the connection
because of the crash of another server process

2022-10-06 17:44:09.211 CEST [9992] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit because
another server process exited abnormally and possibly corrupted shared
memory.

2022-10-06 17:44:09.211 CEST [9992] HINT: In a moment you should be able to
reconnect to the database and repeat your command. "
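
As a rough illustration of how log entries like the ones above can be scanned for the server process that exited, here is a minimal Python sketch. The function name `find_child_exits` and the regular expression are illustrative assumptions based only on the log format shown in this report; the sketch recognises only the "exited with exit code" form of the message.

```python
import re

# Matches the postmaster entry that reports a child exit, for example
# "[8860] LOG: server process (PID 9980) exited with exit code 0".
# Assumption: the log line prefix contains the logging PID in square brackets.
EXIT_RE = re.compile(
    r"\[(?P<logger_pid>\d+)\]\s+LOG:\s+server process \(PID (?P<pid>\d+)\)\s+"
    r"exited with exit code (?P<code>\d+)"
)


def find_child_exits(log_text):
    """Return (logging_pid, exited_pid, exit_code) for each reported exit."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    return [
        (int(m["logger_pid"]), int(m["pid"]), int(m["code"]))
        for m in EXIT_RE.finditer(flat)
    ]


sample = """2022-10-06 17:44:09.210 CEST [8860] LOG: server process (PID 9980) exited
with exit code 0
2022-10-06 17:44:09.210 CEST [8860] LOG: terminating any other active
server processes"""

print(find_child_exits(sample))  # -> [(8860, 9980, 0)]
```

Knowing the PID of the process that exited (9980 here) makes it easier to search the rest of the log for earlier entries from that same process.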

#2 Julien Rouhaud
rjuju123@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #17636: terminating connection because of crash of another server process

Hi,

On Wed, Oct 12, 2022 at 11:22:26AM +0000, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference: 17636
Logged by: Sujeet Swaminath
Email address: sujeet.chaurasia@lisec.com
PostgreSQL version: 12.11
Operating system: Windows
Description:

We are facing this issue with PostgreSQL 12.11 on Windows only; on Linux it
works fine.

If the executable that is using the database session crashes, the entire
database goes into recovery mode and restarts. In the PostgreSQL log, we can
find the messages below.

"2022-10-06 17:44:09.210 CEST [8860] LOG: server process (PID 9980) exited
with exit code 0
2022-10-06 17:44:09.210 CEST [8860] LOG: terminating any other active
server processes

2022-10-06 17:44:09.211 CEST [9992] WARNING: terminating the connection
because of the crash of another server process

2022-10-06 17:44:09.211 CEST [9992] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit because
another server process exited abnormally and possibly corrupted shared
memory.

2022-10-06 17:44:09.211 CEST [9992] HINT: In a moment you should be able to
reconnect to the database and repeat your command. "

This unfortunately isn't enough information to understand what's going on.

First, is the problem still happening if you update to version 12.12?

Also, do you have any extensions or modules configured? You haven't shown the
log entries for the process that originally failed, including the query it was
executing if it was a regular backend.

Can you manually reproduce the problem, and/or get a stack trace of the
problem? See
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows
(and possibly
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows#Using_crash_dumps_to_debug_random_and_unpredictable_backend_crashes)
for more details on how to do that.

#3 Chaurasia Sujeet
Sujeet.Chaurasia@lisec.com
In reply to: Julien Rouhaud (#2)
RE: BUG #17636: terminating connection because of crash of another server process

Hi,

We've upgraded to PostgreSQL 13, and no extensions are configured on the server, yet it crashed again. There is no antivirus software on the server.

Because the problem happens randomly, we are not able to reproduce it ourselves, and the log shows no query or other detail about what caused the crash. We just get a message that one postgres.exe process exited with exit code 0, which makes it difficult even to troubleshoot.

Following the article you shared, we set up minidump collection for the random crash, but we didn't find a dump in the crashdumps directory.

We followed the part of the article quoted below. According to the article, the crash dump handler is already included in PostgreSQL, so any crash dump should be generated in the crashdumps directory.

If the crashes appear to be random and you don't know how to trigger them, it's hard to connect a debugger to the problem postgres.exe before it crashes.
Setting your debugger as the JIT (just-in-time) or post-mortem debugger won't help you, because PostgreSQL generally runs as a service under a different user account that cannot interact with the desktop. You could always initdb a new cluster under your normal user account and use pg_ctl to start the postmaster with that cluster manually, so you can JIT debug under your own user account where Pg can interact with the desktop. This isn't suitable for production use, though, and you might not be able to reproduce the problem that way.
In PostgreSQL 9.0 and above there is a crash dump handler <http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/port/win32/crashdump.c;h=7550fa6f26b82d6fc41f5f68afb35ec44d25d00b;hb=HEAD> included in PostgreSQL. To use it:
* Create a directory named crashdumps (all lower case) in the PostgreSQL data directory (as shown by SHOW data_directory; in psql)
* Give the PostgreSQL user (postgres by default) "full control" of it in the security tab of the folder properties
* Run the problem code. You don't need to restart Pg or change any settings.
* When a backend crashes, a Windows minidump should be created in the crashdumps directory.
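
The directory steps above could be scripted roughly as follows. This is a minimal Python sketch, not part of the wiki article: the helper names `prepare_crashdumps` and `list_minidumps` are hypothetical, the `.mdmp` file extension is an assumption about how the dump files are named, and the sketch does not grant the "full control" permission, which still has to be set separately for the PostgreSQL service account.

```python
import tempfile
from pathlib import Path


def prepare_crashdumps(data_directory):
    """Create <data_directory>/crashdumps if it is missing and return its path."""
    # The directory name must be all lower case, as the steps above require.
    dump_dir = Path(data_directory) / "crashdumps"
    dump_dir.mkdir(exist_ok=True)
    return dump_dir


def list_minidumps(dump_dir):
    """Return dump files in dump_dir, oldest first by modification time."""
    # Assumption: minidumps are written with a .mdmp extension.
    return sorted(Path(dump_dir).glob("*.mdmp"), key=lambda p: p.stat().st_mtime)


# Demonstration against a throwaway directory. On a real server, pass the
# path reported by "SHOW data_directory;" instead.
with tempfile.TemporaryDirectory() as demo_data_dir:
    dumps_path = prepare_crashdumps(demo_data_dir)
    print(dumps_path.name, list_minidumps(dumps_path))
```

Running the listing step after each reported crash shows whether a dump was written at all.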

Please let us know whether any additional step is needed to generate a crash dump, as the issue is random and we don't know what is making postgres.exe crash.

Thanks,
Sujeet

-----Original Message-----
From: Julien Rouhaud <rjuju123@gmail.com>
Sent: Monday, October 17, 2022 7:49 AM
To: Chaurasia Sujeet <Sujeet.Chaurasia@lisec.com>; pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #17636: terminating connection because of crash of another server process

Hi,

On Wed, Oct 12, 2022 at 11:22:26AM +0000, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference: 17636
Logged by: Sujeet Swaminath
Email address: sujeet.chaurasia@lisec.com
PostgreSQL version: 12.11
Operating system: Windows
Description:

We are facing this issue with PostgreSQL 12.11 on Windows only; on Linux it
works fine.

If the executable that is using the database session crashes, the entire
database goes into recovery mode and restarts. In the PostgreSQL log, we can
find the messages below.

"2022-10-06 17:44:09.210 CEST [8860] LOG: server process (PID 9980)
exited with exit code 0
2022-10-06 17:44:09.210 CEST [8860] LOG: terminating any other active
server processes

2022-10-06 17:44:09.211 CEST [9992] WARNING: terminating the
connection because of the crash of another server process

2022-10-06 17:44:09.211 CEST [9992] DETAIL: The postmaster has
commanded this server process to roll back the current transaction and
exit because another server process exited abnormally and possibly
corrupted shared memory.

2022-10-06 17:44:09.211 CEST [9992] HINT: In a moment you should be
able to reconnect to the database and repeat your command. "

This unfortunately isn't enough information to understand what's going on.

First, is the problem still happening if you update to version 12.12?

Also, do you have any extensions or modules configured? You haven't shown the log entries for the process that originally failed, including the query it was executing if it was a regular backend.

Can you manually reproduce the problem, and/or get a stack trace of the problem? See
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows
(and possibly
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows#Using_crash_dumps_to_debug_random_and_unpredictable_backend_crashes)
for more details on how to do that.

#4 Julien Rouhaud
rjuju123@gmail.com
In reply to: Chaurasia Sujeet (#3)
Re: BUG #17636: terminating connection because of crash of another server process

On Wed, Nov 02, 2022 at 07:13:29AM +0000, Chaurasia Sujeet wrote:

We've upgraded to PostgreSQL 13, and no extensions are configured on the
server, yet it crashed again. There is no antivirus software on the server.

Because the problem happens randomly, we are not able to reproduce it
ourselves, and the log shows no query or other detail about what caused the
crash. We just get a message that one postgres.exe process exited with exit
code 0, which makes it difficult even to troubleshoot.

Following the article you shared, we set up minidump collection for the
random crash, but we didn't find a dump in the crashdumps directory.

We followed the part of the article quoted below. According to the article,
the crash dump handler is already included in PostgreSQL, so any crash dump
should be generated in the crashdumps directory.

If the crashes appear to be random and you don't know how to trigger them,
it's hard to connect a debugger to the problem postgres.exe before it
crashes. Setting your debugger as the JIT (just-in-time) or post-mortem
debugger won't help you, because PostgreSQL generally runs as a service under
a different user account that cannot interact with the desktop. You could
always initdb a new cluster under your normal user account and use pg_ctl to
start the postmaster with that cluster manually, so you can JIT debug under
your own user account where Pg can interact with the desktop. This isn't
suitable for production use, though, and you might not be able to reproduce
the problem that way.

In PostgreSQL 9.0 and above there is a crash dump handler
<http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/port/win32/crashdump.c;h=7550fa6f26b82d6fc41f5f68afb35ec44d25d00b;hb=HEAD>
included in PostgreSQL. To use it:
* Create a directory named crashdumps (all lower case) in the PostgreSQL data
  directory (as shown by SHOW data_directory; in psql)
* Give the PostgreSQL user (postgres by default) "full control" of it in the
  security tab of the folder properties
* Run the problem code. You don't need to restart Pg or change any settings.
* When a backend crashes, a Windows minidump should be created in the
  crashdumps directory.

Please let us know whether any additional step is needed to generate a crash
dump, as the issue is random and we don't know what is making postgres.exe
crash.

Unfortunately I'm not a Windows user myself, so I have no idea how to generate
a core dump on that platform. If the instructions on the wiki don't work, and
since no one else seems to have an answer, you could ask Microsoft or a
Windows-focused forum.