Postmaster Crashing - Postgres 11 when JIT is enabled

Started by Mukesh Chhatani almost 6 years ago | 3 messages | bugs
#1 Mukesh Chhatani
chhatani.mukesh@gmail.com

Hello Team,

We are seeing the postmaster crash when a simple query is cancelled.

Postgres Version - 11.6
Environment - AWS RDS
Table Size - 50 GB

A few ways I worked around the issue:
1. Vacuum analyze the table and then execute the query and cancel it.
2. Disable JIT
3. Set max_parallel_workers_per_gather = 0
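
As a rough sketch, the three workarounds above correspond to the following statements. The table name `sessions` is an assumption (inferred from the index names posted later in the thread), and the two `SET` commands only affect the current session.

```sql
-- Workaround 1: vacuum and refresh planner statistics.
-- "sessions" is an assumed table name, not confirmed in this message.
VACUUM (ANALYZE) sessions;

-- Workaround 2: disable JIT compilation for the current session.
SET jit = off;

-- Workaround 3: disable parallel query for the current session.
SET max_parallel_workers_per_gather = 0;
```

Either `SET` can be undone in the same session with `RESET jit;` or `RESET max_parallel_workers_per_gather;`.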

Attached is a file containing the DEBUG logs and the ERROR we are
seeing.

Regards,
Mukesh

Attachments:

postgres11_crash_issue.txt (text/plain; charset=US-ASCII)
#2 Andres Freund
andres@anarazel.de
In reply to: Mukesh Chhatani (#1)
Re: Postmaster Crashing - Postgres 11 when JIT is enabled

Hi,

On 2020-06-09 14:00:06 -0500, Mukesh Chhatani wrote:

> We are seeing the postmaster crash when a simple query is cancelled.
>
> Postgres Version - 11.6
> Environment - AWS RDS
> Table Size - 50 GB

Unfortunately we do not have access to the exact changes RDS has done to
postgres, so we cannot easily debug such problems.

> A few ways I worked around the issue:
> 1. Vacuum analyze the table and then execute the query and cancel it.
> 2. Disable JIT
> 3. Set max_parallel_workers_per_gather = 0
>
> Attached is a file containing the DEBUG logs and the ERROR we are
> seeing.

I am afraid that we need more information to be able to debug this
problem. The best would be a backtrace of the crash, but for that you'd
likely need help from RDS support.

Can you post the exact table definition of the table with the problem?
And an EXPLAIN for the problematic query?

Are you saying that after

> 1. Vacuum analyze the table and then execute the query and cancel it.

the problem does not occur anymore for future queries?
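
For reference, the table definition and plan requested above could be collected in psql roughly as follows. This is only a sketch: the thread does not show the problematic query, so the `SELECT` below is a hypothetical stand-in, and the table name `sessions` is likewise an assumption.

```sql
-- psql meta-command: print the full table definition, including
-- storage settings and indexes ("sessions" is an assumed name).
\d+ sessions

-- Show the plan without executing the query. The SELECT is a
-- hypothetical placeholder for the real problematic query.
EXPLAIN (VERBOSE)
SELECT count(*) FROM sessions;

-- Settings relevant to the reported workarounds.
SHOW jit;
SHOW jit_above_cost;
SHOW max_parallel_workers_per_gather;
```

Plain `EXPLAIN` only plans the query, so it should be safe to run without triggering the cancellation scenario.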

Greetings,

Andres Freund

#3 Mukesh Chhatani
chhatani.mukesh@gmail.com
In reply to: Andres Freund (#2)
Re: Postmaster Crashing - Postgres 11 when JIT is enabled

Thanks for responding to my query.

I have also opened a bug with AWS.

The issue happens with large tables (size > 5 GB) and is consistent across
multiple databases. It does not happen when JIT is disabled or when
max_parallel_workers_per_gather = 0 is set in the session.

I am not running the query to completion. I just start it and then cancel
it partway through with Ctrl+C, which causes the postmaster to crash.
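
A possible way to script the same cancellation from a second session, instead of pressing Ctrl+C in psql, is sketched below. It is an assumption that `pg_cancel_backend` exercises the same path as the Ctrl+C case. The query, the table name `sessions`, and the pid `12345` are all placeholders.

```sql
-- Session A: start the long-running query (hypothetical stand-in).
SELECT count(*) FROM sessions;

-- Session B: find the backend running it.
SELECT pid, state, query
FROM pg_stat_activity
WHERE state = 'active'
  AND backend_type = 'client backend'
  AND pid <> pg_backend_pid();

-- Session B: request cancellation of that backend's current query.
-- Replace 12345 with the pid returned above.
SELECT pg_cancel_backend(12345);
```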

Here is the table definition:

       Column        |           Type           | Collation | Nullable | Default | Storage  | Stats target | Description
---------------------+--------------------------+-----------+----------+---------+----------+--------------+-------------
 id                  | text                     |           | not null |         | extended |              |
 event               | text                     |           | not null |         | extended |              |
 unique_id           | uuid                     |           | not null |         | plain    |              |
 surgate_id          | text                     |           |          |         | extended |              |
 user_agent          | text                     |           |          |         | extended |              |
 code                | text                     |           |          |         | extended |              |
 encrypted.          | text                     |           |          |         | extended |              |
 hash                | text                     |           |          |         | extended |              |
 start_time          | timestamp with time zone |           | not null |         | plain    |              |
 expected_expiration | timestamp with time zone |           |          |         | plain    |              |
 end_time            | timestamp with time zone |           |          |         | plain    |              |
 level               | text                     |           |          |         | extended |              |
Indexes:
    "sessions_pkey" PRIMARY KEY, btree (id)
    "id_date" btree (unique_id, expected_expiration)
    "sessions_rid" btree (unique_id)
    "sessions_surrogate_id" btree (surgate_id)

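For anyone trying to reproduce this outside RDS, the definition above can be approximated with the DDL below. This is a sketch reconstructed from the `\d+` output: the table name `sessions` is an assumption based on the index names, and the column shown as `encrypted.` is kept exactly as posted (quoted, since its name may have been altered before posting).

```sql
-- Assumed table name: "sessions" (inferred from "sessions_pkey").
CREATE TABLE sessions (
    id                  text NOT NULL,
    event               text NOT NULL,
    unique_id           uuid NOT NULL,
    surgate_id          text,
    user_agent          text,
    code                text,
    "encrypted."        text,  -- name kept as posted in the thread
    hash                text,
    start_time          timestamp with time zone NOT NULL,
    expected_expiration timestamp with time zone,
    end_time            timestamp with time zone,
    level               text,
    CONSTRAINT sessions_pkey PRIMARY KEY (id)
);

-- Secondary indexes as listed in the posted definition.
CREATE INDEX id_date ON sessions USING btree (unique_id, expected_expiration);
CREATE INDEX sessions_rid ON sessions USING btree (unique_id);
CREATE INDEX sessions_surrogate_id ON sessions USING btree (surgate_id);
```
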
Thanks & Regards,

Mukesh