Hung Query with No Locking Issues

Started by Michael P. McDonnell · almost 3 years ago · 4 messages · general
#1 Michael P. McDonnell
bzaks1424@gmail.com

Hey team -
I have 2 stored procedures that need to run back to back. They could
be combined into a single one, but it's easier from a maintenance
perspective to keep them separate.

The first procedure is effectively:
INSERT INTO table_b () SELECT ____ FROM table_a;
COMMIT;

Total execution time is about 180s. Nothing in pg_locks and nothing in
pg_stat_activity suggests anything is left hanging.

The second procedure mutates table_b data into table_b_collapsed:
INSERT INTO table_c () SELECT _____ FROM table_b JOIN table_b AS b1
JOIN table_b AS b2 JOIN table_b AS b3, etc...;
COMMIT;

The first time I run my second stored procedure, it hangs for up to 8
hours.
If I immediately cancel and re-run the second stored procedure, it runs in
2 seconds.
If I let the query hang and re-run THE EXACT SAME QUERY in another tab of
DBeaver, it runs in 2 seconds.

During the hang there are no ungranted locks in pg_locks, and nothing
unusual in pg_stat_activity that I know how to look for. CPU is pinned at
100% for the postgres process for the duration of the hung query as well.

Speaking of cpu, this is running on a VMware VM with 8 CPUs and 192 GB of
RAM as a solo VM on a 256 GB / 16 CPU host with SSDs backing the DB locally
on the server (SATA 3 speeds I'm sure) and storage is sitting comfortably
at 20% used and 1% inode utilization.

Not really sure what I'm missing here; I was having these problems even
before these were stored procedures. I was hoping that issuing the COMMIT
immediately after the query would solve the problem. Any advice or
pointers would be appreciated.
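For reference, the checks I've been running look roughly like this (just
the standard catalog views; both come back empty/unremarkable during the
hang):

```sql
-- Any ungranted locks? (none show up during the hang)
SELECT locktype, relation::regclass, mode, pid, granted
FROM pg_locks
WHERE NOT granted;

-- What are the active backends doing or waiting on?
SELECT pid, state, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE state <> 'idle';
```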

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael P. McDonnell (#1)
Re: Hung Query with No Locking Issues

"Michael P. McDonnell" <bzaks1424@gmail.com> writes:

> I have 2 stored procedures that need to run back to back. It could
> convert to a single one - but it's easier from a maintenance perspective
> to keep them separated.
>
> The first procedure effectively is
> INSERT INTO table_b () SELECT ____ FROM table_a;
> COMMIT;
>
> Total execution time - about 180s. Nothing in the pg_locks table and
> nothing in the pg_stat_activity table suggests anything is hung over.
>
> The second procedure mutates table_b data into table_b_collapsed
> INSERT INTO table_c () SELECT _____ FROM table_b JOIN table_b AS b1
> JOIN table_b AS b2 JOIN table_b AS b3, etc...;
> COMMIT;
>
> The first time I run my second stored procedure - it hangs for up to 8
> hours.
> If I immediately cancel and re-run the second stored procedure it runs
> in 2 seconds.

Perhaps an "ANALYZE table_b" in between would help.

regards, tom lane

#3 Michael P. McDonnell
bzaks1424@gmail.com
In reply to: Tom Lane (#2)
Re: Hung Query with No Locking Issues

Okay - that worked.
How did you know that would work? That's incredible.

On Sun, May 7, 2023 at 4:25 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Perhaps an "ANALYZE table_b" in between would help.

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael P. McDonnell (#3)
Re: Hung Query with No Locking Issues

"Michael P. McDonnell" <bzaks1424@gmail.com> writes:

> On Sun, May 7, 2023 at 4:25 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Perhaps an "ANALYZE table_b" in between would help.
>
> Okay - that worked.
> How did you know that would work? That's incredible.

Well, I guessed ;-) ... but it was a well-educated guess.

The symptoms sounded like you were getting a bad plan choice for
the second query, which is unsurprising if you populate a table
from empty and don't give auto-analyze enough time to collect
some statistics behind your back.

As a solution going forward, I'd recommend adding the ANALYZE
as the last step in the table-populating procedure.
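Concretely, something like this sketch (the procedure name is a
placeholder, and the column lists are elided just as in your message):

```sql
CREATE OR REPLACE PROCEDURE populate_table_b()
LANGUAGE plpgsql
AS $$
BEGIN
    -- populate table_b from table_a (real column lists elided)
    INSERT INTO table_b SELECT * FROM table_a;
    COMMIT;
    -- Refresh planner statistics right away, rather than waiting for
    -- auto-analyze, so the follow-on query gets a sensible plan.
    ANALYZE table_b;
END;
$$;
```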

regards, tom lane