Tips on troubleshooting slow DELETE (suspect cascades)
Hi,
I have a slow (CPU bound) DELETE statement I'm attempting to debug and I
suspect that its actually the ON DELETE CASCADE on the foreign key thats
causing it. I suspect this because the dry-run mode of the same query (a
SELECT instead of DELETE) doesn't suffer the same fate. The statement is
effectively;
# Dry mode
SELECT prune_function(timestamp);
# Destructive mode
DELETE FROM foobar p USING prune_function(timestamp) AS e WHERE p.id =
e.prune_id
The logs seem to hold no information on the progress of the statement but
the CPU is pegged at 100% for hours. The SELECT equivalent runs in under a
minute.
What I need is a way to see into this statement as it executes to confirm
my suspicion - does anyone have any tips on that?
Cheers
Jim
--
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London
On 1/16/24 09:45, Jim Vanns wrote:
Hi,
I have a slow (CPU bound) DELETE statement I'm attempting to debug and I
suspect that its actually the ON DELETE CASCADE on the foreign key thats
causing it. I suspect this because the dry-run mode of the same query (a
SELECT instead of DELETE) doesn't suffer the same fate. The statement is
effectively;# Dry mode
SELECT prune_function(timestamp);
# Destructive mode
DELETE FROM foobar p USING prune_function(timestamp) AS e WHERE p.id
<http://p.id> = e.prune_idThe logs seem to hold no information on the progress of the statement
but the CPU is pegged at 100% for hours. The SELECT equivalent runs in
under a minute.What I need is a way to see into this statement as it executes to
confirm my suspicion - does anyone have any tips on that?
Explain:
https://www.postgresql.org/docs/current/sql-explain.html
It would also be helpful to reply with the table definitions for the
tables. If that is not possible then at least whether there is an index
on the FK reference in the child table(s)?
Cheers
Jim
--
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London
--
Adrian Klaver
adrian.klaver@aklaver.com
Adrian Klaver <adrian.klaver@aklaver.com> writes:
On 1/16/24 09:45, Jim Vanns wrote:
I have a slow (CPU bound) DELETE statement I'm attempting to debug and I
suspect that its actually the ON DELETE CASCADE on the foreign key thats
causing it.
99% of the time, the cause is lack of an index on the foreign key's
referencing columns. We make you have a unique index on the
referenced columns, because otherwise the FK constraint's semantics
are unclear. But you're not required to make one on the other side.
What I need is a way to see into this statement as it executes to
confirm my suspicion - does anyone have any tips on that?
Explain:
https://www.postgresql.org/docs/current/sql-explain.html
Specifically, if EXPLAIN ANALYZE shows a lot of time spent in the
enforcement trigger for the FK, this is likely what's happening.
regards, tom lane
Hi Tom/Adrian.
I should have already stated I did begin with EXPLAIN but given they
don't easily work with (the internals) stored/procedures, it wasn't
useful in this case. Also, I keep having to terminate the statement
because it never runs to completion and produces the plan (at least in
ANALYZE VERBOSE mode anyway).
I have, however, pulled the function body code out and produced an
isolated case that can be EXPLAINED. The table in question is a
curious one since it models a hierarchy as an adjacency list and so
the fkey reference is back to itself (to a primary key - so is an
additional index required?):
CREATE TABLE tree (
ts TIMESTAMPTZ NOT NULL
tree_id BIGINT NOT NULL,
parent_id BIGINT NULL,
--
CONSTRAINT cstr_tree_pky PRIMARY KEY (tree_id) INCLUDE (parent_id),
FOREIGN KEY (parent_id) REFERENCES tree(tree_id)
ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
);
CREATE INDEX ON tree USING BRIN (ts);
The tree table has 95,915,630 rows.
I've not yet got a complete or reliable plan :( I have made a DB copy
and will be dropping the constraint to see what effect that has.
Cheers,
Jim
On Tue, 16 Jan 2024 at 22:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Adrian Klaver <adrian.klaver@aklaver.com> writes:
On 1/16/24 09:45, Jim Vanns wrote:
I have a slow (CPU bound) DELETE statement I'm attempting to debug and I
suspect that its actually the ON DELETE CASCADE on the foreign key thats
causing it.99% of the time, the cause is lack of an index on the foreign key's
referencing columns. We make you have a unique index on the
referenced columns, because otherwise the FK constraint's semantics
are unclear. But you're not required to make one on the other side.What I need is a way to see into this statement as it executes to
confirm my suspicion - does anyone have any tips on that?Explain:
https://www.postgresql.org/docs/current/sql-explain.htmlSpecifically, if EXPLAIN ANALYZE shows a lot of time spent in the
enforcement trigger for the FK, this is likely what's happening.regards, tom lane
--
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London
On 1/18/24 08:37, Jim Vanns wrote:
Hi Tom/Adrian.
I should have already stated I did begin with EXPLAIN but given they
don't easily work with (the internals) stored/procedures, it wasn't
useful in this case. Also, I keep having to terminate the statement
because it never runs to completion and produces the plan (at least in
ANALYZE VERBOSE mode anyway).I have, however, pulled the function body code out and produced an
isolated case that can be EXPLAINED. The table in question is a
I'm guessing the function is prune_function(timestamp)?
What is the function body code?
curious one since it models a hierarchy as an adjacency list and so
the fkey reference is back to itself (to a primary key - so is an
additional index required?):CREATE TABLE tree (
ts TIMESTAMPTZ NOT NULL
tree_id BIGINT NOT NULL,
parent_id BIGINT NULL,
--
CONSTRAINT cstr_tree_pky PRIMARY KEY (tree_id) INCLUDE (parent_id),
FOREIGN KEY (parent_id) REFERENCES tree(tree_id)
ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
);
CREATE INDEX ON tree USING BRIN (ts);The tree table has 95,915,630 rows.
I've not yet got a complete or reliable plan :( I have made a DB copy
and will be dropping the constraint to see what effect that has.Cheers,
--
Adrian Klaver
adrian.klaver@aklaver.com
On Jan 18, 2024, at 9:46 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 1/18/24 08:37, Jim Vanns wrote:
Hi Tom/Adrian.
I should have already stated I did begin with EXPLAIN but given they
don't easily work with (the internals) stored/procedures, it wasn't
useful in this case. Also, I keep having to terminate the statement
because it never runs to completion and produces the plan (at least in
ANALYZE VERBOSE mode anyway).
I have, however, pulled the function body code out and produced an
isolated case that can be EXPLAINED. The table in question is aI'm guessing the function is prune_function(timestamp)?
What is the function body code?
curious one since it models a hierarchy as an adjacency list and so
the fkey reference is back to itself (to a primary key - so is an
additional index required?):
CREATE TABLE tree (
ts TIMESTAMPTZ NOT NULL
tree_id BIGINT NOT NULL,
parent_id BIGINT NULL,
--
CONSTRAINT cstr_tree_pky PRIMARY KEY (tree_id) INCLUDE (parent_id),
FOREIGN KEY (parent_id) REFERENCES tree(tree_id)
ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
);
CREATE INDEX ON tree USING BRIN (ts);
The tree table has 95,915,630 rows.
I've not yet got a complete or reliable plan :( I have made a DB copy
and will be dropping the constraint to see what effect that has.
Cheers,
Have you tried making a table from prune_function() and using that in the delete statement?
Show quoted text
--
Adrian Klaver
adrian.klaver@aklaver.com
Jim Vanns <jvanns@ilm.com> writes:
I should have already stated I did begin with EXPLAIN but given they
don't easily work with (the internals) stored/procedures, it wasn't
useful in this case.
auto_explain with nested statements enabled might help?
regards, tom lane
After dropping the constraint entirely the DELETE completes in 4
minutes (the same time as the dry-run using SELECT against the
function instead of a DELETE). A marked improvement on 3 hours
followed by a pg_cancel_backend()!
Jim
On Thu, 18 Jan 2024 at 16:37, Jim Vanns <jvanns@ilm.com> wrote:
Hi Tom/Adrian.
I should have already stated I did begin with EXPLAIN but given they
don't easily work with (the internals) stored/procedures, it wasn't
useful in this case. Also, I keep having to terminate the statement
because it never runs to completion and produces the plan (at least in
ANALYZE VERBOSE mode anyway).I have, however, pulled the function body code out and produced an
isolated case that can be EXPLAINED. The table in question is a
curious one since it models a hierarchy as an adjacency list and so
the fkey reference is back to itself (to a primary key - so is an
additional index required?):CREATE TABLE tree (
ts TIMESTAMPTZ NOT NULL
tree_id BIGINT NOT NULL,
parent_id BIGINT NULL,
--
CONSTRAINT cstr_tree_pky PRIMARY KEY (tree_id) INCLUDE (parent_id),
FOREIGN KEY (parent_id) REFERENCES tree(tree_id)
ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
);
CREATE INDEX ON tree USING BRIN (ts);The tree table has 95,915,630 rows.
I've not yet got a complete or reliable plan :( I have made a DB copy
and will be dropping the constraint to see what effect that has.Cheers,
Jim
On Tue, 16 Jan 2024 at 22:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Adrian Klaver <adrian.klaver@aklaver.com> writes:
On 1/16/24 09:45, Jim Vanns wrote:
I have a slow (CPU bound) DELETE statement I'm attempting to debug and I
suspect that its actually the ON DELETE CASCADE on the foreign key thats
causing it.99% of the time, the cause is lack of an index on the foreign key's
referencing columns. We make you have a unique index on the
referenced columns, because otherwise the FK constraint's semantics
are unclear. But you're not required to make one on the other side.What I need is a way to see into this statement as it executes to
confirm my suspicion - does anyone have any tips on that?Explain:
https://www.postgresql.org/docs/current/sql-explain.htmlSpecifically, if EXPLAIN ANALYZE shows a lot of time spent in the
enforcement trigger for the FK, this is likely what's happening.regards, tom lane
--
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London
--
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London
Add an index to parent_id. What is likely happening is each time a row is
deleted, it has to scan the entire table to make sure it is not referenced
by any parent_id records.
On Thu, Jan 18, 2024 at 12:04 PM Jim Vanns <jvanns@ilm.com> wrote:
Show quoted text
After dropping the constraint entirely the DELETE completes in 4
minutes (the same time as the dry-run using SELECT against the
function instead of a DELETE). A marked improvement on 3 hours
followed by a pg_cancel_backend()!Jim
On Thu, 18 Jan 2024 at 16:37, Jim Vanns <jvanns@ilm.com> wrote:
Hi Tom/Adrian.
I should have already stated I did begin with EXPLAIN but given they
don't easily work with (the internals) stored/procedures, it wasn't
useful in this case. Also, I keep having to terminate the statement
because it never runs to completion and produces the plan (at least in
ANALYZE VERBOSE mode anyway).I have, however, pulled the function body code out and produced an
isolated case that can be EXPLAINED. The table in question is a
curious one since it models a hierarchy as an adjacency list and so
the fkey reference is back to itself (to a primary key - so is an
additional index required?):CREATE TABLE tree (
ts TIMESTAMPTZ NOT NULL
tree_id BIGINT NOT NULL,
parent_id BIGINT NULL,
--
CONSTRAINT cstr_tree_pky PRIMARY KEY (tree_id) INCLUDE (parent_id),
FOREIGN KEY (parent_id) REFERENCES tree(tree_id)
ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
);
CREATE INDEX ON tree USING BRIN (ts);The tree table has 95,915,630 rows.
I've not yet got a complete or reliable plan :( I have made a DB copy
and will be dropping the constraint to see what effect that has.Cheers,
Jim
On Tue, 16 Jan 2024 at 22:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Adrian Klaver <adrian.klaver@aklaver.com> writes:
On 1/16/24 09:45, Jim Vanns wrote:
I have a slow (CPU bound) DELETE statement I'm attempting to debug
and I
suspect that its actually the ON DELETE CASCADE on the foreign key
thats
causing it.
99% of the time, the cause is lack of an index on the foreign key's
referencing columns. We make you have a unique index on the
referenced columns, because otherwise the FK constraint's semantics
are unclear. But you're not required to make one on the other side.What I need is a way to see into this statement as it executes to
confirm my suspicion - does anyone have any tips on that?Explain:
https://www.postgresql.org/docs/current/sql-explain.htmlSpecifically, if EXPLAIN ANALYZE shows a lot of time spent in the
enforcement trigger for the FK, this is likely what's happening.regards, tom lane
--
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London--
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London