Thread: Tips on troubleshooting slow DELETE (suspect cascades)
Hi,
--
I have a slow (CPU bound) DELETE statement I'm attempting to debug and I suspect that its actually the ON DELETE CASCADE on the foreign key thats causing it. I suspect this because the dry-run mode of the same query (a SELECT instead of DELETE) doesn't suffer the same fate. The statement is effectively;
# Dry mode
SELECT prune_function(timestamp);
# Destructive mode
DELETE FROM foobar p USING prune_function(timestamp) AS e WHERE p.id = e.prune_id
The logs seem to hold no information on the progress of the statement but the CPU is pegged at 100% for hours. The SELECT equivalent runs in under a minute.
What I need is a way to see into this statement as it executes to confirm my suspicion - does anyone have any tips on that?
Cheers
Jim
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London
Principal Production Engineer
Industrial Light & Magic, London
On 1/16/24 09:45, Jim Vanns wrote: > Hi, > > I have a slow (CPU bound) DELETE statement I'm attempting to debug and I > suspect that its actually the ON DELETE CASCADE on the foreign key thats > causing it. I suspect this because the dry-run mode of the same query (a > SELECT instead of DELETE) doesn't suffer the same fate. The statement is > effectively; > > # Dry mode > SELECT prune_function(timestamp); > # Destructive mode > DELETE FROM foobar p USING prune_function(timestamp) AS e WHERE p.id > <http://p.id> = e.prune_id > > The logs seem to hold no information on the progress of the statement > but the CPU is pegged at 100% for hours. The SELECT equivalent runs in > under a minute. > > What I need is a way to see into this statement as it executes to > confirm my suspicion - does anyone have any tips on that? Explain: https://www.postgresql.org/docs/current/sql-explain.html It would also be helpful to reply with the table definitions for the tables. If that is not possible then at least whether there is an index on the FK reference in the child table(s)? > > Cheers > > Jim > > -- > Jim Vanns > Principal Production Engineer > Industrial Light & Magic, London -- Adrian Klaver adrian.klaver@aklaver.com
Adrian Klaver <adrian.klaver@aklaver.com> writes: > On 1/16/24 09:45, Jim Vanns wrote: >> I have a slow (CPU bound) DELETE statement I'm attempting to debug and I >> suspect that its actually the ON DELETE CASCADE on the foreign key thats >> causing it. 99% of the time, the cause is lack of an index on the foreign key's referencing columns. We make you have a unique index on the referenced columns, because otherwise the FK constraint's semantics are unclear. But you're not required to make one on the other side. >> What I need is a way to see into this statement as it executes to >> confirm my suspicion - does anyone have any tips on that? > Explain: > https://www.postgresql.org/docs/current/sql-explain.html Specifically, if EXPLAIN ANALYZE shows a lot of time spent in the enforcement trigger for the FK, this is likely what's happening. regards, tom lane
Hi Tom/Adrian. I should have already stated I did begin with EXPLAIN but given they don't easily work with (the internals) stored/procedures, it wasn't useful in this case. Also, I keep having to terminate the statement because it never runs to completion and produces the plan (at least in ANALYZE VERBOSE mode anyway). I have, however, pulled the function body code out and produced an isolated case that can be EXPLAINED. The table in question is a curious one since it models a hierarchy as an adjacency list and so the fkey reference is back to itself (to a primary key - so is an additional index required?): CREATE TABLE tree ( ts TIMESTAMPTZ NOT NULL tree_id BIGINT NOT NULL, parent_id BIGINT NULL, -- CONSTRAINT cstr_tree_pky PRIMARY KEY (tree_id) INCLUDE (parent_id), FOREIGN KEY (parent_id) REFERENCES tree(tree_id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED ); CREATE INDEX ON tree USING BRIN (ts); The tree table has 95,915,630 rows. I've not yet got a complete or reliable plan :( I have made a DB copy and will be dropping the constraint to see what effect that has. Cheers, Jim On Tue, 16 Jan 2024 at 22:16, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Adrian Klaver <adrian.klaver@aklaver.com> writes: > > On 1/16/24 09:45, Jim Vanns wrote: > >> I have a slow (CPU bound) DELETE statement I'm attempting to debug and I > >> suspect that its actually the ON DELETE CASCADE on the foreign key thats > >> causing it. > > 99% of the time, the cause is lack of an index on the foreign key's > referencing columns. We make you have a unique index on the > referenced columns, because otherwise the FK constraint's semantics > are unclear. But you're not required to make one on the other side. > > >> What I need is a way to see into this statement as it executes to > >> confirm my suspicion - does anyone have any tips on that? > > > Explain: > > https://www.postgresql.org/docs/current/sql-explain.html > > Specifically, if EXPLAIN ANALYZE shows a lot of time spent in the > enforcement trigger for the FK, this is likely what's happening. > > regards, tom lane -- Jim Vanns Principal Production Engineer Industrial Light & Magic, London
On 1/18/24 08:37, Jim Vanns wrote: > Hi Tom/Adrian. > > I should have already stated I did begin with EXPLAIN but given they > don't easily work with (the internals) stored/procedures, it wasn't > useful in this case. Also, I keep having to terminate the statement > because it never runs to completion and produces the plan (at least in > ANALYZE VERBOSE mode anyway). > > I have, however, pulled the function body code out and produced an > isolated case that can be EXPLAINED. The table in question is a I'm guessing the function is prune_function(timestamp)? What is the function body code? > curious one since it models a hierarchy as an adjacency list and so > the fkey reference is back to itself (to a primary key - so is an > additional index required?): > > CREATE TABLE tree ( > ts TIMESTAMPTZ NOT NULL > tree_id BIGINT NOT NULL, > parent_id BIGINT NULL, > -- > CONSTRAINT cstr_tree_pky PRIMARY KEY (tree_id) INCLUDE (parent_id), > FOREIGN KEY (parent_id) REFERENCES tree(tree_id) > ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED > ); > CREATE INDEX ON tree USING BRIN (ts); > > The tree table has 95,915,630 rows. > > I've not yet got a complete or reliable plan :( I have made a DB copy > and will be dropping the constraint to see what effect that has. > > Cheers, > -- Adrian Klaver adrian.klaver@aklaver.com
> On Jan 18, 2024, at 9:46 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > On 1/18/24 08:37, Jim Vanns wrote: >> Hi Tom/Adrian. >> I should have already stated I did begin with EXPLAIN but given they >> don't easily work with (the internals) stored/procedures, it wasn't >> useful in this case. Also, I keep having to terminate the statement >> because it never runs to completion and produces the plan (at least in >> ANALYZE VERBOSE mode anyway). >> I have, however, pulled the function body code out and produced an >> isolated case that can be EXPLAINED. The table in question is a > > I'm guessing the function is prune_function(timestamp)? > > What is the function body code? > >> curious one since it models a hierarchy as an adjacency list and so >> the fkey reference is back to itself (to a primary key - so is an >> additional index required?): >> CREATE TABLE tree ( >> ts TIMESTAMPTZ NOT NULL >> tree_id BIGINT NOT NULL, >> parent_id BIGINT NULL, >> -- >> CONSTRAINT cstr_tree_pky PRIMARY KEY (tree_id) INCLUDE (parent_id), >> FOREIGN KEY (parent_id) REFERENCES tree(tree_id) >> ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED >> ); >> CREATE INDEX ON tree USING BRIN (ts); >> The tree table has 95,915,630 rows. >> I've not yet got a complete or reliable plan :( I have made a DB copy >> and will be dropping the constraint to see what effect that has. >> Cheers, > Have you tried making a table from prune_function() and using that in the delete statement? > > -- > Adrian Klaver > adrian.klaver@aklaver.com > > >
Jim Vanns <jvanns@ilm.com> writes: > I should have already stated I did begin with EXPLAIN but given they > don't easily work with (the internals) stored/procedures, it wasn't > useful in this case. auto_explain with nested statements enabled might help? regards, tom lane
After dropping the constraint entirely the DELETE completes in 4 minutes (the same time as the dry-run using SELECT against the function instead of a DELETE). A marked improvement on 3 hours followed by a pg_cancel_backend()! Jim On Thu, 18 Jan 2024 at 16:37, Jim Vanns <jvanns@ilm.com> wrote: > > Hi Tom/Adrian. > > I should have already stated I did begin with EXPLAIN but given they > don't easily work with (the internals) stored/procedures, it wasn't > useful in this case. Also, I keep having to terminate the statement > because it never runs to completion and produces the plan (at least in > ANALYZE VERBOSE mode anyway). > > I have, however, pulled the function body code out and produced an > isolated case that can be EXPLAINED. The table in question is a > curious one since it models a hierarchy as an adjacency list and so > the fkey reference is back to itself (to a primary key - so is an > additional index required?): > > CREATE TABLE tree ( > ts TIMESTAMPTZ NOT NULL > tree_id BIGINT NOT NULL, > parent_id BIGINT NULL, > -- > CONSTRAINT cstr_tree_pky PRIMARY KEY (tree_id) INCLUDE (parent_id), > FOREIGN KEY (parent_id) REFERENCES tree(tree_id) > ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED > ); > CREATE INDEX ON tree USING BRIN (ts); > > The tree table has 95,915,630 rows. > > I've not yet got a complete or reliable plan :( I have made a DB copy > and will be dropping the constraint to see what effect that has. > > Cheers, > > Jim > > On Tue, 16 Jan 2024 at 22:16, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Adrian Klaver <adrian.klaver@aklaver.com> writes: > > > On 1/16/24 09:45, Jim Vanns wrote: > > >> I have a slow (CPU bound) DELETE statement I'm attempting to debug and I > > >> suspect that its actually the ON DELETE CASCADE on the foreign key thats > > >> causing it. > > > > 99% of the time, the cause is lack of an index on the foreign key's > > referencing columns. We make you have a unique index on the > > referenced columns, because otherwise the FK constraint's semantics > > are unclear. But you're not required to make one on the other side. > > > > >> What I need is a way to see into this statement as it executes to > > >> confirm my suspicion - does anyone have any tips on that? > > > > > Explain: > > > https://www.postgresql.org/docs/current/sql-explain.html > > > > Specifically, if EXPLAIN ANALYZE shows a lot of time spent in the > > enforcement trigger for the FK, this is likely what's happening. > > > > regards, tom lane > > > > -- > Jim Vanns > Principal Production Engineer > Industrial Light & Magic, London -- Jim Vanns Principal Production Engineer Industrial Light & Magic, London
Add an index to parent_id. What is likely happening is each time a row is deleted, it has to scan the entire table to make sure it is not referenced by any parent_id records.
On Thu, Jan 18, 2024 at 12:04 PM Jim Vanns <jvanns@ilm.com> wrote:
After dropping the constraint entirely the DELETE completes in 4
minutes (the same time as the dry-run using SELECT against the
function instead of a DELETE). A marked improvement on 3 hours
followed by a pg_cancel_backend()!
Jim
On Thu, 18 Jan 2024 at 16:37, Jim Vanns <jvanns@ilm.com> wrote:
>
> Hi Tom/Adrian.
>
> I should have already stated I did begin with EXPLAIN but given they
> don't easily work with (the internals) stored/procedures, it wasn't
> useful in this case. Also, I keep having to terminate the statement
> because it never runs to completion and produces the plan (at least in
> ANALYZE VERBOSE mode anyway).
>
> I have, however, pulled the function body code out and produced an
> isolated case that can be EXPLAINED. The table in question is a
> curious one since it models a hierarchy as an adjacency list and so
> the fkey reference is back to itself (to a primary key - so is an
> additional index required?):
>
> CREATE TABLE tree (
> ts TIMESTAMPTZ NOT NULL
> tree_id BIGINT NOT NULL,
> parent_id BIGINT NULL,
> --
> CONSTRAINT cstr_tree_pky PRIMARY KEY (tree_id) INCLUDE (parent_id),
> FOREIGN KEY (parent_id) REFERENCES tree(tree_id)
> ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
> );
> CREATE INDEX ON tree USING BRIN (ts);
>
> The tree table has 95,915,630 rows.
>
> I've not yet got a complete or reliable plan :( I have made a DB copy
> and will be dropping the constraint to see what effect that has.
>
> Cheers,
>
> Jim
>
> On Tue, 16 Jan 2024 at 22:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Adrian Klaver <adrian.klaver@aklaver.com> writes:
> > > On 1/16/24 09:45, Jim Vanns wrote:
> > >> I have a slow (CPU bound) DELETE statement I'm attempting to debug and I
> > >> suspect that its actually the ON DELETE CASCADE on the foreign key thats
> > >> causing it.
> >
> > 99% of the time, the cause is lack of an index on the foreign key's
> > referencing columns. We make you have a unique index on the
> > referenced columns, because otherwise the FK constraint's semantics
> > are unclear. But you're not required to make one on the other side.
> >
> > >> What I need is a way to see into this statement as it executes to
> > >> confirm my suspicion - does anyone have any tips on that?
> >
> > > Explain:
> > > https://www.postgresql.org/docs/current/sql-explain.html
> >
> > Specifically, if EXPLAIN ANALYZE shows a lot of time spent in the
> > enforcement trigger for the FK, this is likely what's happening.
> >
> > regards, tom lane
>
>
>
> --
> Jim Vanns
> Principal Production Engineer
> Industrial Light & Magic, London
--
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London