slow DELETE on 12 M row table - Mailing list pgsql-performance

From Janet Jacobsen
Subject slow DELETE on 12 M row table
Date
Msg-id 4A443367.306@lbl.gov
Whole thread Raw
Responses Re: slow DELETE on 12 M row table  (Greg Stark <gsstark@mit.edu>)
List pgsql-performance
Hi.  We are running Postgres 8.3.7 on an eight-processor Linux system.
Because the machine has very little local disk, the database files are on
a file system running GPFS.

The machine is mostly dedicated to processing images.  After the images
are processed, the image attributes and processing parameters are
written to the database.  This is repeated nightly.

Generally, unless image processing is taking place, queries are pretty
fast - ms to seconds.  But deletes are very slow from the largest tables,
which currently have 12 M rows: on the order of four minutes for 60 rows.
We don't have to do a lot of deletions, but do need to be able to do some
from time to time, and generally several thousand at a time.

We also don't have many users - generally no more than one to five
connections at a time.

While waiting for some deletions, I went to search.postgresql.org and
typed "slow delete" in the search field.

Per what I read I tried "explain analyze delete...":
> subtest=> explain analyze delete from table1 where id > 11592550;
>                                                              QUERY
> PLAN
>
------------------------------------------------------------------------------------------------------------------------------------
>  Index Scan using table1_pkey on table1  (cost=0.00..2565.25 rows=1136
> width=6) (actual time=77.819..107.476 rows=50 loops=1)
>    Index Cond: (id > 11592550)
>  Trigger for constraint table2_table1_id_fkey: time=198484.158 calls=50
>  Total runtime: 198591.894 ms
> (4 rows)
which immediately showed me that I had forgotten about the foreign key in
another table that references the primary key in the table where
I am trying to do the deletions: table2.table1_id -> table1.id.

The posts I read  and the time above suggest that I should create an
index on
the foreign key constraint field in table2 , so I am waiting for that
index to be
created.

My questions are:
(1) is my interpretation of the posts correct, i.e., if I am deleting
rows from
table1, where the pkey of table 1 is a fkey in table 2, then do I need
to create an
index on the fkey field in table 2?
(2) do you have any suggestions on how I can determine why it is taking
several hours to create an index on a field in a table with 12 M rows?  does
that seem like a reasonable amount of time?  I have maintenance_work_mem
set to 512MB - is that too low, or is that the wrong config parameter to
change?
[ps aux shows "CREATE INDEX waiting"; there is nothing (no image processing)
running on the machine at this time]
(3) would I be better off dropping the foreign keys?  in general, is it
workable to
have foreign keys on tables with > 100 M rows (assuming I create all of
the 'right'
indexes)?

Thank you,
Janet







pgsql-performance by date:

Previous
From: Greg Stark
Date:
Subject: Re: Nested Loop "Killer" on 8.1
Next
From: Greg Stark
Date:
Subject: Re: slow DELETE on 12 M row table