Thread: Optimizing DELETE
I've just fired off a "DELETE FROM table" command (i.e. unfiltered DELETE) on a trivially small table but with many foreign key references (on similar-sized tables), and I'm waiting for it to finish. It's been 10 minutes now, which seems very excessive for a table of 9000 rows on a 3 GHz desktop machine. 'top' says it's all spent in USER time, and there's a ~~500KB/s write rate going on. Just before this DELETE, I've deleted data from a larger table (50000 rows) using the same method and it finished in couple of seconds - maybe it's a PostgreSQL bug? My question is: assuming it's not a bug, how to optimize DELETEs? Increasing work_mem maybe? (I'm using PostgreSQL 8.1.4 on FreeBSD 6- amd64) (I know about TRUNCATE; I need those foreign key references to cascade)
> I've just fired off a "DELETE FROM table" command (i.e. an unfiltered
> DELETE) on a trivially small table that has many foreign key references
> (from similar-sized tables), and I'm waiting for it to finish. It's been
> 10 minutes now, which seems very excessive for a table of 9000 rows on a
> 3 GHz desktop machine.

If you have missing indexes on the child tables' foreign key columns, that could be the cause of the slow DELETE. The cascading delete must look up the rows to be deleted in all the child tables, and that means sequential scans if you don't have the proper indexes.

Try an EXPLAIN ANALYZE on deleting a single row; that should also show you the time spent in triggers, which should clue you in on what's taking so long.

Cheers,
Csaba.
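A minimal sketch of that check (the table name "parent" and the key column "id" are made up; on 8.1, EXPLAIN ANALYZE prints a "Trigger for constraint ..." line with the time spent in each FK trigger, and the ROLLBACK undoes the test delete):

    BEGIN;

    -- The plan itself will be trivial; the interesting part is the
    -- per-constraint trigger timing printed after it.
    EXPLAIN ANALYZE DELETE FROM parent WHERE id = 1;

    ROLLBACK;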
On Tue, 2006-09-19 at 15:22 +0200, Ivan Voras wrote:
> I've just fired off a "DELETE FROM table" command (i.e. an unfiltered
> DELETE) on a trivially small table that has many foreign key references
> (from similar-sized tables), and I'm waiting for it to finish. It's been
> 10 minutes now, which seems very excessive for a table of 9000 rows on a
> 3 GHz desktop machine.

I would guess that a few of those referencing tables are missing indexes on the referencing column.
You do not have indexes on all of the columns that are linked by foreign key constraints.

For example, let's say I had a "scientist" table with a single column "scientist_name", and another table "discovery" which had "scientist_name" as a column with a foreign key constraint pointing to the "scientist" table. If the system were to try to delete a row from the scientist table, it would need to scan the discovery table for any row which referenced that scientist_name.

If there is an index on the scientist_name column in the discovery table, this is a fast operation. In your case, however, there most likely isn't an index on that column, so it needs to do a full table scan of the discovery table for each row deleted from the scientist table. If the discovery table has 100,000 rows and there are 100 scientists, then deleting those 100 scientists would require scanning 100,000 * 100 = 10M records, so this sort of thing can quickly become a very expensive operation.

Because of this potential for truly atrocious update/delete behavior, some database systems (SQL Server at least, and IIRC Oracle as well) either automatically create the index on discovery.scientist_name when the foreign key constraint is created, or refuse to create the foreign key constraint if there isn't already an index. PG doesn't force you to have an index, which can be desirable for performance reasons in some situations if you know what you're doing, but it allows you to royally shoot yourself in the foot on deletes/updates to the parent table if you're not careful.

If you have a lot of constraints and want to track down which one is unindexed, an EXPLAIN ANALYZE of a delete of a single row from the parent table will tell you how long each of the referential integrity checks takes, so you can figure out which indexes are missing.

-- 
Mark Lewis
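A minimal sketch of that schema, with the index that makes the FK check cheap (the DDL is illustrative, using the names from the example above):

    CREATE TABLE scientist (
        scientist_name text PRIMARY KEY
    );

    CREATE TABLE discovery (
        discovery_name text,
        scientist_name text REFERENCES scientist (scientist_name)
                            ON DELETE CASCADE
    );

    -- The parent side is indexed automatically by the PRIMARY KEY, but
    -- the referencing column is not; without the index below, every
    -- delete on "scientist" seq-scans "discovery".
    CREATE INDEX discovery_scientist_name_idx ON discovery (scientist_name);

With the index in place, each cascaded check becomes an index lookup instead of a full scan of the child table.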
Rod Taylor wrote:
> On Tue, 2006-09-19 at 15:22 +0200, Ivan Voras wrote:
>> I've just fired off a "DELETE FROM table" command (i.e. an unfiltered
>> DELETE) on a trivially small table that has many foreign key references
>> (from similar-sized tables), and I'm waiting for it to finish. It's been
>> 10 minutes now, which seems very excessive for a table of 9000 rows on a
>> 3 GHz desktop machine.
>
> I would guess that a few of those referencing tables are missing indexes
> on the referencing column.

Yes, it was pilot error :( Among the small and properly indexed referencing tables there was a seldom-queried but huge log table with no index on its foreign key column.
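For anyone else chasing the same problem, a catalog query along these lines can flag foreign-key columns with no index at all (a rough sketch, not from the original thread: it only checks that the column appears somewhere in an index, not that it is the leading column, which is what the FK check really needs):

    SELECT c.conrelid::regclass AS child_table,
           a.attname            AS fk_column
    FROM   pg_constraint c
    JOIN   pg_attribute  a ON  a.attrelid = c.conrelid
                           AND a.attnum   = ANY (c.conkey)
    WHERE  c.contype = 'f'
      AND  NOT EXISTS (SELECT 1
                       FROM   pg_index i
                       WHERE  i.indrelid = c.conrelid
                         AND  a.attnum   = ANY (i.indkey::int2[]));

    -- The fix is then just the missing index (hypothetical names):
    -- CREATE INDEX big_log_parent_id_idx ON big_log (parent_id);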