Re: Optimizing DELETE - Mailing list pgsql-performance
From | Mark Lewis |
---|---|
Subject | Re: Optimizing DELETE |
Date | |
Msg-id | 1158675445.9657.1587.camel@archimedes Whole thread Raw |
In response to | Optimizing DELETE (Ivan Voras <ivoras@fer.hr>) |
List | pgsql-performance |
You do not have indexes on all of the columns which are linked by foreign key constraints. For example, let's say that I had a "scientist" table with a single column "scientist_name" and another table "discovery" which had "scientist_name" as a column with a foreign key constraint to the "scientist" table. If the system were to try to delete a row from the scientist table, then it would need to scan the discovery table for any row which referenced that scientist_name. If there is an index on the scientist_name column in the discovery table, this is a fast operation. In your case however, there most likely isn't an index on that column, so it needs to do a full table scan of the discovery table for each row deleted from the scientist table. If the discovery table has 100,000 rows, and there are 100 scientists, then deleting those 100 scientists would require scanning 100,000 * 100 = 10M records, so this sort of thing can quickly become a very expensive operation. Because of this potential for truly atrocious update/delete behavior, some database systems (SQL Server at least, and IIRC Oracle as well) either automatically create the index on discovery.scientist_name when the foreign key constraint is created, or refuse to create the foreign key constraint if there isn't already an index. PG doesn't force you to have an index, which can be desirable for performance reasons in some situations if you know what you're doing, but allows you to royally shoot yourself in the foot on deletes/updates to the parent table if you're not careful. If you have a lot of constraints and want to track down which one is unindexed, then doing an EXPLAIN ANALYZE of deleting a single row from the parent table will tell you how long each of the referential integrity checks takes, so you can figure out which indexes are missing. -- Mark Lewis On Tue, 2006-09-19 at 15:22 +0200, Ivan Voras wrote: > I've just fired off a "DELETE FROM table" command (i.e. unfiltered > DELETE) on a trivially small table but with many foreign key references > (on similar-sized tables), and I'm waiting for it to finish. It's been > 10 minutes now, which seems very excessive for a table of 9000 rows on a > 3 GHz desktop machine. > > 'top' says it's all spent in USER time, and there's a ~~500KB/s write > rate going on. Just before this DELETE, I've deleted data from a larger > table (50000 rows) using the same method and it finished in couple of > seconds - maybe it's a PostgreSQL bug? > > My question is: assuming it's not a bug, how to optimize DELETEs? > Increasing work_mem maybe? > > (I'm using PostgreSQL 8.1.4 on FreeBSD 6- amd64) > > (I know about TRUNCATE; I need those foreign key references to cascade) > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend
pgsql-performance by date: