Re: atrocious update performance - Mailing list pgsql-performance

From: Rosser Schwarz
Subject: Re: atrocious update performance
Date:
Msg-id: 004301c411be$74004650$2500fa0a@CardServices.TCI.com
In response to: Re: atrocious update performance (Greg Spiegelberg <gspiegelberg@cranel.com>)
Responses: Re: atrocious update performance
List: pgsql-performance
Greg Spiegelberg wrote:

> > Will advise.

After creating 100, 1K, 10K, 100K and 1M-row subsets of account.cust and
the corresponding rows/tables with foreign key constraints referring to the
table, I'm unable to reproduce the behavior at issue. explain analyze looks
like the following, showing the query run with the join column indexed and
not, respectively:

# explain analyze update test.cust100 set prodid = tempprod.prodid,
    subprodid = tempprod.subprodid where origid = tempprod.debtid;

-- with index
                               QUERY PLAN
-----------------------------------------------------------------------
 Merge Join  (cost=0.00..25.64 rows=500 width=220)
             (actual time=0.241..13.091 rows=100 loops=1)
   Merge Cond: (("outer".origid)::text = ("inner".debtid)::text)
   ->  Index Scan using ix_origid_cust100 on cust100
         (cost=0.00..11.50 rows=500 width=204)
         (actual time=0.125..6.465 rows=100 loops=1)
   ->  Index Scan using ix_debtid on tempprod
         (cost=0.00..66916.71 rows=4731410 width=26)
         (actual time=0.057..1.497 rows=101 loops=1)
 Total runtime: 34.067 ms
(5 rows)

-- without index
                               QUERY PLAN
----------------------------------------------------------------------
 Merge Join  (cost=7.32..16.71 rows=100 width=220)
             (actual time=4.415..10.918 rows=100 loops=1)
   Merge Cond: (("outer".debtid)::text = "inner"."?column22?")
   ->  Index Scan using ix_debtid on tempprod
         (cost=0.00..66916.71 rows=4731410 width=26)
         (actual time=0.051..1.291 rows=101 loops=1)
   ->  Sort  (cost=7.32..7.57 rows=100 width=204)
             (actual time=4.311..4.450 rows=100 loops=1)
         Sort Key: (cust100.origid)::text
         ->  Seq Scan on cust100
               (cost=0.00..4.00 rows=100 width=204)
               (actual time=0.235..2.615 rows=100 loops=1)
 Total runtime: 25.031 ms
(7 rows)

With the join column indexed, it takes roughly .32 ms/row on the first four
tests (100 through 100K), and about .48 ms/row on 1M rows. Without the
index, it runs 100 rows @ .25 ms/row, 1000 @ .26, 10000 @ .27, 100000 @ .48
and .5 at 1M rows. In no case does the query plan reflect foreign key
validation.
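For reference, a sketch of how one of those subset tables might be built and indexed before running the test; the table and index names follow the post, but the exact DDL (in particular the use of LIMIT to take a subset) is my assumption:

```sql
-- Hypothetical reconstruction of the 100-row test setup.
-- Names (test.cust100, ix_origid_cust100) are taken from the plans above;
-- the subset-selection method is assumed.
CREATE TABLE test.cust100 AS
    SELECT * FROM account.cust LIMIT 100;

CREATE INDEX ix_origid_cust100 ON test.cust100 (origid);
ANALYZE test.cust100;

EXPLAIN ANALYZE
UPDATE test.cust100
   SET prodid = tempprod.prodid,
       subprodid = tempprod.subprodid
  FROM tempprod
 WHERE test.cust100.origid = tempprod.debtid;
```

(The UPDATE in the post omits the FROM clause; 7.x accepts that form by implicitly adding the missing range table entry, but spelling it out is clearer.)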
Failing any other suggestions for diagnosis soon, I'm going to nuke the
PostgreSQL install, scour it from the machine and start from scratch.
Failing that, I'm going to come in some weekend and re-do the machine.

> Problem is when I recreate the indexes and add the constraints back
> on ORIG I end up with the same long running process. The original
> UPDATE runs for about 30 minutes on a table of 400,000 with the
> WHERE matching about 70% of the rows. The above runs for about 2
> minutes without adding the constraints or indexes however adding the
> constraints and creating the dropped indexes negates any gain.

Is this a frequently-run update? In my experience, with my seemingly mutant
install, dropping indices and constraints to shave 14/15 off the update
time would be worth the effort. Just script the dropping, updating and
recreating into one large transaction. It's a symptom-level fix, but
re-creating the fifteen indices on one of our 5M-row tables doesn't take 28
minutes, and your hardware looks to be rather less IO- and CPU-bound than
ours.

I'd also second Tom's suggestion of moving to 7.4.

/rls
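The drop/update/recreate pattern described above might be scripted roughly as follows; the index, constraint, and column names here are hypothetical stand-ins, since the post doesn't list the actual ones on ORIG:

```sql
-- Hedged sketch of the one-transaction approach; all object names
-- (ix_orig_origid, orig_prodid_fkey, the prod table) are assumed.
BEGIN;

ALTER TABLE orig DROP CONSTRAINT orig_prodid_fkey;
DROP INDEX ix_orig_origid;

UPDATE orig
   SET prodid = tempprod.prodid,
       subprodid = tempprod.subprodid
  FROM tempprod
 WHERE orig.origid = tempprod.debtid;

CREATE INDEX ix_orig_origid ON orig (origid);
ALTER TABLE orig ADD CONSTRAINT orig_prodid_fkey
    FOREIGN KEY (prodid) REFERENCES prod (prodid);

COMMIT;
```

Doing it all in one transaction means the table is never visible to other sessions without its constraints, and a failure anywhere rolls the whole thing back, indexes included.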