Re: Updates are slow.. - Mailing list pgsql-general
From | Tom Burke |
---|---|
Subject | Re: Updates are slow.. |
Date | |
Msg-id | NDBBIIDKBFNMNPDNLCDDKEALDCAA.lists@spamex.com Whole thread Raw |
In response to | Re: Updates are slow.. (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Updates are slow..
Re: Updates are slow.. |
List | pgsql-general |
I think I spoke too soon, and it appears the problem is not solved. An index on the email column slows down an update on the dept column, for some unkown reason. > Do you have enough of a log of your tests > so far to be able to recreate the pre-reindex situation? Here are the stats on emp from before the reindexing: 2002-06-11 09:03:01 DEBUG: Index emp_ix01: Pages 18568; Tuples 1230703: Deleted 597153. CPU 2.75s/5.07u sec elapsed 77.46 sec. 2002-06-11 09:06:48 DEBUG: Index emp_ix02: Pages 27572; Tuples 1230703: Deleted 597153. CPU 5.73s/10.16u sec elapsed 227.15 sec. 2002-06-11 09:06:59 DEBUG: Removed 597153 tuples in 8227 pages. CPU 1.42s/1.51u sec elapsed 10.84 sec. 2002-06-11 09:06:59 DEBUG: Pages 24552: Changed 16249, Empty 0; Tup 1230703: Vac 597153, Keep 0, UnUsed 4084. Total CPU 14.64s/23.74u sec elapsed 399.48 sec. After doing the Reindex table emp, the stats were: 2002-06-11 15:44:37 NOTICE: --Relation emp-- 2002-06-11 15:45:15 NOTICE: Index emp_pk: Pages 4018; Tuples 1230703: Deleted 597153. CPU 0.87s/9.20u sec elapsed 15.27 sec. 2002-06-11 15:45:33 NOTICE: Index emp_ix01: Pages 5445; Tuples 1230703: Deleted 597153. CPU 0.96s/6.62u sec elapsed 17.83 sec. 2002-06-11 15:46:51 NOTICE: Index emp_ix02: Pages 13972; Tuples 1230703: Deleted 597153. CPU 2.73s/8.95u sec elapsed 78.56 sec. 2002-06-11 15:47:13 NOTICE: Removed 1194306 tuples in 16048 pages. CPU 3.28s/2.91u sec elapsed 21.83 sec. 2002-06-11 15:47:13 NOTICE: Pages 32576: Changed 16469, Empty 0; Tup 1230703: Vac 1194306, Keep 0, UnUsed 5501. Total CPU 11.32s/28.43u sec elapsed 155.81 sec. > It would be interesting to try making a third index on just email and > see where the update time goes to. Very interesting indeed. eppend=# create index emp_ix03 on emp(email); After creating this index, the update time jumped up again to 21 mins! real 21m3.536s user 0m0.010s sys 0m0.000s This makes no sense to me because the new index is not related to the column being updated at all. After this update, I did a vacuum : eppend=# vacuum verbose analyze clientdata; NOTICE: --Relation emp-- NOTICE: Index emp_pk: Pages 4012; Tuples 1230703: Deleted 597153. CPU 0.82s/9.40u sec elapsed 20.02 sec. NOTICE: Index emp_ix01: Pages 5463; Tuples 1230703: Deleted 597153. CPU 0.93s/6.77u sec elapsed 19.54 sec. NOTICE: Index emp_ix03: Pages 12602; Tuples 1230703: Deleted 597153. CPU 2.73s/8.51u sec elapsed 63.37 sec. NOTICE: Removed 1194306 tuples in 17145 pages. CPU 3.44s/2.98u sec elapsed 46.65 sec. NOTICE: Pages 40578: Changed 17679, Empty 0; Tup 1230703: Vac 1194306, Keep 0, UnUsed 606138. Total CPU 12.22s/28.50u sec elapsed 177.13 sec. NOTICE: Analyzing emp VACUUM I reran the update, and it still took 18 minutes. > Also, could I trouble you for some > stats on the email column? Min, max, and average length of the entries > would be useful to know. eppend=# select min(length(email)), max(length(email)), avg(length(email)) eppend-# from emp; min | max | avg -----+-----+--------------- 1 | 50 | 19.6686072919 > And I assume it's unique, or nearly so? Actually, it is not unique. This is not really an emp table, and in fact ~600K of the email addresses are duplicates. eppend=# select fk_dept_id, count(*) from emp group by fk_dept_id; fk_dept_id | count ------------+-------- 2 | 597152 3 | 36398 4 | 597153 That is, there are 3 departments right now, and departments 2 and 4 have the exact same row count and have exactly the same email addresses. Could that be a factor? I truly have no idea why Postgres would behave this way. Help! Thanks, Tom Burke Eppend, Inc. http://www.eppend.com/
pgsql-general by date: