Thread: Updates are slow..
Hello,

We've been using PostgreSQL for a few months and it has been running much slower than anticipated. We've tried playing with configuration options with limited results, and I'm hoping someone may have some tips.

For the purposes of our performance test, we created a PostgreSQL database with only a single table of about 1.2 million records. Updating an indexed column for 600K records on that table is unusually slow. It takes anywhere from forty minutes to over an hour.

To describe the situation, I'm going to call our table EMP (employee). Our EMP has 1.2MM records, and 600K of those records have a Dept_ID of 4. Our EMP has 20 columns, and there are two indexes on EMP:
  1) (EMP_ID)
  2) (DEPT_ID, EMP_STATUS)

The update statement that takes so long is:

    UPDATE Emp
    SET Dept_ID = 5
    WHERE Dept_ID = 4;

Thus it is reading from the index and also having to update it at the same time. As I mentioned, 600K records are affected by this update. An out-of-the-box configuration of PostgreSQL takes over an hour to do this.

I've tried changing the default settings as follows:

    shared_buffers = 8192
    sort_mem = 15360
    fsync = false
    checkpoint_segments = 15
    wal_files = 10
    wal_sync_method = fdatasync

With these settings, it still takes 39 minutes. Here are the executor statistics from the last test we ran yesterday:

    2002-06-10 13:15:18 DEBUG: EXECUTOR STATISTICS
    ! system usage stats:
    !   2315.929107 elapsed 142.990000 user 87.310000 system sec
    !   [163.390000 user 93.610000 sys total]
    !   0/0 [0/0] filesystem blocks in/out
    !   8/14 [16609/5363] page faults/reclaims, 0 [0] swaps
    !   0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
    !   0/0 [0/0] voluntary/involuntary context switches
    ! postgres usage stats:
    !   Shared blocks: 552060 read, 476736 written, buffer hit rate = 95.33%
    !   Local blocks:       0 read,      0 written, buffer hit rate =  0.00%
    !   Direct blocks:      0 read,      0 written

For the basis of comparison, we downloaded Oracle, created the same table, and loaded the same data. With Oracle, the same update took at most 5 minutes!

Our machine is a PIII 850 with 768MB RAM, an ATA100 40GB drive, and RH Linux 7.2. Postgres is v7.2. Every time we run the test on Postgres, we run a VACUUM ANALYZE immediately before. Foreign key constraints are not enabled.

I would not expect the performance difference between Oracle and Postgres to be so large. While the example we chose for our test may not happen that often (updating an indexed column), various other queries we run on our full system have not performed to our expectations.

I'm hoping we're overlooking something basic. If anyone has any suggestions, I would greatly appreciate it.

Thanks,

Tom Burke
Eppend, Inc.
http://www.eppend.com
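For reference, the schema described above corresponds roughly to the following DDL (a sketch only; the column types and index names are assumptions, since the post doesn't show them):

    CREATE TABLE emp (
        emp_id     integer,     -- plus roughly 17 more columns
        dept_id    integer,
        emp_status smallint
    );
    CREATE INDEX emp_idx1 ON emp (emp_id);
    CREATE INDEX emp_idx2 ON emp (dept_id, emp_status);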
Send the output of EXPLAIN and EXPLAIN ANALYZE for the query; this may help determine what is happening during it.

Darren Ferguson

On Tue, 11 Jun 2002, Tom Burke wrote:
> For the purposes of our performance test, we created a PostgreSQL
> database with only a single table of about 1.2 million records. Updating
> an indexed column for 600K records on that table is unusually slow. It
> takes anywhere from forty minutes to over an hour.
> [...]
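Concretely, that would be something like the following (note that EXPLAIN ANALYZE actually executes the statement, so on this table it will take the full update time; wrap it in a transaction and roll back if the changes shouldn't stick):

    BEGIN;
    EXPLAIN ANALYZE
        UPDATE emp SET dept_id = 5 WHERE dept_id = 4;
    ROLLBACK;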
"Tom Burke" <lists@spamex.com> writes: > For the purposes of our performance test, we created a PostgreSQL > database with only a single table of about 1.2 million records. Updating > an indexed column for 600K records on that table is unusually slow. It > takes anywhere from forty minutes to over an hour. That seems way way off. I tried to replicate your situation on an RH 7.2 box, using PG 7.1 (I don't have 7.2 installed there at the moment, and current devel sources might be an unfair comparison). I got a time more like forty seconds: test71=# create table emp (emp_id serial, dept_id int, emp_status int); test71=# create index emp2 on emp(DEPT_ID, EMP_STATUS); [ fill with random data ] [tgl@rh1 tgl]$ time psql test71 -c "UPDATE Emp SET Dept_ID = 5 WHERE Dept_ID = 4;" UPDATE 593216 real 0m41.290s user 0m0.000s sys 0m0.000s [tgl@rh1 tgl]$ Now this machine has a faster CPU than yours (1.8GHz I think), but probably not any faster disk. At best it could be twice as fast as yours. I don't have as many columns either, so the I/O volume is probably a good bit more in your case. But even allowing for that, it's hard to extrapolate to an hour. What are the datatypes of your columns, exactly? What's the average tuple size (actually, showing the VACUUM VERBOSE stats for the table would be the most useful answer)? Are you *sure* there are no foreign keys either from or to this table? Also, what plan is shown by EXPLAIN for the query? (7.2 should surely not be dumb enough to pick an indexscan plan, but if it did that would explain a lot ...) regards, tom lane
It would also be a good idea if you could send us the output of your system "ipcs" command so we can get an idea of the amount of shared memory you are using.

- Ericson Smith
eric@did-it.com
http://www.did-it.com

On Tue, 2002-06-11 at 15:22, Tom Lane wrote:
> "Tom Burke" <lists@spamex.com> writes:
> > For the purposes of our performance test, we created a PostgreSQL
> > database with only a single table of about 1.2 million records.
> > [...]
>
> That seems way way off. I tried to replicate your situation on an RH
> 7.2 box, using PG 7.1 (I don't have 7.2 installed there at the moment,
> and current devel sources might be an unfair comparison). I got a time
> more like forty seconds:
> [...]
Well, interesting developments - mostly our error. I'm going to overkill on information just to be thorough (and because I already collected it :).

Fran Fabrizio had an extremely helpful suggestion:
> No idea if this would help but try REINDEX TABLE EMP; beforehand too.
> VACUUM ANALYZE does not free space from indexes; only tables. I just
> discovered this last month after a year of using Pg.

After trying this, the time was cut from 40 minutes to 20 minutes, which is a huge improvement but still quite slow.

Tom, many thanks for testing it out. Here are the answers to your questions.

> What are the datatypes of your columns, exactly?

Because we've solved the problem, I'll just cut it down to the columns we have data in:

          Column      |           Type           | Modifiers
    ------------------+--------------------------+-----------
     emp_id           | integer                  | not null
     email            | character varying(255)   |
     emp_code         | character varying(50)    |
     created_dt       | timestamp with time zone | not null
     fk_emp_status_id | smallint                 | not null
     fk_dept_id       | integer                  | not null
     fk_role_id       | integer                  |

We actually have 10 more columns, but I no longer think they're relevant.

> What's the average tuple size (actually, showing the VACUUM VERBOSE
> stats for the table would be the most useful answer)?

    eppend=# vacuum verbose clientdata;
    NOTICE: --Relation clientdata--
    NOTICE: Index emp_pk: Pages 4018; Tuples 1230703: Deleted 597153. CPU 0.87s/9.20u sec elapsed 15.27 sec.
    NOTICE: Index emp_ix01: Pages 5445; Tuples 1230703: Deleted 597153. CPU 0.96s/6.62u sec elapsed 17.83 sec.
    NOTICE: Index emp_ix02: Pages 13972; Tuples 1230703: Deleted 597153. CPU 2.73s/8.95u sec elapsed 78.56 sec.
    NOTICE: Removed 1194306 tuples in 16048 pages. CPU 3.28s/2.91u sec elapsed 21.83 sec.
    NOTICE: Pages 32576: Changed 16469, Empty 0; Tup 1230703: Vac 1194306, Keep 0, UnUsed 5501. Total CPU 11.32s/28.43u sec elapsed 155.81 sec.
    VACUUM

The vacuum made me realize that there was an index I had forgotten about. emp_pk is (emp_id) and emp_ix01 is (fk_dept_id, fk_emp_status_id). More on emp_ix02 in a second; it is the real cause of the problem.

> Are you *sure* there are no foreign keys either from or to this table?

Yes; in fact, we've dropped all other tables for simplicity of the test. This is the only table in the database.

> Also, what plan is shown by EXPLAIN for the query?

This explain analyse I ran took some 20+ minutes itself:

    eppend=# explain analyse
    eppend-# update emp
    eppend-# set fk_dept_id = 5
    eppend-# where fk_dept_id = 4;
    NOTICE:  QUERY PLAN:

    Seq Scan on clientdata  (cost=0.00..47959.79 rows=915643 width=522)
        (actual time=1764.06..362296.23 rows=597153 loops=1)
    Total runtime: 1441799.02 msec

    EXPLAIN

> It would also be a good idea if you could send us the output of your
> system "ipcs" command so we can get an idea of the amount of shared
> memory you are using.
>
> - Ericson Smith

    bash-2.05$ ipcs

    ------ Shared Memory Segments --------
    key        shmid    owner     perms  bytes     nattch  status
    0x0052e2c1 0        postgres  600    68878336  4
    0x00000000 163841   root      777    196608    2       dest
    0x00000000 851970   root      777    196608    2       dest
    0x00000000 1900547  root      777    196608    2       dest
    0x00000000 2031620  root      777    196608    2       dest

    ------ Semaphore Arrays --------
    key        semid    owner     perms  nsems  status
    0x0052e2c1 0        postgres  600    17
    0x0052e2c2 32769    postgres  600    17
    0x0052e2c3 65538    postgres  600    17

    ------ Message Queues --------
    key        msqid    owner     perms  used-bytes  messages

However, the real kicker is that I looked up what the emp_ix02 index was, and it was (email, fk_dept_id).
Therefore, there were actually two indexes on this column that had to be updated, and this index had a lot more pages. When I dropped this index and re-ran the update, it took only 2 minutes!

So with the combination of the REINDEX and dropping a troublesome index (we'll recreate the email index without dept_id), we're down to a good time. I'm hoping REINDEX will help with some of the other slow queries we've seen.

Thanks again for everyone's help!

Tom Burke
Eppend, Inc.
http://www.eppend.com/
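(A quick cross-check of the ipcs output above; the arithmetic is mine, not from the thread: shared_buffers = 8192 pages x 8 kB/page = 67,108,864 bytes, which matches the 68,878,336-byte postgres segment once per-buffer bookkeeping overhead is added, so the shared_buffers setting really did take effect.)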
"Tom Burke" <lists@spamex.com> writes: > Fran Fabrizio had an extremely helpful suggestion: >> No idea if this would help but try REINDEX TABLE EMP; beforehand too. > After trying this, the time was cut from 40 minutes to 20 minutes, > which is a huge improvement but still quite slow. > However, the real kicker is that I looked up what the emp_ix02 index > was and it was (email, fk_dept_id). Therefore, there were actually > two indexes on this column that had to be updated, and this index > had a lot more pages. When I dropped this index and re-ran the updated > - it took only 2 minutes! Very interesting. The emp_ix02 index was not all that much bigger than the others, according to vacuum's numbers, so I don't see why it should take so long to update. Maybe we've got some kind of performance problem with multicolumn indexes? I'm wondering if the initial 20-minute savings from reindex was mostly attributable to emp_ix02 as well. Too bad we don't know how large the indexes were before reindex. Do you have enough of a log of your tests so far to be able to recreate the pre-reindex situation? It would be interesting to try making a third index on just email and see where the update time goes to. Also, could I trouble you for some stats on the email column? Min, max, and average length of the entries would be useful to know. And I assume it's unique, or nearly so? regards, tom lane
I think I spoke too soon, and it appears the problem is not solved. An index on the email column slows down an update on the dept column, for some unknown reason.

> Do you have enough of a log of your tests
> so far to be able to recreate the pre-reindex situation?

Here are the stats on emp from before the reindexing:

    2002-06-11 09:03:01 DEBUG: Index emp_ix01: Pages 18568; Tuples 1230703: Deleted 597153. CPU 2.75s/5.07u sec elapsed 77.46 sec.
    2002-06-11 09:06:48 DEBUG: Index emp_ix02: Pages 27572; Tuples 1230703: Deleted 597153. CPU 5.73s/10.16u sec elapsed 227.15 sec.
    2002-06-11 09:06:59 DEBUG: Removed 597153 tuples in 8227 pages. CPU 1.42s/1.51u sec elapsed 10.84 sec.
    2002-06-11 09:06:59 DEBUG: Pages 24552: Changed 16249, Empty 0; Tup 1230703: Vac 597153, Keep 0, UnUsed 4084. Total CPU 14.64s/23.74u sec elapsed 399.48 sec.

After doing the REINDEX TABLE emp, the stats were:

    2002-06-11 15:44:37 NOTICE: --Relation emp--
    2002-06-11 15:45:15 NOTICE: Index emp_pk: Pages 4018; Tuples 1230703: Deleted 597153. CPU 0.87s/9.20u sec elapsed 15.27 sec.
    2002-06-11 15:45:33 NOTICE: Index emp_ix01: Pages 5445; Tuples 1230703: Deleted 597153. CPU 0.96s/6.62u sec elapsed 17.83 sec.
    2002-06-11 15:46:51 NOTICE: Index emp_ix02: Pages 13972; Tuples 1230703: Deleted 597153. CPU 2.73s/8.95u sec elapsed 78.56 sec.
    2002-06-11 15:47:13 NOTICE: Removed 1194306 tuples in 16048 pages. CPU 3.28s/2.91u sec elapsed 21.83 sec.
    2002-06-11 15:47:13 NOTICE: Pages 32576: Changed 16469, Empty 0; Tup 1230703: Vac 1194306, Keep 0, UnUsed 5501. Total CPU 11.32s/28.43u sec elapsed 155.81 sec.

> It would be interesting to try making a third index on just email and
> see where the update time goes to.

Very interesting indeed.

    eppend=# create index emp_ix03 on emp(email);

After creating this index, the update time jumped up again to 21 minutes!

    real    21m3.536s
    user    0m0.010s
    sys     0m0.000s

This makes no sense to me because the new index is not related to the column being updated at all. After this update, I did a vacuum:

    eppend=# vacuum verbose analyze clientdata;
    NOTICE: --Relation emp--
    NOTICE: Index emp_pk: Pages 4012; Tuples 1230703: Deleted 597153. CPU 0.82s/9.40u sec elapsed 20.02 sec.
    NOTICE: Index emp_ix01: Pages 5463; Tuples 1230703: Deleted 597153. CPU 0.93s/6.77u sec elapsed 19.54 sec.
    NOTICE: Index emp_ix03: Pages 12602; Tuples 1230703: Deleted 597153. CPU 2.73s/8.51u sec elapsed 63.37 sec.
    NOTICE: Removed 1194306 tuples in 17145 pages. CPU 3.44s/2.98u sec elapsed 46.65 sec.
    NOTICE: Pages 40578: Changed 17679, Empty 0; Tup 1230703: Vac 1194306, Keep 0, UnUsed 606138. Total CPU 12.22s/28.50u sec elapsed 177.13 sec.
    NOTICE: Analyzing emp
    VACUUM

I reran the update, and it still took 18 minutes.

> Also, could I trouble you for some
> stats on the email column? Min, max, and average length of the entries
> would be useful to know.

    eppend=# select min(length(email)), max(length(email)), avg(length(email))
    eppend-# from emp;
     min | max |      avg
    -----+-----+---------------
       1 |  50 | 19.6686072919

> And I assume it's unique, or nearly so?

Actually, it is not unique. This is not really an emp table, and in fact ~600K of the email addresses are duplicates.

    eppend=# select fk_dept_id, count(*) from emp group by fk_dept_id;
     fk_dept_id | count
    ------------+--------
              2 | 597152
              3 |  36398
              4 | 597153

That is, there are 3 departments right now, and departments 2 and 4 have the exact same row count and exactly the same email addresses. Could that be a factor?

I truly have no idea why Postgres would behave this way. Help!

Thanks,

Tom Burke
Eppend, Inc.
http://www.eppend.com/
"Tom Burke" <lists@spamex.com> writes: > After creating this index, the update time jumped up again to 21 mins! > real 21m3.536s > user 0m0.010s > sys 0m0.000s > This makes no sense to me because the new index is not related to > the column being updated at all. Doesn't matter: we're entering a new tuple (new version of the row) so new entries must be made in all indexes for the table. > Actually, it is not unique. > This is not really an emp table, and in fact ~600K of the email > addresses are duplicates. Ah so. The btree code doesn't really like huge numbers of duplicate keys --- it still works, but the algorithms degenerate to O(N) instead of O(log N). I imagine the previous two-column incarnation with (email, dept_id) was also quite non-unique? So the bottom line is not to bother with making indexes on highly redundant columns; they aren't fast and they don't do you any good anyway. If the situation is that you've got, say, 600K empty email addresses and another 600K that are actually useful, you might consider making a partial index that excludes the empty addresses; then that could be used to look up real addresses without paying the overhead to index the empty ones. See http://www.postgresql.org/users-lounge/docs/7.2/postgres/indexes-partial.html regards, tom lane
On Wed, 12 Jun 2002 10:04:54 -0400, "Tom Burke" <lists@spamex.com> wrote:
> An index on the email column slows down an update on the dept column,
> for some unknown reason.
> [...]
> eppend=# create index emp_ix03 on emp(email);
>
> After creating this index, the update time jumped up again to 21 mins!
> [...]
> This makes no sense to me because the new index is not related to
> the column being updated at all.

You seem to think that an UPDATE of one column should not affect an index on another column. However, with PostgreSQL's MVCC (Multi-Version Concurrency Control), an UPDATE behaves like a DELETE followed by an INSERT, so to speak.

> This is not really an emp table, and in fact ~600K of the email
> addresses are duplicates.

Ouch! Are these real-world data?

Could you repeat your tests after making email unique? First VACUUM ANALYZE; then please time this statement:

    UPDATE emp SET email = email || oid;

Then VACUUM ANALYZE and repeat your test. I'm curious what you get.

Servus
Manfred
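Manfred's point can be seen directly with the ctid system column (an illustrative sketch, not from the thread): even a no-op UPDATE creates a new physical tuple, so every index on the table gets a new entry.

    SELECT ctid FROM emp WHERE emp_id = 1;    -- e.g. (0,1)
    UPDATE emp SET fk_dept_id = fk_dept_id
        WHERE emp_id = 1;                     -- rewrites the whole row
    SELECT ctid FROM emp WHERE emp_id = 1;    -- now a different location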
If I make the index unique on the email column, the update slows down by an additional 5 minutes, to a total of 25 minutes.

I realize I did not understand how PostgreSQL handles updates, but the performance implications are huge. It is not unreasonable to have 3 indexes on a table with a number of columns. With the way that PostgreSQL handles an update, any update to that table will affect the entire data row and all of the indexes, which seems invariably slow.

At Manfred's suggestion, I made the email column unique:

    $ time psql -c "update emp set email = email||oid;"
    UPDATE 1230703

    real    4990m31.151s
    user    0m0.010s
    sys     0m0.000s

There was no index on the email column when I did that update. I then created an index emp_ix03 on emp (email). I then did the fk_dept_id update, which took 25 minutes 52 seconds. A vacuum afterwards shows:

    NOTICE: Index emp_ix03: Pages 14689; Tuples 1230703: Deleted 597153. CPU 2.58s/8.88u sec elapsed 79.83 sec.

Having unique data in the indexed column does benefit this process. It really seems that the mere fact of another index with keys in it slows down the update. Yet I can't imagine I'm the first person to experience this. Thoughts?

Thanks,
Tom
I didn't receive any response to my last post, but I've got a little more info and have two questions.

I dropped the index on email. I then populated a smallint field with random values from 0 to 2000 and created an index on that field. Now an update on fk_dept_id takes 7 minutes, as opposed to 2 minutes without any additional indexes and 25 minutes with an index on email. The index on the smallint has these stats:

    NOTICE: Index emp_ix04: Pages 4959; Tuples 1230703: Deleted 597153. CPU 1.05s/9.84u sec elapsed 29.49 sec.

That is 1/3 of the pages that the index on email had.

My two questions:
1) Does the 25-minute update of 600K rows with an index on the email column still seem out of line, or is there not much we can do?
2) Do people think it's a good idea to get a PostgreSQL performance tuning consultant to review our setup? If so, any recommendations for who could help?

Many thanks in advance!

Tom Burke
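The random-smallint test could be reproduced along these lines (a sketch with an assumed column name; the thread doesn't show the actual statements):

    ALTER TABLE emp ADD COLUMN rnd smallint;
    UPDATE emp SET rnd = (random() * 2000)::smallint;   -- values 0..2000
    CREATE INDEX emp_ix04 ON emp (rnd);
    VACUUM ANALYZE emp;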
It might help to see the EXPLAIN output of the update query, with and without the indexes in question...

Wesley Sheldahl

"Tom Burke" <lists%spamex.com@interlock.lexmark.com> wrote on 06/13/2002 11:47:12 AM:
> I dropped the index on email. I then populated a smallint field with
> random values from 0 to 2000 and created an index on that field. Now
> an update on fk_dept_id takes 7 minutes, as opposed to 2 minutes
> without any additional indexes and 25 minutes with an index on email.
> [...]
On Thu, 13 Jun 2002, Tom Burke wrote:
> I didn't receive any response to my last post, but I've got a little
> more info and have two questions.
>
> I dropped the index on email.
> I then populated a smallint field with random values from 0 to 2000
> and created an index on that field.

Have you tried other indexing methods (rtree, hash, gist) to see if any of them work better for your data set? I'd try that first.
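Of those, the only access method that applies to a plain text column out of the box is hash (rtree is for geometric types, and gist needs a suitable operator class); a sketch:

    -- hash indexes handle only equality lookups, and in 7.2 they are
    -- not WAL-logged, so this is strictly an experiment
    CREATE INDEX emp_email_hash ON emp USING hash (email);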
"Tom Burke" <lists@spamex.com> writes: > 1) Does the 25 minute update of 600K rows with an index on the > email column still seem out of line or is there not much we can do? Still seems out of line to me, but I'm unsure where the problem lies. First off, is the system CPU-bound or I/O bound while this update is running? regards, tom lane