Thread: slow table updates
I'm trying to update a table but it's taking a very long time. I would appreciate any tips folks may have about ways to speed it up.
The table is paprospect2, as below:
The columns pfeature_id..confidence and run_id_new are from an inherited table. Although the inheritance itself is probably not relevant here (correction welcome), I suspect it may be relevant that all existing rows were written before the table definition included run_id_new. p2thread_i_trigger is defined to fire on insert only (not update).

\d paprospect2
   Column    |  Type   |                             Modifiers
-------------+---------+-------------------------------------------------------------------
 pfeature_id | integer | not null default nextval('unison.pfeature_pfeature_id_seq'::text)
 pseq_id     | integer | not null
 pftype_id   | integer | not null
 start       | integer |
 stop        | integer |
 confidence  | real    |
 run_id      | integer | not null
 [snip 13 integer and real columns]
 run_id_new  | integer |
Indexes: paprospect2_redundant_alignment unique btree (pseq_id, "start", stop, run_id, pmodel_id),
         p2thread_p2params_id btree (run_id),
         p2thread_pmodel_id btree (pmodel_id)
Foreign Key constraints:
         pftype_id_exists FOREIGN KEY (pftype_id) REFERENCES pftype(pftype_id) ON UPDATE CASCADE ON DELETE CASCADE,
         p2thread_pmodel_id_exists FOREIGN KEY (pmodel_id) REFERENCES pmprospect2(pmodel_id) ON UPDATE CASCADE ON DELETE CASCADE,
         pseq_id_exists FOREIGN KEY (pseq_id) REFERENCES pseq(pseq_id) ON UPDATE CASCADE ON DELETE CASCADE
Triggers: p2thread_i_trigger
paprospect2 contains ~40M rows. The goal now is to migrate the data to the supertable-inherited column with
update paprospect2 set run_id_new=run_id;
The update's been running for 5 hours (unloaded dual 2.4 GHz Xeon w/2GB RAM, SCSI160 10K drive). There are no other jobs running. Load is ~1.2 and the update's using ~3-5% of the CPU.
This suggests that the update is I/O bound (duh), and vmstat supports this:

$ ps -ostime,time,pcpu,cmd 28701
STIME     TIME %CPU CMD
12:18 00:07:19  2.3 postgres: admin csb 128.137.116.213 UPDATE
$ vmstat 1
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff   cache  si  so    bi    bo   in    cs  us  sy  id
 0  1  0      0  11288  94632 3558960   0   0    14     6   12    21   1   0   6
 0  1  0      0  12044  94632 3558956   0   0     0   972  332    16   0   1  99
 0  1  0      0  11092  94632 3558932   0   0    16  4420  309    25   0   2  97
 0  1  0      0  11456  94636 3558928   0   0     0   980  326    23   0   1  99
 1  0  0      0  12340  94636 3558924   0   0    16   532  329    14   0   0 100
 0  1  0      0  12300  94636 3558916   0   0     0  1376  324    16   1   0  99
 0  1  0      0  12252  94636 3558904   0   0    16  1888  325    18   0   0  99
 0  1  0      0  11452  94636 3558888   0   0    16  2864  324    23   1   1  98
 0  1  0      0  12172  94636 3558884   0   0     0   940  320    12   0   1  99
 0  1  0      0  12180  94636 3558872   0   0    16  1840  318    22   0   1  99
 0  1  0      0  11588  94636 3558856   0   0     0  2752  312    16   1   2  97
Presumably the large number of blocks written (bo) versus blocks read (bi) reflects an enormous amount of bookkeeping that has to be done for MVCC, logging, perhaps rewriting a row for the new definition (a guess -- I don't know how this is handled), indices, etc. There's no swapping and no processes are waiting. In short, it seems that this is ENTIRELY an I/O issue. Obviously, faster drives will help (but probably only by a small factor).
Any ideas how I might speed this up? Presumably this is all getting wrapped in a transaction -- does that hurt me for such a large update?
Thanks,
Reece
Bonus diversionary topic: In case it's not obvious, the motivation for this is that the subtable (paprospect2) contains a column (run_id) whose definition I would like to migrate to the inherited table (i.e., the 'super-table'). Although postgresql permits adding a column to a supertable with the same name as an extant column in a subtable, it appears that such "merged definition" columns do not have the same properties as a typical inherited column. In particular, dropping the column from the supertable does not drop it from the subtable (but renaming it does change both names). Hmm.
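[For what it's worth, one commonly suggested alternative to a single 40M-row transaction is to do the migration in slices, so that each transaction's MVCC bookkeeping stays bounded. A hedged sketch against the table above -- the pfeature_id range bounds and batch size are illustrative, not from the thread:

```sql
-- Hypothetical batched form of the migration. Each statement commits as
-- its own transaction; the IS NULL guard makes the job restartable if a
-- batch is interrupted.
UPDATE paprospect2
   SET run_id_new = run_id
 WHERE pfeature_id >= 0 AND pfeature_id < 1000000
   AND run_id_new IS NULL;

-- ...then repeat with the next pfeature_id range, vacuuming between
-- batches so dead row versions can be reused:
VACUUM paprospect2;
```

Whether this beats one big UPDATE depends on how much of the cost is per-row versus per-transaction, but it at least keeps any single rollback cheap.]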
-- Reece Hart, Ph.D. rkh@gene.com, http://www.gene.com/ Genentech, Inc. 650/225-6133 (voice), -5389 (fax) Bioinformatics and Protein Engineering 1 DNA Way, MS-93 http://www.in-machina.com/~reece/ South San Francisco, CA 94080-4990 reece@in-machina.com, GPG: 0x25EC91A0 |
On Wednesday 23 July 2003 01:40, Reece Hart wrote:
> I'm trying to update a table but it's taking a very long time. I would
> appreciate any tips folks may have about ways to speed it up.
[snip]
> Any ideas how I might speed this up? Presumably this is all getting
> wrapped in a transaction -- does that hurt me for such a large update?

Well, it needs to keep enough bookkeeping to be able to roll back the whole transaction if it encounters a problem -- 40M rows' worth in your case.

Looks like you're right and it's an I/O issue. I must admit I'm a bit puzzled that your CPU usage is quite so low, but I suppose you've got two fast CPUs, so it shouldn't be high.

[note the following is more speculation than experience]

What might be happening is that the drive is spending all its time seeking between the WAL, index, and table as it updates. I would also tend to be suspicious of the foreign keys -- PG might be re-checking these, and obviously that would take time too.

What you might want to try in future:
1. begin transaction
2. drop indexes and foreign keys
3. update table
4. vacuum it
5. recreate indexes, foreign keys, etc.
6. commit

That's just moving the index-updating/FK work to the end of the task, but it does seem to help sometimes.

HTH
--
Richard Huxton
Archonet Ltd
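[Spelled out against the table in question, Richard's recipe might look like the sketch below. Only one index and one foreign key are shown; the other names are in the earlier \d output. One caveat: VACUUM cannot run inside a transaction block, so step 4 has to move after the COMMIT. Exact ALTER TABLE ... DROP CONSTRAINT syntax may vary with the PostgreSQL version in use here.

```sql
BEGIN;
-- 2. drop indexes and foreign keys (repeat for each one)
DROP INDEX p2thread_p2params_id;
ALTER TABLE paprospect2 DROP CONSTRAINT pftype_id_exists;

-- 3. the bulk update
UPDATE paprospect2 SET run_id_new = run_id;

-- 5. recreate indexes and foreign keys
CREATE INDEX p2thread_p2params_id ON paprospect2 (run_id);
ALTER TABLE paprospect2 ADD CONSTRAINT pftype_id_exists
      FOREIGN KEY (pftype_id) REFERENCES pftype (pftype_id)
      ON UPDATE CASCADE ON DELETE CASCADE;
COMMIT;

-- 4. vacuum afterwards, outside the transaction
VACUUM ANALYZE paprospect2;
```

Re-adding the FK constraints forces a full revalidation pass at the end, but that is one sequential check rather than 40M per-row trigger firings.]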
Richard-
Thanks for the suggestions. I too had thought about the FK checks, even though the columns aren't getting updated.
I'm flabbergasted that the update is still running (~22 hours elapsed). By comparison, the database takes only 4 hours to recreate from backup! Something funny is happening here. I just interrupted the update and will find another way.
But hang on, what's this:
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM  TIME COMMAND
30164 compbio   25   0  6056 6056  3568 R    98.8  0.1 29:55 postgres: admin csb [local] COPY
I am the only user and I'm not doing a copy... this must be part of the update process. Does anyone out there know whether updates do a table copy instead of in-table updating (perhaps as a special case for whole-table updates)?
Of course, I can't help but wonder whether I just killed it when it was nearly done...
Thanks,
Reece
On Wed, 2003-07-23 at 10:47, Guthrie, Jeremy wrote:
> Have you checked the sizes of your indexes? You may need to rebuild them...
>
> Multiply the relpages column by 8192.
So, what does this tell me? I'm guessing that you're implying that I should expect 8192 keys per page, and that this therefore indicates the sparseness of the key pages. Guessing that, I did:
rkh@csb=> SELECT c2.relname, c2.relpages, c2.relpages*8192 AS "*8192",
                 43413476::real/(c2.relpages*8192)
            FROM pg_class c, pg_class c2, pg_index i
           WHERE c.oid = i.indrelid AND c2.oid = i.indexrelid
             AND c2.relname ~ '^p2th|^papro'
           ORDER BY c2.relname;

             relname             | relpages |   *8192    |      ?column?
---------------------------------+----------+------------+--------------------
 p2thread_p2params_id            |   122912 | 1006895104 | 0.0431161854174633
 p2thread_pmodel_id              |   123243 | 1009606656 | 0.0430003860830331
 paprospect2_redundant_alignment |   229934 | 1883619328 | 0.0230479032332376
What do you make of 'em apples?
Thanks,
Reece
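[Another way to slice those numbers is bytes of index per table row, i.e. the inverse of the last column above. A back-of-the-envelope check, assuming the row count of 43413476 used in the query:

```sql
-- Index bytes per table row = total index bytes / row count.
SELECT 1006895104::real / 43413476 AS p2thread_p2params_id,            -- roughly 23 bytes/row
       1883619328::real / 43413476 AS paprospect2_redundant_alignment; -- roughly 43 bytes/row
```

Twenty-odd bytes per entry for a single-int4 btree, and about twice that for the five-column unique index, are in the ballpark of ordinary btree per-entry overhead (index tuple header plus item pointer plus key data), which would suggest these particular indexes are not wildly bloated -- though that reading of the relpages arithmetic is only an estimate.]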
Look at it like this (this is how it affected me): I had a table that used to be the primary home for my data (6 gigs worth). I copied it out to another table, purged the original, and then 'vacuum full''d the database. After a day, things really started going to hell. SLOOOW... like 30 minutes to run my software versus the 1-5 seconds it normally takes.

The old table is still used, but now as a queue: data sits in it until it's processed, then is deleted. Mind you, the repurposed 'queue' table usually has no more than 3000-10000 entries in it. Guess what the index size was... all told, I had 7 gigs of indexes. Why? Because vacuum doesn't reoptimize the indexes. If postgresql can't reuse a deleted row's index entry, it creates a new one. The docs make it sound like it can't reuse the old index space unless the values of the deleted rows and the new row are close; look in the docs about reindexing for their explanation. So, back to my example: my table should maybe be 100K with indexes, but it was more like 7 gigs. I reindexed and BAM! My times were sub-second.

Based on the numbers you posted, you have about 3 gigs worth of indexes. Do you have that much data (in terms of rows)?
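[If index bloat is indeed the culprit, the rebuild Jeremy describes would be, for the table in this thread, something like the following. Note that REINDEX takes an exclusive lock on the table while it runs:

```sql
-- Rebuild every index on the table in one pass:
REINDEX TABLE paprospect2;

-- ...or rebuild them one at a time to limit the lock window:
REINDEX INDEX p2thread_p2params_id;
REINDEX INDEX p2thread_pmodel_id;
REINDEX INDEX paprospect2_redundant_alignment;
```

Re-running the relpages query from earlier in the thread before and after the rebuild would show how much space, if any, was reclaimed.]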