VACUUM FULL taking accessivly long. - Mailing list pgsql-hackers

From Darcy Buskermolen
Subject VACUUM FULL taking accessivly long.
Date
Msg-id 200409141406.32352.darcy@wavefire.com
Whole thread Raw
List pgsql-hackers
While doing some testing for someone, I have run across what appears to be a 
bug that is only readily reproduceable in FreeBSD 4.x  (dual Xeon 2.4 GHz, 
2GB RAM, PG 7.4).  The same test run on a pair of linux boxen, work as 
expected.

This process has now been running 25 hours, is still running, and if left will 
run indefantaly (I aborted that last one after it was churning away for 7 
days, so that I could rebuild PG with debugging symbols)


There is no other access to this DB, let alone this table.


darcy=# \d archives_event         Table "public.archives_event" Column   |           Type           | Modifiers 
-----------+--------------------------+-----------id        | integer                  | sid       | integer
     | gid       | integer                  | cid       | integer                  | ts_old    | integer
 | num_child | integer                  | timestamp | timestamp with time zone | 
 
Indexes:   "ae_tsold_num_child_idx" btree (num_child, ts_old)   "archives_event_ts_num_child_idx" btree (num_child,
"timestamp")
darcy=#  VACUUM FULL verbose archives_event;
INFO:  vacuuming "public.archives_event"
INFO:  "archives_event": found 34879340 removable, 90000000 nonremovable row 
versions in 1330139 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 60 to 60 bytes long.
There were 55220660 unused item pointers.
Total free space (including removable row versions) is 4749495908 bytes.
621477 pages are or will become empty, including 0 at the end of the table.
621479 pages containing 4718314868 free bytes are potential move destinations.
CPU 41.48s/6.00u sec elapsed 123.21 sec.
INFO:  index "archives_event_ts_num_child_idx" now contains 90000000 row 
versions in 592668 pages
DETAIL:  3753431 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 56.75s/26.49u sec elapsed 1888.69 sec.
INFO:  index "ae_tsold_num_child_idx" now contains 90000000 row versions in 
493544 pages
DETAIL:  34879340 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 66.63s/192.01u sec elapsed 1983.45 sec.


$ps
USER     PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
pgsql   9820  2.2 20.4 438656 427300  ??  D    Mon08AM  51:49.86 postmaster: 
darcy darcy [local] VACUUM (postgres)
$iostat -w1     tty             ad0              da0             acd0             cputin tout  KB/t tps  MB/s   KB/t
tps MB/s   KB/t tps  MB/s  us ni sy in id  0   19  0.00   0  0.00  16.71 785 12.81   0.00   0  0.00   0  0  1  0 99  0
19  0.00   0  0.00  21.94 184  3.94   0.00   0  0.00   1  0  0  0 99  0   19 12.67   6  0.07  21.59 186  3.92   0.00
0 0.00   1  0  2  0 97
 
^C

The da0 array is capable of 60MB/second random IO write.

I've attached the the process and obtained the following back trace:

(gdb) bt 
#0  0x28377088 in read () from /usr/lib/libc.so.4
#1  0x815b2f8 in FileRead (file=36, buffer=0x302c7520 "s\001", amount=8192) at 
fd.c:904
#2  0x81661d0 in mdread (reln=0x938a01c, blocknum=467914, buffer=0x302c7520 
"s\001") at md.c:434
#3  0x8166f5f in smgrread (which=0, reln=0x938a01c, blocknum=467914, 
buffer=0x302c7520 "s\001") at smgr.c:316
#4  0x8157fc3 in ReadBufferInternal (reln=0x938a01c, blockNum=467914, 
bufferLockHeld=0 '\000') at bufmgr.c:224
#5  0x8157db2 in ReadBuffer (reln=0x938a01c, blockNum=467914) at bufmgr.c:118
#6  0x8088842 in _bt_getbuf (rel=0x938a01c, blkno=467914, access=1) at 
nbtpage.c:401
#7  0x808b45f in btbulkdelete (fcinfo=0xbfbfe8bc) at nbtree.c:674
#8  0x81ca5ba in OidFunctionCall3 (functionId=332, arg1=154705948, 
arg2=135251912, arg3=3217025732) at fmgr.c:1260
#9  0x8084a5c in index_bulk_delete (indexRelation=0x938a01c, 
callback=0x80fc7c8 <tid_reaped>, callback_state=0xbfbfeac4) at indexam.c:618
#10 0x80fc666 in vacuum_index (vacpagelist=0xbfbfeac4, indrel=0x938a01c, 
num_tuples=90000000, keep_tuples=0) at vacuum.c:2702
#11 0x80f893b in full_vacuum_rel (onerel=0x833ed30, vacstmt=0x82f5354) at 
vacuum.c:944
#12 0x80f8818 in vacuum_rel (relid=84356092, vacstmt=0x82f5354, 
expected_relkind=114 'r') at vacuum.c:853
#13 0x80f7f55 in vacuum (vacstmt=0x82f5354) at vacuum.c:290
#14 0x816d3d5 in ProcessUtility (parsetree=0x82f5354, dest=0x82f5234, 
completionTag=0xbfbfecd4 "") at utility.c:827
#15 0x816bf8b in PortalRunUtility (portal=0x834501c, query=0x82f539c, 
dest=0x82f5234, completionTag=0xbfbfecd4 "") at pquery.c:758
#16 0x816c08b in PortalRunMulti (portal=0x834501c, dest=0x82f5234, 
altdest=0x82f5234, completionTag=0xbfbfecd4 "") at pquery.c:822
#17 0x816bbc2 in PortalRun (portal=0x834501c, count=2147483647, 
dest=0x82f5234, altdest=0x82f5234, completionTag=0xbfbfecd4 "") at 
pquery.c:483
#18 0x8168c9b in exec_simple_query (query_string=0x82f501c " VACUUM FULL 
verbose archives_event;") at postgres.c:873
#19 0x816af53 in PostgresMain (argc=4, argv=0x82b84b4, username=0x82b847c 
"darcy") at postgres.c:2868
#20 0x8147a11 in BackendFork (port=0x82bc200) at postmaster.c:2558
#21 0x8147247 in BackendStartup (port=0x82bc200) at postmaster.c:2201
#22 0x8145bae in ServerLoop () at postmaster.c:1113
#23 0x814570d in PostmasterMain (argc=1, argv=0x82b6030) at postmaster.c:891
#24 0x811a885 in main (argc=1, argv=0xbfbffcac) at main.c:214


Please let me know if you require any other information.



-- 
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx:  250.763.1759
http://www.wavefire.com


pgsql-hackers by date:

Previous
From: Oliver Elphick
Date:
Subject: Re: NEW used in a query that is not in a rule
Next
From: "Jim C. Nasby"
Date:
Subject: Re: APR 1.0 released