Version=3D9.1.7
INFO: clustering "my_cool_table" using sequential scan and sort
INFO: "my_cool_table": found 1 removable, 1699139 nonremovable row versions=
in 49762 pages
Detail: 1689396 dead row versions cannot be removed yet.
CPU 9.80s/4.98u sec elapsed 175.92 sec.
INFO: clustering "my_cool_table" using sequential scan and sort
INFO: "my_cool_table": found 7552 removable, 21732 nonremovable row version=
s in 50007 pages
Detail: 11482 dead row versions cannot be removed yet.
CPU 0.01s/0.23u sec elapsed 36.29 sec.
INFO: clustering "my_cool_table" using index scan on "pk_cool"
INFO: "my_cool_table": found 621462 removable, 36110 nonremovable row versi=
ons in 26135 pages
Detail: 25128 dead row versions cannot be removed yet.
CPU 0.02s/0.35u sec elapsed 0.79 sec.
So my_cool_table gets inserted into (but not updated) by regular processes =
doing their smallish CRUD transactions.
Concurrently, ONE process repeatedly "sweeps" a chunk of rows from the tabl=
e every few seconds.
(ie, it does delete...returning, and then commits the sweep)
Note that if the table has not many rows, then all the rows will be swept t=
ogether.
It is possible for something to go wrong resulting in:
the table still being filled, but no longer being swept.
When the sweeping finally gets re-started, it must now chomp down a very la=
rge table.
When it finally sweeps down to near zero rows remaining, my idea was to do =
a CLUSTER on the table.
My expectation is that a VERY SMALL percentage of the row versions would ac=
tually get written to the new table!
My hope is that a smaller heap is better, now that the rate of sweeping is =
back to the rate of filling,
with the assumption that it will stay this way 99% of the time.
Can somebody tell me why some "dead row versions cannot be removed yet" ?
I assume that means CLUSTER must write them to the new table ?
It seems very costly to do the CLUSTER, if the new table is not really goin=
g to be a tiny fraction of the old table.
Is there a way for me to discover the approx number of "non-removables" BEF=
ORE I do the CLUSTER ?
? Some pg_table query ? maybe after an analyze ?
Also, does the use of [index scan on "pk_cool"] basically depend on the rat=
io of removable/nonremovable row versions ?
Thanks,
-dvs-