Thread: How to determine whether to VACUUM or CLUSTER
Hi All,
I have an app that updates a PostgreSQL db in a batch fashion. After each batch (or several batches), it issues VACUUM and ANALYZE calls on the updated tables. Now I want to cluster some tables for better performance. I understand that doing a VACUUM and a CLUSTER on a table is wasteful as the CLUSTER makes the VACUUM superfluous. The app does not have a built-in list of the tables and whether each is clustered or not. It looks to me as if the only way to determine whether to issue a VACUUM (on a non-clustered table) or a CLUSTER (on a clustered table) is to query the table "pg_index", much like view "pg_indexes" does, for the column "indisclustered". Is this right?
Also, how expensive is CLUSTER compared to VACUUM? Does CLUSTER read in the whole table, sort it, and write it back out? Or write out a completely new file? Is the time for a CLUSTER the same whether one row is out of place or the table is completely disordered?
Thanks,
Ken
Discover Yahoo!
Find restaurants, movies, travel & more fun for the weekend. Check it out!
ken shaw <kshaw987@yahoo.com> writes: > It looks to me as if the only way to determine whether to issue a > VACUUM (on a non-clustered table) or a CLUSTER (on a clustered table) > is to query the table "pg_index", much like view "pg_indexes" does, > for the column "indisclustered". Is this right? indisclustered is certainly the ground truth here, and [ ... digs around in the source code ... ] it doesn't look like there are any views that present the information in a different fashion. So yup, that's what you gotta do. > Also, how expensive is CLUSTER compared to VACUUM? Well, it's definitely expensive compared to plain VACUUM, but compared to VACUUM FULL the case is not clear-cut. I would say that if you had a seriously bloated table (where VACUUM FULL would have to move all or most of the live tuples in order to compact the table completely) then CLUSTER will be faster --- not to mention any possible future benefits from having the table more or less in order with respect to the index. As near as I can tell, VACUUM FULL was designed to work nicely when you had maybe 10%-25% free space in the table and you want it all compacted out. In a scenario where it has to move all the tuples it is certainly not faster than CLUSTER; plus the end result is much worse as far as the state of the indexes goes, because VACUUM FULL does *nothing* for compacting indexes. regards, tom lane