> One of my customers is reducing a table from 140GB to 20GB today. Now he is able to run archiving. He had to play with pg_repack, and it is working well today, but I ask myself whether what pg_repack does would really be hard to do internally, given that something similar had to be done for REINDEX CONCURRENTLY. This is not a common task, and it won't become one, but on the other hand it could be a nice feature to have, and maybe not too hard to implement today. But I didn't try it.
FWIW, a newer, more modern and more trustworthy alternative to pg_repack is pg_squeeze, which I came across almost by chance and soon found I liked much more.
So, thinking about your question, I think it might be possible to integrate a tool that works like pg_squeeze, such that it runs when VACUUM is invoked -- either under some new option, or by just replacing the code under FULL, not sure. If the Cybertec people allow it, we could just grab the pg_squeeze code and add it to the things that VACUUM can run.
Now, pg_squeeze has some additional features, such as periodic "squeezing" of tables. In a first attempt, for simplicity, I would leave that stuff out and just have the user invoke the command, with each invocation doing a single run. (The scheduling features could be added later, or somehow integrated into autovacuum, or maybe something else.)
Some basic variant (without autovacuum support) can be good enough. We have no autovacuum support for REINDEX CONCURRENTLY, and I don't see a necessity for it (sure, my perspective may be limited). The need to reduce table size is not too common (a lot of use cases are better covered by partitioning), but sometimes it arises, and then a simple built-in solution can be helpful.
-- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "We're here to devour each other alive" (Hobbes)