Hello,
We have a Postgres database that receives a few gigabytes of text data per day from a production system. This data has to be merged into existing tables (inserting new rows and updating existing ones), and we then run aggregation queries to refresh reports. The database will keep growing over time, since we have to retain at least two years of history.
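To give an idea of what I mean by merging, the daily load from a staging table looks roughly like this (table and column names are simplified examples, not the real schema):

    -- update rows that already exist in the target table
    UPDATE measurements m
       SET value = s.value
      FROM staging_measurements s
     WHERE m.id = s.id;

    -- insert rows that are new
    INSERT INTO measurements (id, value)
    SELECT s.id, s.value
      FROM staging_measurements s
     WHERE NOT EXISTS (SELECT 1 FROM measurements m WHERE m.id = s.id);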
The thing is, we do not need to keep old versions of updated rows: this is not a transactional application, and we want to save disk space and get the best possible performance.
So currently I run VACUUM FULL on the tables and recollect statistics daily, at the end of the loading process.
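Concretely, the end of each load does something like this for every table touched by the merge (the table name is again just an example):

    VACUUM FULL measurements;
    ANALYZE measurements;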
Is there another way to avoid keeping, or to clean up, these old versions of the updated rows, in particular one that avoids the full vacuum and saves loading time? We already have regular full backups and recovery procedures in place. I am not very familiar with the autovacuum settings, so I don't know what to choose.
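For instance, I understand there are per-table autovacuum settings of the following kind, but I have no idea which values would make sense for this bulk-update workload (the values below are only placeholders, not a proposal):

    ALTER TABLE measurements SET (
      autovacuum_enabled = true,
      autovacuum_vacuum_scale_factor  = 0.05,  -- placeholder value
      autovacuum_vacuum_threshold     = 1000,  -- placeholder value
      autovacuum_analyze_scale_factor = 0.05   -- placeholder value
    );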
For your information, the biggest tables are partitioned by month, so each partition contains no more than a few hundred megabytes, but the bulk updates also hit old partitions.
Best regards,
Julien Theulier