Re: Adding more space, and a vacuum question. - Mailing list pgsql-general
From | Herouth Maoz |
---|---|
Subject | Re: Adding more space, and a vacuum question. |
Date | |
Msg-id | 60FEBA71-704E-4493-AFB5-B1A3F0A759E9@unicell.co.il Whole thread Raw |
In response to | Adding more space, and a vacuum question. ("Herouth Maoz" <herouth@unicell.co.il>) |
Responses |
Re: Adding more space, and a vacuum question.
|
List | pgsql-general |
On 30/01/2011, at 13:03, Alban Hertroys wrote:
On 28 Jan 2011, at 22:12, Herouth Maoz wrote:2. That database has a few really huge tables. I think they are not being automatically vacuumed properly. In the past few days I've noticed a vacuum process on one of them which has been running since January 14th. Unfortunately, it never finished, because we were informed of a scheduled power down in our building yesterday, and had to shut down the machine. The questions are:a. Is it normal for vacuum processes to take two weeks?For a 200M record table that's definitely on the long side. It was probably waiting on a lock by another transaction. In most cases that means that some transaction was kept open for that duration.If that transaction came into existence by accident, then vacuum should be fine now that the server has restarted - that transaction is gone now. You may want to keep an eye out for long-running transactions though, that's usually a programming error - it's sometimes done deliberately, but it's still a bad idea from the point of the database.
Unless my eyes were deceiving me, this was not the case. Sure, there have been heavy transactions during that time (e.g. the daily backup of the database, and the daily inserts into other tables, which take a long time, and a few selects which I haven't been able to find an optimal index for). But this is the query I use to see these processes (ran from a superuser):
SELECT usename, procpid, query_start, client_addr, client_port, current_query,waiting
FROM pg_stat_activity
WHERE query_start < now() - interval '3 seconds'
AND xact_start is not null order by xact_start
Any long transactions should be caught by it, but most of the time, all I see are vacuum workers.
By the way, the auto vacuum on that table has started again - but only after more records were deleted from it. It has now been running since yesterday at 17:00. Here is the pg_stat_user_tables record for this table (which has also updated after the deletes):
relid | 17806
schemaname | sms
relname | billing__archive
seq_scan | 9
seq_tup_read | 2053780855
idx_scan | 2553
idx_tup_fetch | 8052678
n_tup_ins | 11437874
n_tup_upd | 0
n_tup_del | 7987450
n_tup_hot_upd | 0
n_live_tup | 218890768
n_dead_tup | 33710378
last_vacuum |
last_autovacuum |
last_analyze | 2011-01-29 15:29:37.059176+02
last_autoanalyze |
In older PG versions autovacuum could get stuck like that on large tables. It keeps starting over trying to vacuum that same table, but never reaches the end of it. Since it's only a single worker process (in those versions), it also will never vacuum any tables beyond the table it got stuck on.
How old? Mine is 8.3.11.
If you don't delete or update tuples a lot, then the tables are probably just that big. If you do delete/update them regularly, try if a normal vacuum will shrink them enough (probably not) and if not, schedule a VACUUM FULL and a REINDEX at some time the database isn't too busy. Both are quite heavy operations that take exclusive locks on things (tables, indices).
Yes, I do delete many tuples from that table. My mode of usage is like this: I have a small table called billing which receives new data every night. I want to keep that table small so that those nightly updates don't take an overly long time, because all data (several such tables) has to be ready in the database by the next morning. Therefore, once a week on the weekend, I move a week's worth of data to billing__archive (the table we are discussing), and delete a week's worth from its end. Now, the indexes on that table would make this impossible to do within the weekend, so what I do is drop all the indexes before I do the inserts, and then recreate them, and then do the deletes.
What you are saying is that in this mode of operation, there's basically no hope that autovacuum will ever salvage the deleted records? Does removing and recreating the indexes have any effect on the vacuuming process?
If a vacuum takes me several days (let alone over a week!) than a VACUUM FULL is out of the question. VACUUM FULL locks the table completely and that table is essential to our customer care. If push comes to shove, I think I'd rather dump that table, drop it, and restore it over the weekend, which I believe will be faster than a VACUUM FULL.
One other important question: a tuple marked by VACUUM as reusable (not VACUUM FULL which restores it to the operating system) - can its space ever be used by another table, or can it only be used for new inserts into the same table?
d. After restarting the server, all the data in pg_stat_user_tables seem to have been reset. What does this mean and how does this affect vacuum scheduling?I recall reading somewhere that that's normal; probably this list. It probably contained an explanation too.
Thank you.
Herouth
pgsql-general by date: