Re: Should a DB vacuum use up a lot of space ? - Mailing list pgsql-general
From | Adrian Klaver |
---|---|
Subject | Re: Should a DB vacuum use up a lot of space ? |
Date | |
Msg-id | b823401d-0426-f4e4-f927-f858ab107882@aklaver.com Whole thread Raw |
In response to | Re: Should a DB vacuum use up a lot of space ? (Philippe Girolami <philippe.girolami@mosaik.com>) |
Responses |
Re: Should a DB vacuum use up a lot of space ?
(Philippe Girolami <philippe.girolami@mosaik.com>)
|
List | pgsql-general |
On 08/08/2016 12:08 AM, Philippe Girolami wrote: >> So you are VACUUMing the lesser 'younger' tables? > I VACUUM those with the highest age : > SELECT age,array_agg(table_name) FROM (SELECT c.oid::regclass as table_name, greatest(age(c.relfrozenxid),age(t.relfrozenxid))as age FROM pg_class c LEFT JOIN pg_class t ON c.reltoastrelid = t.oid WHEREc.relkind IN ('r', 'm') order by 2 desc limit 1000) tt group by age order by age desc > > >>> • I exit singleuser mode and relaunch the server so I keep on vacuuming the oldest table first by copying >>> • But I don’t even have time to launch my script : as soon as launch the server, the auto-vacuum daemon kicks in andburns through the transactions : I guess it’s not smart enough to start with >>> the oldest tables ? >> Not understanding; 'the auto-vacuum daemon kicks in and burns through >> the transactions'. >> Are you saying it is reclaiming xids for you or using them? >> If reclaiming that is what is supposed to do and is good thing. >> Or am I misunderstanding? > Here is what the logs show when I do what I described above > > 1) I got 7 transactions back in single user mode > Aug 7 23:40:57 p2 postgres[30376]: [5-1] 2016-08-07 23:40:57 CEST WARNING: database "public" must be vacuumed within999893 transactions So the above is from when you enter single user mode? > Aug 7 23:40:57 p2 postgres[30376]: [5-2] 2016-08-07 23:40:57 CEST HINT: To avoid a database shutdown, execute a database-wideVACUUM in that database. > I am not seeing what you do in single user mode? > 2) I exit single user mode and restart the database > Aug 7 23:41:40 p2 postgres[15457]: [1-1] 2016-08-07 23:41:40 CEST LOG: database system was shut down at 2016-08-07 23:41:32CEST > Aug 7 23:41:40 p2 postgres[15458]: [1-1] 2016-08-07 23:41:40 CEST LOG: incomplete startup packet > Aug 7 23:41:40 p2 postgres[15459]: [1-1] 2016-08-07 23:41:40 CEST FATAL: the database system is starting up > Aug 7 23:41:40 p2 postgres[15457]: [2-1] 2016-08-07 23:41:40 CEST WARNING: database with OID 16385 must be vacuumed within999892 transactions So you actually lost a transaction. > Aug 7 23:41:40 p2 postgres[15457]: [2-2] 2016-08-07 23:41:40 CEST HINT: To avoid a database shutdown, execute a database-wideVACUUM in that database. > Aug 7 23:41:40 p2 postgres[15457]: [2-3] #011You might also need to commit or roll back old prepared transactions. > Aug 7 23:41:40 p2 postgres[15462]: [1-1] 2016-08-07 23:41:40 CEST LOG: autovacuum launcher started > Aug 7 23:41:40 p2 postgres[15447]: [1-1] 2016-08-07 23:41:40 CEST LOG: database system is ready to accept connections > > 3) but I don’t even have time to run a query to see the state of the databases > Aug 7 23:41:52 p2 postgres[15487]: [2-1] 2016-08-07 23:41:52 CEST ERROR: database is not accepting commands to avoidwraparound data loss in database "public" > Aug 7 23:41:52 p2 postgres[15487]: [2-2] 2016-08-07 23:41:52 CEST HINT: Stop the postmaster and use a standalone backendto vacuum that database. > Aug 7 23:41:52 p2 postgres[15487]: [2-3] #011You might also need to commit or roll back old prepared transactions. > Aug 7 23:41:52 p2 postgres[15487]: [2-4] 2016-08-07 23:41:52 CEST STATEMENT: SELECT datname, age(datfrozenxid) FROM pg_database > Aug 7 23:41:52 p2 postgres[15497]: [2-1] 2016-08-07 23:41:52 CEST ERROR: database is not accepting commands to avoidwraparound data loss in database "public" > Aug 7 23:41:52 p2 postgres[15497]: [2-2] 2016-08-07 23:41:52 CEST HINT: Stop the postmaster and use a standalone backendto vacuum that database. > Aug 7 23:41:52 p2 postgres[15497]: [2-3] #011You might also need to commit or roll back old prepared transactions. > Aug 7 23:41:52 p2 postgres[15497]: [2-4] 2016-08-07 23:41:52 CEST STATEMENT: COPY ( SELECT 'VACUUM VERBOSE ' ||c.oid::regclass||';--' as _command, greatest ( age ( c.relfrozenxid ) ,age ( t.relfrozenxid ) ) as age FROM pg_class cLEFT JOIN pg_class t ON c.reltoastrelid = t.oid WHERE c.relkind IN ( 'r' , 'm' ) order by 2 desc limit 1000 ) TO STDOUT(FORMAT text) > Aug 7 23:41:52 p2 postgres[15518]: [2-1] 2016-08-07 23:41:52 CEST ERROR: database is not accepting commands to avoidwraparound data loss in database "public" > Aug 7 23:41:52 p2 postgres[15518]: [2-2] 2016-08-07 23:41:52 CEST HINT: Stop the postmaster and use a standalone backendto vacuum that database. > Aug 7 23:41:52 p2 postgres[15518]: [2-3] #011You might also need to commit or roll back old prepared transactions. > Aug 7 23:41:52 p2 postgres[15518]: [2-4] 2016-08-07 23:41:52 CEST STATEMENT: SELECT datname, age(datfrozenxid) FROM pg_database > Aug 7 23:41:53 p2 rsyslogd-2177: imuxsock lost 3512 messages from pid 15476 due to rate-limiting Hmm I wonder what pid 15476(see below) is trying to do that is blowing out the logging? > Aug 7 23:41:53 p2 postgres[15476]: [2-1] 2016-08-07 23:41:53 CEST ERROR: database is not accepting commands to avoidwraparound data loss in database "public" > Aug 7 23:41:53 p2 postgres[15476]: [2-2] 2016-08-07 23:41:53 CEST HINT: Stop the postmaster and use a standalone backendto vacuum that database. > Aug 7 23:41:53 p2 postgres[15476]: [2-3] #011You might also need to commit or roll back old prepared transactions. > Aug 7 23:41:53 p2 postgres[15476]: [2-4] 2016-08-07 23:41:53 CEST CONTEXT: automatic analyze of table "public.public.aaaaaa" > > > 4) lots of autovacuum failures It is trying to analyze tables and being refused as are all commands. > Aug 7 23:41:55 p2 postgres[15476]: [3-1] 2016-08-07 23:41:55 CEST ERROR: database is not accepting commands to avoidwraparound data loss in database "public" > Aug 7 23:41:55 p2 postgres[15476]: [3-2] 2016-08-07 23:41:55 CEST HINT: Stop the postmaster and use a standalone backendto vacuum that database. > Aug 7 23:41:55 p2 postgres[15476]: [3-3] #011You might also need to commit or roll back old prepared transactions. > Aug 7 23:41:55 p2 postgres[15476]: [3-4] 2016-08-07 23:41:55 CEST CONTEXT: automatic analyze of table "public.public.aaaaab" > Aug 7 23:41:56 p2 postgres[15476]: [4-1] 2016-08-07 23:41:56 CEST ERROR: database is not accepting commands to avoidwraparound data loss in database "public" > Aug 7 23:41:56 p2 postgres[15476]: [4-2] 2016-08-07 23:41:56 CEST HINT: Stop the postmaster and use a standalone backendto vacuum that database. > Aug 7 23:41:56 p2 postgres[15476]: [4-3] #011You might also need to commit or roll back old prepared transactions. > Aug 7 23:41:56 p2 postgres[15476]: [4-4] 2016-08-07 23:41:56 CEST CONTEXT: automatic analyze of table "public.public.aaaaac" > Aug 7 23:41:56 p2 postgres[15476]: [5-1] 2016-08-07 23:41:56 CEST ERROR: database is not accepting commands to avoidwraparound data loss in database "public" > Aug 7 23:41:56 p2 postgres[15476]: [5-2] 2016-08-07 23:41:56 CEST HINT: Stop the postmaster and use a standalone backendto vacuum that database. > Aug 7 23:41:56 p2 postgres[15476]: [5-3] #011You might also need to commit or roll back old prepared transactions. > (…) > >>> So it looks like I’m going to have to bite the bullet and really vacuum the whole database instead of just the oldesttables first which will impact our production pipelines. >> Or VACUUM the most heavily used tables in the database. > Why would that help ? Aren’t the tables with the highest age the problem ? I think we are saying the same thing. The tables that have accumulated the most xid debt. > >> The next question to be asked is; what is creating the transactions and >> is the transaction rate 'normal' or is there a possibility you have a >> rogue process or rogue processes in action? > That’s always a possibility but I can’t think of what that would be. > I was guessing that the auto-vacuum wasn’t vacuuming the tables with the highest age first. Isn’t that a possibility ? Looks to me like it never gets a chance in normal mode to get its work done. As mentioned before you need to create some headroom to work with. The logical place to do that would be in single user mode where other sessions cannot interfere. > > Thanks > > -- Adrian Klaver adrian.klaver@aklaver.com
pgsql-general by date: