Re: Should a DB vacuum use up a lot of space ? - Mailing list pgsql-general

From Adrian Klaver
Subject Re: Should a DB vacuum use up a lot of space ?
Date
Msg-id b823401d-0426-f4e4-f927-f858ab107882@aklaver.com
Whole thread Raw
In response to Re: Should a DB vacuum use up a lot of space ?  (Philippe Girolami <philippe.girolami@mosaik.com>)
Responses Re: Should a DB vacuum use up a lot of space ?  (Philippe Girolami <philippe.girolami@mosaik.com>)
List pgsql-general
On 08/08/2016 12:08 AM, Philippe Girolami wrote:
>> So you are VACUUMing the lesser 'younger' tables?
> I VACUUM those with the highest age :
> SELECT age,array_agg(table_name) FROM (SELECT c.oid::regclass as table_name,
greatest(age(c.relfrozenxid),age(t.relfrozenxid))as age FROM pg_class c LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
WHEREc.relkind IN ('r', 'm') order by 2 desc limit 1000) tt group by age order by age desc 
>
>
>>> • I exit singleuser mode and relaunch the server so I keep on vacuuming the oldest table first by copying
>>> • But I don’t even have time to launch my script : as soon as launch the server, the auto-vacuum daemon kicks in
andburns through the transactions : I guess it’s not smart enough to start with 
>>> the oldest tables ?
>> Not understanding; 'the auto-vacuum daemon kicks in and burns through
>> the transactions'.
>> Are you saying it is reclaiming xids for you or using them?
>> If reclaiming that is what is supposed to do and is good thing.
>> Or am I misunderstanding?
> Here is  what the logs show when I do what I described above
>
> 1) I got 7 transactions back in single user mode
> Aug  7 23:40:57 p2 postgres[30376]: [5-1] 2016-08-07 23:40:57 CEST WARNING:  database "public" must be vacuumed
within999893 transactions 

So the above is from when you enter single user mode?

> Aug  7 23:40:57 p2 postgres[30376]: [5-2] 2016-08-07 23:40:57 CEST HINT:  To avoid a database shutdown, execute a
database-wideVACUUM in that database. 
>

I am not seeing what you do in single user mode?



> 2) I exit single user mode and restart the database
> Aug  7 23:41:40 p2 postgres[15457]: [1-1] 2016-08-07 23:41:40 CEST LOG:  database system was shut down at 2016-08-07
23:41:32CEST 
> Aug  7 23:41:40 p2 postgres[15458]: [1-1] 2016-08-07 23:41:40 CEST LOG:  incomplete startup packet
> Aug  7 23:41:40 p2 postgres[15459]: [1-1] 2016-08-07 23:41:40 CEST FATAL:  the database system is starting up
> Aug  7 23:41:40 p2 postgres[15457]: [2-1] 2016-08-07 23:41:40 CEST WARNING:  database with OID 16385 must be vacuumed
within999892 transactions 

So you actually lost a transaction.

> Aug  7 23:41:40 p2 postgres[15457]: [2-2] 2016-08-07 23:41:40 CEST HINT:  To avoid a database shutdown, execute a
database-wideVACUUM in that database. 
> Aug  7 23:41:40 p2 postgres[15457]: [2-3] #011You might also need to commit or roll back old prepared transactions.
> Aug  7 23:41:40 p2 postgres[15462]: [1-1] 2016-08-07 23:41:40 CEST LOG:  autovacuum launcher started
> Aug  7 23:41:40 p2 postgres[15447]: [1-1] 2016-08-07 23:41:40 CEST LOG:  database system is ready to accept
connections
>
> 3) but I don’t even have time to run a query to see the state of the databases
> Aug  7 23:41:52 p2 postgres[15487]: [2-1] 2016-08-07 23:41:52 CEST ERROR:  database is not accepting commands to
avoidwraparound data loss in database "public" 
> Aug  7 23:41:52 p2 postgres[15487]: [2-2] 2016-08-07 23:41:52 CEST HINT:  Stop the postmaster and use a standalone
backendto vacuum that database. 
> Aug  7 23:41:52 p2 postgres[15487]: [2-3] #011You might also need to commit or roll back old prepared transactions.
> Aug  7 23:41:52 p2 postgres[15487]: [2-4] 2016-08-07 23:41:52 CEST STATEMENT:  SELECT datname, age(datfrozenxid) FROM
pg_database
> Aug  7 23:41:52 p2 postgres[15497]: [2-1] 2016-08-07 23:41:52 CEST ERROR:  database is not accepting commands to
avoidwraparound data loss in database "public" 
> Aug  7 23:41:52 p2 postgres[15497]: [2-2] 2016-08-07 23:41:52 CEST HINT:  Stop the postmaster and use a standalone
backendto vacuum that database. 
> Aug  7 23:41:52 p2 postgres[15497]: [2-3] #011You might also need to commit or roll back old prepared transactions.
> Aug  7 23:41:52 p2 postgres[15497]: [2-4] 2016-08-07 23:41:52 CEST STATEMENT:  COPY  ( SELECT 'VACUUM VERBOSE '
||c.oid::regclass||';--' as _command, greatest ( age ( c.relfrozenxid ) ,age ( t.relfrozenxid ) ) as age FROM pg_class
cLEFT JOIN pg_class t ON c.reltoastrelid = t.oid WHERE c.relkind IN ( 'r' , 'm' ) order by 2 desc limit 1000 ) TO
STDOUT(FORMAT text) 
> Aug  7 23:41:52 p2 postgres[15518]: [2-1] 2016-08-07 23:41:52 CEST ERROR:  database is not accepting commands to
avoidwraparound data loss in database "public" 
> Aug  7 23:41:52 p2 postgres[15518]: [2-2] 2016-08-07 23:41:52 CEST HINT:  Stop the postmaster and use a standalone
backendto vacuum that database. 
> Aug  7 23:41:52 p2 postgres[15518]: [2-3] #011You might also need to commit or roll back old prepared transactions.
> Aug  7 23:41:52 p2 postgres[15518]: [2-4] 2016-08-07 23:41:52 CEST STATEMENT:  SELECT datname, age(datfrozenxid) FROM
pg_database
> Aug  7 23:41:53 p2 rsyslogd-2177: imuxsock lost 3512 messages from pid 15476 due to rate-limiting

Hmm I wonder what pid 15476(see below) is trying to do that is blowing
out the logging?

> Aug  7 23:41:53 p2 postgres[15476]: [2-1] 2016-08-07 23:41:53 CEST ERROR:  database is not accepting commands to
avoidwraparound data loss in database "public" 
> Aug  7 23:41:53 p2 postgres[15476]: [2-2] 2016-08-07 23:41:53 CEST HINT:  Stop the postmaster and use a standalone
backendto vacuum that database. 
> Aug  7 23:41:53 p2 postgres[15476]: [2-3] #011You might also need to commit or roll back old prepared transactions.
> Aug  7 23:41:53 p2 postgres[15476]: [2-4] 2016-08-07 23:41:53 CEST CONTEXT:  automatic analyze of table
"public.public.aaaaaa"
>
>
> 4) lots of autovacuum failures

It is trying to analyze tables and being refused as are all commands.

> Aug  7 23:41:55 p2 postgres[15476]: [3-1] 2016-08-07 23:41:55 CEST ERROR:  database is not accepting commands to
avoidwraparound data loss in database "public" 
> Aug  7 23:41:55 p2 postgres[15476]: [3-2] 2016-08-07 23:41:55 CEST HINT:  Stop the postmaster and use a standalone
backendto vacuum that database. 
> Aug  7 23:41:55 p2 postgres[15476]: [3-3] #011You might also need to commit or roll back old prepared transactions.
> Aug  7 23:41:55 p2 postgres[15476]: [3-4] 2016-08-07 23:41:55 CEST CONTEXT:  automatic analyze of table
"public.public.aaaaab"
> Aug  7 23:41:56 p2 postgres[15476]: [4-1] 2016-08-07 23:41:56 CEST ERROR:  database is not accepting commands to
avoidwraparound data loss in database "public" 
> Aug  7 23:41:56 p2 postgres[15476]: [4-2] 2016-08-07 23:41:56 CEST HINT:  Stop the postmaster and use a standalone
backendto vacuum that database. 
> Aug  7 23:41:56 p2 postgres[15476]: [4-3] #011You might also need to commit or roll back old prepared transactions.
> Aug  7 23:41:56 p2 postgres[15476]: [4-4] 2016-08-07 23:41:56 CEST CONTEXT:  automatic analyze of table
"public.public.aaaaac"
> Aug  7 23:41:56 p2 postgres[15476]: [5-1] 2016-08-07 23:41:56 CEST ERROR:  database is not accepting commands to
avoidwraparound data loss in database "public" 
> Aug  7 23:41:56 p2 postgres[15476]: [5-2] 2016-08-07 23:41:56 CEST HINT:  Stop the postmaster and use a standalone
backendto vacuum that database. 
> Aug  7 23:41:56 p2 postgres[15476]: [5-3] #011You might also need to commit or roll back old prepared transactions.
> (…)
>
>>> So it looks like I’m going to have to bite the bullet and really vacuum the whole database instead of just the
oldesttables first which will impact our production pipelines. 
>> Or VACUUM the most heavily used tables in the database.
> Why would that help ? Aren’t the tables with the highest age the problem ?

I think we are saying the same thing. The tables that have accumulated
the most xid debt.

>
>>    The next question to be asked is; what is creating the transactions and
>>    is the transaction rate 'normal' or is there a possibility you have a
>>    rogue process or rogue processes in action?
> That’s always a possibility but I can’t think of what that would be.
> I was guessing that the auto-vacuum wasn’t vacuuming the tables with the highest age first. Isn’t that a possibility
?

Looks to me like it never gets a chance in normal mode to get its work
done. As mentioned before you need to create some headroom to work with.
The logical place to do that would be in single user mode where other
sessions cannot interfere.

>
> Thanks
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux
Next
From: Kevin Grittner
Date:
Subject: Re: Detecting if current transaction is modifying the database