Re: Auto Vacuum - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Auto Vacuum
Date
Msg-id 200411290229.iAT2TjB07277@candle.pha.pa.us
Whole thread Raw
In response to Auto Vacuum  (Russell Smith <mr-russ@pws.com.au>)
Responses Re: Auto Vacuum  (Gavin Sherry <swm@linuxworld.com.au>)
Re: Auto Vacuum  ("Matthew T. O'Connor" <matthew@zeut.net>)
List pgsql-hackers
I have added an auto-vacuum TODO item:

* Auto-vacuum       o Move into the backend code       o Scan the buffer cache to find free space or use background
writer      o Use free-space map information to guide refilling
 


---------------------------------------------------------------------------

Russell Smith wrote:
> Hi All,
> 
> I am doing serious thinking about the implementation of Auto Vacuum as part of the backend, Not using libpq, but
classinginternal functions directly.
 
> It appears to me that calling internal functions directly is a better implementation than using the external library
todo the job.
 
> 
> I know I might be stepping on Matthew's toes, but I don't really want to.  I am a complete newbie to the postgresql
code,however I am trying.
 
> Vacuum appears to be one of the bigger saw points with administrator having to configure it via scheduled tasks.
> 
> I have outlined things I have thought about below.  I've surely missed a lot, and am open to feedback.  Others may
likethe current tuning used
 
> by the external autovacuum, however to have stats that are collected at vacuum, and used for later vacuum would mean
wedon't need the
 
> stats collector running.
> 
> The major autovacuum issues 
> 
> 1. Transaction Wraparound
> 2. Vacuum of relations
> 3. Tracking of when to do vacuums
> 4. Where to store information needed by auto vacuum
> 
> I would suggest the first step is to replicate the wraparound and relation vacuuming from the current code first. But
Icould be wrong.
 
> Currently there are a lot of tuning options in the external autovacuum, and the best way to run vacuum internally
willneed to be thought out.
 
> 
> 1. Transaction Wraparound
> 
> It appears from the code, that the best time to do a transaction wraparound db wide vacuum is when the
> frozenXID has wrapped, TransactionIdPrecedes(myXID, dbform->datfrozenxid).  And is probably the most
> simple vacuum to implement.
> 
> 
> 2. Vacuuming of relations
> 
> Currently, the entire heap must be vacuumed at one time.  I would possible be desireable to have only part of the
relationvacuumed at
 
> a time.  If you can find out which parts of the relation have the most slack space.  There is a todo item regarding
trackingrecent deletions
 
> so they can be resused.  Some form of this would be helpful to work out what to vacuum.  Performance issues for this
typeof activity 
 
> may be a concern.  But I have no experience to be able to make comment on them.  So I welcome yours.
> 
> 3. Tracking of when to vacuum
> 
> Current autovacuum relies the stats collector to be running.  I would like to only use the stats if they are
available,
> and have an option to be able to vacuum accurately without having to have stats running.
> 
> By adding certain histograms, on tuples, filesize and slack space we can guage the time between vacuums
> 
> number of tuples will show the inserts as opposed to updates.
> file size will show that the file is growing and by how much between vacuums.
> slack space will show the delete/updated records.
> 
> A new guc and relation option would need to be implemented to give a target slack space in a file.
> this will help to reduce growth in relations if vacuum happens to not run frequently enough.  This information
> can also inform autovacuum that it should be vacuuming more frequently.  The number would be a percentage,
> eg 10% of the total file size is allowed to be unused.  Also alter table would allow users to set levels of
slackness
> for each relation.  If the table gets too much more than the target slack space, a "partial" vacuum full could be
run
> to reduce the size of the table by moving tuples at the end of the table to slack space near the beginning and 
> shortening the table length.  It would require a full table lock, but you may be able to space it out, to only do a
page
> at a time.
> 
> /* target percentage of slack space */
> vacuum_default_target_slack_space = 10
> 
> ALTER TABLE SET TARGET SLACK SPACE = 10;
> 
> 4. Where to store information required by auto vacuum.
> 
> Auto vacuum needs somewhere to stop the information it needs about current and future vacuuming.  I am unsure of
where
> to put this.  It appears as if it will have a number of fields.  I feel like pg_class is the best place to put the
information,but 
 
> again I have no idea.
> 
> That's the best I can do for now.  I can clarify things further if required.
> 
> Regards
> 
> Russell.
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Status of server side Large Object support?
Next
From: Gavin Sherry
Date:
Subject: Re: Auto Vacuum