Space reuse and autovacuum - Mailing list pgsql-hackers

From Bruce Momjian
Subject Space reuse and autovacuum
Date
Msg-id 200506212006.j5LK6qR23759@candle.pha.pa.us
Whole thread Raw
In response to Re: Autovacuum in the backend  (Gavin Sherry <swm@linuxworld.com.au>)
List pgsql-hackers
Gavin Sherry wrote:
> On Wed, 15 Jun 2005, Bruce Momjian wrote:
> 
> >
> > I am going to start working on it.  I am concerned it is a big job.
> >
> > I will post questions as I find them, and the one below is a good one.
> >
> 
> I'm wondering if effort is being misdirected here. I remember when Mark
> Wong at OSDL was running pg_autovacuum during a dbt run, he was seeing
> significant performance loss -- I think on the order of 30% to 40% (I will
> try and dig up a link to the results).
> 
> I think these results can be dramatically improved if the focus is on a
> more effective vacuum.

Let's look at what TODO has for vacuum and how autovacuum fits that:Vacuum======* Improve speed with indexes  For large
tableadjustements during vacuum, it is faster to reindex  rather than update the index.
 

This is something we should figure out how to do automatically.* Reduce lock time by moving tuples with read lock, then
write lock and truncate table  Moved tuples are invisible to other backends so they don't require a  write lock.
However,the read lock promotion to write lock could lead  to deadlock situations.
 

The deadlock problem here seems bad.* -Add a warning when the free space map is too small

Done.
* Maintain a map of recently-expired rows  This allows vacuum to target specific pages for possible free space  without
requiringa sequential scan.
 

I think of this as a secondary Free-space-map (FSM), where instead of
recording rows/pages that have free space, we records rows/pages that
have expired rows that might be free for reuse if all transactions where
the are visible are completed.* Auto-fill the free space map by scanning the buffer cache or by  checking pages written
bythe background writer
 

This could be used to populate the secondary FSM above.
* Create a bitmap of pages that need vacuuming  Instead of sequentially scanning the entire table, have the background
writeror some other process record pages that have expired rows, then  VACUUM can look at just those pages rather than
theentire table.  In  the event of a system crash, the bitmap would probably be invalidated.
 

This is an alternative to the FSM that tracks _all_ possible free space
rather than just a limited amount like a seconary FSM in shared memory.
* Auto-vacuum        o Move into the backend code        o Use free-space map information to guide refilling        o
DoVACUUM FULL if table is nearly empty?
 

It seems no matter what TODO items we complete above, we will need some
type of automatic vacuum to direct filling the free space map.  It might
be done using a different method than a sequential scan vacuum, but it
will be needed, so we are good to integrate autovacuum then improve how
it does its job in future releases.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Strange logic for partial index proving
Next
From: Tom Lane
Date:
Subject: Re: Strange logic for partial index proving