Re: 'Real' auto vacuum? - Mailing list pgsql-performance

From Ralph Mason
Subject Re: 'Real' auto vacuum?
Date
Msg-id 4314EE25.30605@telogis.com
Whole thread Raw
In response to Re: 'Real' auto vacuum?  (mark@mark.mielke.cc)
List pgsql-performance
mark@mark.mielke.cc wrote:

>But, yeah. It's probably not that easy, especially with really big
>databases. Where is this free list stored? How efficient is it to keep
>track of the lowest running transaction at all times? How does one
>synchronize access to this free list, to ensure that processes don't
>block up waiting for access to the free list? Is the fre list
>journalled to prevent corruption, and the accidental re-use of a still
>in use row? And, there would be a cost to scanning this list on every
>insert or update.
>
>
I suspect the freelist could be stored as an index, and just handily
postgres supports those out of the box.    There would be a cost yes,
but then what is the cost of adding pages to the file all the time?  I
guess as with all things there is no one size fits all, so perhaps you
could turn it off - although I expect for 99.9% of the cases 'on' would
be the better choice. If it gets broken there is already the reindex
code that can fix it. A coherency / fixing / recover of a table command
would probably be a useful tool anyway.

>As an outsider (like you?) I see the current model as a design flaw as
>well. A neat and tidy model on paper. Not so nice in real life. The
>need to vacuum in batch mode, to keep the database from dying, seems
>intuitively bad.
>
>
We have a script that vacuums the database every 5 minutes, excessive -
yes, but turns out that any less is no good really.  I think that this
is sub optimal, the DB work keeps running, but the vacuum can slow down
other tasks.  It also probably flushes data that we would need out of
the page cache so it can look at data that isn't used often as the
vacuum runs. Not the most optimal data access pattern I could imagine.

>I think there must be answers to this problem. Even simple
>optimizations, such as defining a table such that any delete or update
>within a table, upon commit, will attempt to vacuum just the rows that
>should not be considered free for any new transactions. If it's in
>use by an active transaction, oh well. It can be picked up by a batch
>run of vacuum. If it's free though - let's do it now.
>
>
Anything would be good - I think it's the achilles  heel of postgres.
Perhaps there is something simple like that could fix 95% of the problem.

>I think any optimizations we come up with, will be more happily accepted
>with a working patch that causes no breakage... :-)
>
>
>
I am sure they would.

Cheers
Ralph


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: When to do a vacuum for highly active table
Next
From: Michael Stone
Date:
Subject: Re: RAID Configuration Sugestion