Thread: Table bloat and vacuum

Table bloat and vacuum

From
Jack Orenstein
Date:
My application is running on 7.4. We have one huge table that drives
our application, and also a tiny (single-row) table used to maintain
some aggregate information. Each transaction inserts or updates 1-2
rows in the huge table, and updates the tiny table.

We vacuum the entire database once a week, and the tiny table every
2000 transactions.

I'm trying to understand some odd behavior observed regarding the tiny
table: The tiny table's disk file is usually 8K or 16K.  During the
weekly vacuum, the tiny table bloats. It's still one row, but the size
of the file grows. I've seen it get as high as 1M. But then after the
vacuum, it returns to its normal size.

1) Why does the tiny table bloat during a vacuum? Is it because the
scan of the huge table is run as a transaction, forcing maintenance of
dead versions of the tiny table's one row?

2) Why does the bloat resolve itself? We're not doing any full
vacuums.

We're in the process of upgrading to 8.3.4, so I'd appreciate any
throughs on whether and how this behavior will change with the newer
release.

Jack Orenstein

Re: Table bloat and vacuum

From
Alvaro Herrera
Date:
Jack Orenstein wrote:

> 1) Why does the tiny table bloat during a vacuum? Is it because the
> scan of the huge table is run as a transaction, forcing maintenance of
> dead versions of the tiny table's one row?

Yes.

> 2) Why does the bloat resolve itself? We're not doing any full
> vacuums.

Probably the one live tuple bounces to the first page at some point and
then the rest of the pages are truncated by vacuum.

> We're in the process of upgrading to 8.3.4, so I'd appreciate any
> throughs on whether and how this behavior will change with the newer
> release.

In 8.3, vacuuming the big table will not delay dead tuple removal of the
small table.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Table bloat and vacuum

From
"Scott Marlowe"
Date:
On Wed, Nov 12, 2008 at 8:13 AM, Jack Orenstein <jack.orenstein@hds.com> wrote:
> My application is running on 7.4. We have one huge table that drives
SNIP
> We're in the process of upgrading to 8.3.4, so I'd appreciate any
> throughs on whether and how this behavior will change with the newer
> release.

You will not believe how much faster 8.3 is, and how much easier
maintenance is.  You'll be like a kid in a candy store for months
looking at and using all the new features in it.  The improvements are
enormous.  Biggest difference for you is that 8.3 can do vacuums in a
background method (it sleeps x milliseconds between pages), can run 3
or more threads, and autovacuum daemon is on by default.  For the most
part, your vacuuming issues will no longer exist.

Re: Table bloat and vacuum

From
Adriana Alfonzo
Date:
Please, i wan't recive more mails....

Thanks

Scott Marlowe escribió:
> On Wed, Nov 12, 2008 at 8:13 AM, Jack Orenstein <jack.orenstein@hds.com> wrote:
>
>> My application is running on 7.4. We have one huge table that drives
>>
> SNIP
>
>> We're in the process of upgrading to 8.3.4, so I'd appreciate any
>> throughs on whether and how this behavior will change with the newer
>> release.
>>
>
> You will not believe how much faster 8.3 is, and how much easier
> maintenance is.  You'll be like a kid in a candy store for months
> looking at and using all the new features in it.  The improvements are
> enormous.  Biggest difference for you is that 8.3 can do vacuums in a
> background method (it sleeps x milliseconds between pages), can run 3
> or more threads, and autovacuum daemon is on by default.  For the most
> part, your vacuuming issues will no longer exist.
>
>

Aviso Legal  Este mensaje puede contener informacion de interes solo para CVG Venalum. Solo esta permitida su copia,
distribuciono uso a personas autorizadas. Si recibio este corre por error, por favor destruyalo. Eventualmentew los
correoselectonicos pueden ser alterados. Al respecto, CVG Venalum no se hace responsable por los errores que pudieran
afectaral mensaje original. 

Attachment

Re: Table bloat and vacuum

From
Alvaro Herrera
Date:
Adriana Alfonzo escribió:
> Please, i wan't recive more mails....

Por favor visita esta pagina:
http://www.postgresql.org/mailpref/pgsql-general
y desuscribete tu misma de la lista.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Table bloat and vacuum

From
Jack Orenstein
Date:
Scott Marlowe wrote:
> On Wed, Nov 12, 2008 at 8:13 AM, Jack Orenstein <jack.orenstein@hds.com> wrote:
>> My application is running on 7.4. We have one huge table that drives
> SNIP
>> We're in the process of upgrading to 8.3.4, so I'd appreciate any
>> throughs on whether and how this behavior will change with the newer
>> release.
>
> You will not believe how much faster 8.3 is, and how much easier
> maintenance is.  You'll be like a kid in a candy store for months
> looking at and using all the new features in it.  The improvements are
> enormous.  Biggest difference for you is that 8.3 can do vacuums in a
> background method (it sleeps x milliseconds between pages), can run 3
> or more threads, and autovacuum daemon is on by default.  For the most
> part, your vacuuming issues will no longer exist.

Our 7.4 vacuuming strategy has gotten pretty complex:

- Entire database once a week.

- (Eventually) biggest table in database: Vacuumed/analyzed every 1000 updates
until there are 10,000 rows, to ensure that optimizer does the right thing,
(discussed recently on this mailing list).

- Medium-sized table containing single-row concurrency hotspots. Usually less
than 1M rows: vacuumed every 2000 updates.

- Single-row tables - these are permanent hotspots, updated in every
transaction: vacuumed every 2000 updates.

Can you comment on how I'll be able to simplify this vacuum schedule by relying
on autovacuum? Can you point me at a document describing how autovacuum decides
when to vacuum a table?

I've also had some trouble figuring out which VACUUMs should ANALYZE.
Originally, I had every vacuum also run analyze (except for the tiny-table
vacuums). But I ran into the "tuple concurrently updated" problem (see
http://archives.postgresql.org/pgsql-sql/2005-05/msg00148.php), so I've had to
back off from that. Are concurrent analyzes OK in 8.3?

Jack