Thread: autovacuum for large periodic deletes

autovacuum for large periodic deletes

From
"Sriram Dandapani"
Date:

Hi

 

Every night, a delete of about 50 million rows occurs on a table. Do we need to explicitly vacuum the table, or will autovacuum handle the large deletes without affecting performance? I am trying to determine whether I should rely completely on autovacuum or also tinker with manual vacuums.

 

Sriram

Re: autovacuum for large periodic deletes

From
Alvaro Herrera
Date:
Sriram Dandapani wrote:

> Every night, a delete of about 50 million rows occurs on a table. Do we
> need to explicitly vacuum the table or will autovacuum handle the large
> deletes without affecting performance. I am trying to determine whether
> or not I should completely rely on autovacuum or also tinker with manual
> vacuums.

How many rows stay in the table after the massive deletion?  It may be
that it's best for you to issue a CLUSTER or VACUUM FULL instead of
VACUUM.  (In any case, autovacuum would vacuum the table eventually, but
maybe you can do better.)
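
For illustration only, with a hypothetical table and index name, the two options would look roughly like this (both take an exclusive lock on the table while they run):

    -- compact the table in place, reclaiming all dead space
    VACUUM FULL big_table;
    -- or rewrite the table in index order (modern syntax shown;
    -- the 8.1-era form was: CLUSTER big_table_pkey ON big_table)
    CLUSTER big_table USING big_table_pkey;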

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: autovacuum for large periodic deletes

From
"Sriram Dandapani"
Date:
About 5-10 million rows stay after deletion. There are a few other
tables where the daily deletion totals about 3-6 million.

Would a VACUUM FULL or CLUSTER affect other operations? These tables have
a 24x7 high data insertion rate.

-----Original Message-----
From: Alvaro Herrera [mailto:alvherre@commandprompt.com]
Sent: Wednesday, May 17, 2006 1:36 PM
To: Sriram Dandapani
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] autovacuum for large periodic deletes

Sriram Dandapani wrote:

> Every night, a delete of about 50 million rows occurs on a table. Do
> we need to explicitly vacuum the table or will autovacuum handle the
> large deletes without affecting performance. I am trying to determine
> whether or not I should completely rely on autovacuum or also tinker
> with manual vacuums.

How many rows stay in the table after the massive deletion?  It may be
that it's best for you to issue a CLUSTER or VACUUM FULL instead of
VACUUM.  (In any case, autovacuum would vacuum the table eventually, but
maybe you can do better.)

--
Alvaro Herrera
http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: autovacuum for large periodic deletes

From
Chris Browne
Date:
sdandapani@counterpane.com ("Sriram Dandapani") writes:
> Every night, a delete of about 50 million rows occurs on a table. Do
> we need to explicitly vacuum the table or will autovacuum handle the
> large deletes without affecting performance. I am trying to
> determine whether or not I should completely rely on autovacuum or
> also tinker with manual vacuums.

I'd be inclined to submit a vacuum on the table as part of the
process.
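
A minimal sketch of that approach, assuming a hypothetical event_log table
and a nightly job driving psql:

    -- nightly maintenance job: bulk delete followed by an explicit vacuum
    DELETE FROM event_log WHERE logged_at < now() - interval '1 day';
    VACUUM ANALYZE event_log;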

Moreover, I'd be inclined to consider this situation something of a
design bug.  Deleting enormous amounts of data from a table on a regular
basis is not something I would expect to work out well.
--
(reverse (concatenate 'string "moc.enworbbc" "@" "enworbbc"))
http://cbbrowne.com/info/postgresql.html
We  are MICROS~1.   You will  be assimilated.   Resistance  is futile.
(Attributed to B.G., Gill Bates)

Re: autovacuum for large periodic deletes

From
"Jim C. Nasby"
Date:
On Wed, May 17, 2006 at 01:39:35PM -0700, Sriram Dandapani wrote:
> About 5-10 million rows stay after deletion. There are a few other
> tables where the daily deletion totals about 3-6 million.

The default autovac settings will only vacuum a table after 40% of the
rows are dead. That seems pretty high to me, so I regularly recommend
cutting all the thresholds and scale factors in half.
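
For illustration, the 8.1-era defaults cut in half would look roughly like
this in postgresql.conf (later releases ship different, lower defaults):

    autovacuum = on
    autovacuum_vacuum_threshold = 500         # default 1000
    autovacuum_vacuum_scale_factor = 0.2      # default 0.4
    autovacuum_analyze_threshold = 250        # default 500
    autovacuum_analyze_scale_factor = 0.1     # default 0.2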

> Would a vacuum full/cluster affect other operations. These tables have a
> 24x7 high data insertion rate.

VACUUM FULL and CLUSTER both acquire exclusive locks.

BTW, it sounds like table partitioning might be useful for you. It would
allow you to set up a partition for each day and then drop one day's
worth of data very quickly.
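
As a rough sketch of the inheritance-based partitioning available at the
time (table and column names are hypothetical):

    -- parent table plus one child table per day
    CREATE TABLE events (logged_at timestamptz NOT NULL, payload text);
    CREATE TABLE events_2006_05_17 (
        CHECK (logged_at >= '2006-05-17' AND logged_at < '2006-05-18')
    ) INHERITS (events);
    -- purging a day is then metadata-only instead of a 50-million-row DELETE
    DROP TABLE events_2006_05_16;
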
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: autovacuum for large periodic deletes

From
Chris Browne
Date:
sdandapani@counterpane.com ("Sriram Dandapani") writes:
> About 5-10 million rows stay after deletion. There are a few other
> tables where the daily deletion totals about 3-6 million.

It would appear there is something fairly not-sane about the process,
then.  You delete about 90% of the day's data from the table each day?
That's a *lot*, and you are quite likely to have trouble with this
table blowing out the Free Space Map as a result.

> Would a vacuum full/cluster affect other operations. These tables
> have a 24x7 high data insertion rate.

Yes, VACUUM FULL and CLUSTER would block other operations while they
run.

The real Right Answer probably involves having data flow into some sort
of "queue" table, created fresh each day for that day's activities; at
the end of the day, all of the data either gets purged or moved to the
"final destination" table, so that a new table can be created the next
day.
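
A minimal sketch of that pattern, with hypothetical table names (the
keep_row flag is purely illustrative):

    -- today's inserts land in a fresh queue table
    CREATE TABLE events_today (LIKE events_archive INCLUDING DEFAULTS);
    -- end of day: move the rows that should survive, discard the rest wholesale
    INSERT INTO events_archive SELECT * FROM events_today WHERE keep_row;
    DROP TABLE events_today;  -- no mass DELETE, nothing for vacuum to clean up
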
--
"cbbrowne","@","cbbrowne.com"
http://cbbrowne.com/info/unix.html
CBS News report on Fort Worth tornado damage:
"Eight major downtown buildings were severely damaged and 1,000 homes
were damaged, with 95 uninhabitable.  Gov. George W. Bush declared
Tarrant County a disaster area.  Federal Emergency Management Agency
workers are expected to arrive sometime next week after required
paperwork is completed."

Re: autovacuum for large periodic deletes

From
"Sriram Dandapani"
Date:
Looks like partitioning may be the way to go. I have been hesitant about
using partitioning because the feature is very new. How do I manage the
free space map if I do not use partitioning?

Thanks for your inputs

Sriram

-----Original Message-----
From: pgsql-admin-owner@postgresql.org
[mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Chris Browne
Sent: Wednesday, May 17, 2006 2:48 PM
To: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] autovacuum for large periodic deletes

sdandapani@counterpane.com ("Sriram Dandapani") writes:
> About 5-10 million rows stay after deletion. There are a few other
> tables where the daily deletion totals about 3-6 million.

It would appear there is something fairly not-sane about the process,
then.  You delete about 90% of the day's data from the table each day?
That's a *lot*, and you are quite likely to have trouble with this
table blowing out the Free Space Map as a result.

> Would a vacuum full/cluster affect other operations. These tables
> have a 24x7 high data insertion rate.

Yes, VACUUM FULL and CLUSTER would block other operations while they
run.

The real Right Answer probably involves having data flow into some
sort of "queue" table, created fresh each day, for that day's
activities, where, at the end of the day, all of the data either gets
purged or moved to the "final destination" table, so that a new table
can be created, the next day.
--
"cbbrowne","@","cbbrowne.com"
http://cbbrowne.com/info/unix.html
CBS News report on Fort Worth tornado damage:
"Eight major downtown buildings were severely damaged and 1,000 homes
were damaged, with 95 uninhabitable.  Gov. George W. Bush declared
Tarrant County a disaster area.  Federal Emergency Management Agency
workers are expected to arrive sometime next week after required
paperwork is completed."


Re: autovacuum for large periodic deletes

From
"Jim C. Nasby"
Date:
On Wed, May 17, 2006 at 03:18:45PM -0700, Sriram Dandapani wrote:
> Looks like partitioning may seem the way to go. I have been hesitant
> about using partitioning as the feature is very new. How do I manage the
> free space map if I do not use partitioning?

Make it very large. Or read the detailed description at
http://lnk.nu/pervasivepostgres.com/9hg.aspx
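
For reference, the knobs in question are the pre-8.4 free space map
settings in postgresql.conf; the values below are only illustrative and
would need to cover the pages freed by the nightly deletes across the
whole cluster:

    max_fsm_pages = 2000000
    max_fsm_relations = 1000
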
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461