Thread: Shortening time of vacuum analyze

Shortening time of vacuum analyze

From: Francisco Reyes
Until the 7.2 release is out I am looking for a way to optimize a vacuum
analyze. It currently seems to take about 3 hours, and I have some time
constraints: the 3 hours happen at a time when users may need the system.

My environment:
PostgreSQL 7.1.3 with buffers 6000, sort_mem 32K
FreeBSD 4.4 Stable (2 months old) with 1GB ram.
/etc/sysctl.conf set to:
kern.ipc.shmall=65535
kern.ipc.shmmax=67117056
kern.ipc.shm_use_phys=1

OS on an IDE drive, "/base" directory on a 10K rpm SCSI drive, pg_xlog on a
second 10K rpm SCSI disk.
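
In postgresql.conf that corresponds roughly to (paraphrasing my own
shorthand above; the exact values are what matter, not how I abbreviated
them):

    shared_buffers = 6000      # 6000 x 8 kB pages, about 48 MB
    sort_mem = 32768           # per-sort memory in kB, i.e. 32 MB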

Nightly we delete about 6 million records and then re-merge.
Previously I was doing a truncate, but that was an issue if a user tried to
use the system while we were loading. Now we are having a problem while
the server is running vacuum analyzes.

Does vacuum alone take less time?
Maybe I could do the deletion, load, and vacuum in sequence and then do a
vacuum analyze after hours.
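
I.e. something along these lines (just a sketch; "bigtable" stands in for
the real table names):

    -- right after the nightly delete/load, reclaim dead space only
    VACUUM bigtable;
    -- later, during the off-hours window, refresh planner statistics too
    VACUUM ANALYZE bigtable;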


Re: Shortening time of vacuum analyze

From: Tom Lane
Francisco Reyes <lists@natserv.com> writes:
> Until the 7.2 release is out I am looking for a way to optimize a vacuum
> analyze.

7.2RC2 is going to mutate into 7.2 *real* soon now, probably next week.
My best advice to you is not to wait any longer.

> Nightly we delete about 6 million records and then re-merge.
> Previously I was doing a truncate, but that was an issue if a user tried to
> use the system while we were loading. Now we are having a problem while
> the server is running vacuum analyzes.

> Does vacuum alone take less time?

Yes, but with so many deletes I'm sure that it's the space-compaction
part that's killing you.

The only useful workaround I can think of is to create a new table,
fill it with the data you want, then DROP the old table and ALTER RENAME
the new one into place.  However this will not work if there are other
tables with foreign-key references to the big table.  You also have a
problem if you can't shut off updates to the old table while this is
going on.
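
Roughly like this (a sketch only; the table, column, and index names are
made up, and you would recreate whatever indexes and constraints the real
table carries):

    -- build the replacement containing only the rows you want to keep
    CREATE TABLE bigtable_new AS
        SELECT * FROM bigtable WHERE some_condition;
    -- recreate the indexes on the new table
    CREATE INDEX bigtable_new_cust_idx ON bigtable_new (customer_id);
    -- swap it into place
    DROP TABLE bigtable;
    ALTER TABLE bigtable_new RENAME TO bigtable;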

7.2's lazy VACUUM ought to be perfect for you, though.

            regards, tom lane

Re: Shortening time of vacuum analyze

From: Francisco Reyes
On Wed, 30 Jan 2002, Andrew Sullivan wrote:

> On Wed, Jan 30, 2002 at 11:07:43AM -0500, Francisco Reyes wrote:
>
> > Nightly we delete about 6 million records and then re-merge.
> > Previously I was doing a truncate, but that was an issue if a user tried to
> > use the system while we were loading. Now we are having a problem while
> > the server is running vacuum analyzes.
>
> I'm not sure I understand the problem of "issue if a user tried to
> use the system while we were loading".  I understand that, of course,
> the data is gone when you truncate; but won't it be anyway, if you
> delete?

You can put the delete/load inside a transaction so the users will never
see an empty table. Truncate can not be placed inside a transaction.

We basically do
begin transaction
delete
copy
commit transaction
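
Spelled out a little more (a sketch; the table name, delete predicate, and
file path are placeholders for our real ones):

    BEGIN;
    -- readers keep seeing the old rows until the COMMIT below
    DELETE FROM bigtable WHERE load_date = '2002-01-30';
    -- bulk-load the refreshed data
    COPY bigtable FROM '/path/to/refresh.dat';
    COMMIT;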


Re: Shortening time of vacuum analyze

From: Andrew Sullivan
On Wed, Jan 30, 2002 at 12:23:50PM -0500, Francisco Reyes wrote:

> You can put the delete/load inside a transaction so the users will never
> see an empty table. Truncate can not be placed inside a transaction.

True enough.  But why not leave the table unvacuumed, then, until a
more convenient time?  You are, of course, paying a cost in
performance during that time, but not as great as you would with
vacuum.

A

--
----
Andrew Sullivan                               87 Mowat Avenue
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M6K 3E3
                                         +1 416 646 3304 x110


Re: Shortening time of vacuum analyze

From: Francisco Reyes


On Wed, 30 Jan 2002, Andrew Sullivan wrote:

> On Wed, Jan 30, 2002 at 12:23:50PM -0500, Francisco Reyes wrote:
>
> > You can put the delete/load inside a transaction so the users will never
> > see an empty table. Truncate can not be placed inside a transaction.
>
> True enough.  But why not leave the table unvacuumed, then, until a
> more convenient time?  You are, of course, paying a cost in
> performance during that time, but not as great as you would with
> vacuum.

We do the vacuum right after the load of all the new tables, but before we
do all our daily reports. An operation that takes 30 minutes if all is
vacuumed takes about 2 hours if not vacuumed.

Given that 7.2 is due out shortly, I have come up with some other ways of
improving the time, i.e. delaying the refresh of some data which rarely ever
changes until 7.2 is out.