Re: [HACKERS] vacuum slowness - Mailing list pgsql-hackers

From Wayne Piekarski
Subject Re: [HACKERS] vacuum slowness
Date
Msg-id 199903180446.PAA06333@gateway.senet.com.au
Whole thread Raw
Responses Re: [HACKERS] vacuum slowness
List pgsql-hackers
Tom Lane writes:
> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> > I just deleted all 50,000 rows from a table that has one int4 and one
text
> > field.
>
> > Why does vacuum take so long?  If all the rows are superceeded, so no
> > rows actually have to be moved, should it take so long for vacuum to
> > run?
>
> Do you have any indexes on the table?  I've noticed (and complained in
> the past ;-)) that vacuuming a table takes unreasonably long if there
> are a lot of dead index entries to be cleaned.  It seems faster to drop
> and recreate the index in a case like that.

Hi everyone,

I am working on a large project right now which involves the use of a
table that has thousands of inserts and updates performed each day. (At
the end of the day, about 20000 inserts have occured, and each inserted
row gets modified 2 or 3 times) Vacuums take absolutely ages and
unfortunately the system must run continuously 24 hours per day so I can't
afford to have the table locked for ages while it is being vacuumed.

I've played around with vacuum quite a bit, and I've found that if I do
one huge vacuum every so often, it takes longer than if I do lots of
vacuum's during the day, this way the tables are kept more 'compacted' and
there is less moving around of data required, and so it runs a bit faster.

As the number of days of new data stored increases, the size of the tables
grows to the point where a vacuum can take 10 minutes or so, and this is
unacceptable considering it occurs in a few seconds without indexes. To
get around this, once every day, I grab entries which are in the active
table that are older than two days, and move them into an archive table
which never changes. This way, I can keep the active table small and do
vacuums within a minute or so, allowing me to keep my software from
waiting too long. I'd really like to avoid doing this though, because it
causes complications - lately I've found that vacuuming is becoming a
major hassle which I'd rather not have to do at all :)

What I was wanting to know if there was a way of temporarily disabling
indexes while the vacuum is occuring, and then update it all in one hit
once the update is completely finished. This would be equivalent to
dropping and recreating them, but I don't want to do that in case
something dies during the vacuum and my tables are left without indexes on
them.

Or perhaps telling Postgres to do a partial vacuum, with a time limit set
to say 20 seconds and it will do it in stages over the period of a day.
This way the database can still run and we can keep the dbms cleaned. From
what I understand, the new MVCC support in 6.5 will be able to do vacuum's
in the background, or is this for the future?

Also, I had a look at the src/commands/vacuum.c code, and had a bit of a
read through it. One thing I wasn't sure about is the method it uses to
move the rows around while it is doing the index. Lets say that we have
100 rows, and the first one is deleted and so is empty. Does every single
row get moved back one, or does only one row get moved to fill in the
empty gap?

Is the vacuum code moving tons of rows around the table, causing the
indexes to be updated lots of times and slowing things down?


If someone could give me some hints about how to best handle my tables to
get good vacuum times I would really appreciate it.


btw, keep up the good work everyone, I've been following this mailing list
and developing with Postgres since the days of pre-6.0 and I'm very
impressed with all the great improvements that have been made to Postgres
over the years!

Thanks,
Wayne

------------------------------------------------------------------------------
Wayne Piekarski                               Tel:     (08) 8221 5221
Research & Development Manager                Fax:     (08) 8221 5220
SE Network Access Pty Ltd                     Mob:     0407 395 889
222 Grote Street                              Email:   wayne@senet.com.au
Adelaide SA 5000                              WWW:     http://www.senet.com.au


pgsql-hackers by date:

Previous
From: "D'Arcy" "J.M." Cain
Date:
Subject: Re: [HACKERS] Sequences....
Next
From: Vadim Mikheev
Date:
Subject: Re: [HACKERS] Re: vacuum slowness