Re: 7.3.1 takes long time to vacuum table? - Mailing list pgsql-general

From Mark Cave-Ayland
Subject Re: 7.3.1 takes long time to vacuum table?
Date
Msg-id C1379626F9C09A4C821D6977AA6A545706329F@webbased8.wb8.webbased.co.uk
Whole thread Raw
In response to 7.3.1 takes long time to vacuum table?  ("Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk>)
Responses Re: 7.3.1 takes long time to vacuum table?  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: 7.3.1 takes long time to vacuum table?  ("Shridhar Daithankar<shridhar_daithankar@persistent.co.in>" <shridhar_daithankar@persistent.co.in>)
List pgsql-general
Hi Martijn, Shridhar,

> > Agreed! If it ruins any caching then in my view it's something that
has
> > to change in order to keep performance. While there may be a penalty
to
> > pay on smaller tables, the benefits of caching would more than make
up
> > for the cost of going forwards - imagine how slow CPUs would be if
> > everything was a cache miss....
>
> I'd like to get one of the developers views on this.

Just looking at Shridhar's email, and Martin's earlier comment about
SELECT * INTO newtable FROM oldtable, I'm rapidly seeing that this would
be the better way go. Because instead of seeking around millions of
pages with the vacuum, it would be less work to do this because then it
is just a sequential read and a sequential write.

Interestingly this could be used to create a speedy vacuum - that is,
create a new table with a temporary name that is invisible to the
database (similar to dropped columns), then taking into account the disk
space left on the device, pick the last X pages from the old table and
write to the new table. Then truncate the file containing the table at
point X and repeat until finished. Finally kill the old table and make
the new one visible. I appreciate there may be a couple of issues with
oids/foreign keys but it sounds like a great solution to me! Why does
vacuum bother with reordering rows? I thought that was what the CLUSTER
command was for? Any developers care to comment on this? I imagine there
must be a technical reason (prob to do with MVCC) as to why this hasn't
been done before?

In fact, my colleague has just done a test with SELECT..INTO on our dev
version and it compacted 600Mb -> 400Mb in just 40s(!). We then did a
vacuum full on the same original 600Mb table which is still going after
20mins. Difficult choice! So even in a worse case scenario we could have
a fully vacuumed table within a day.... we're looking at dropping some
indexes in the db to reclaim enough space to be able to fit another copy
of the table on the disk... this is looking very tempting at the
moment....


Cheers,

Mark.

---

Mark Cave-Ayland
Webbased Ltd.
Tamar Science Park
Derriford
Plymouth
PL6 8BX
England

Tel: +44 (0)1752 764445
Fax: +44 (0)1752 764446


This email and any attachments are confidential to the intended
recipient and may also be privileged. If you are not the intended
recipient please delete it from your system and notify the sender. You
should not copy it or use it for any purpose nor disclose or distribute
its contents to any other person.

pgsql-general by date:

Previous
From: Bruno Wolff III
Date:
Subject: Re: SQL query...
Next
From: "Robert Echlin"
Date:
Subject: Re: TIMESTAMP WITH( OUT)? TIME ZONE indexing/type choice...