Re: 7.3.1 takes long time to vacuum table? - Mailing list pgsql-general

From Mark Cave-Ayland
Subject Re: 7.3.1 takes long time to vacuum table?
Date
Msg-id C1379626F9C09A4C821D6977AA6A54570632A2@webbased8.wb8.webbased.co.uk
Whole thread Raw
In response to 7.3.1 takes long time to vacuum table?  ("Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk>)
Responses Re: 7.3.1 takes long time to vacuum table?  ("Shridhar Daithankar" <shridhar_daithankar@persistent.co.in>)
List pgsql-general
Hi Shridhar,

> -----Original Message-----
> From: Shridhar Daithankar<shridhar_daithankar@persistent.co.in>
> [mailto:shridhar_daithankar@persistent.co.in]
> Sent: 20 February 2003 06:32
> To: PostgreSQL General
> Subject: Re: [GENERAL] 7.3.1 takes long time to vacuum table?
>
> On Wednesday 19 Feb 2003 9:05 pm, you wrote:
> > Interestingly this could be used to create a speedy vacuum - that
is,
> > create a new table with a temporary name that is invisible to the
> > database (similar to dropped columns), then taking into account the
disk
> > space left on the device, pick the last X pages from the old table
and
> > write to the new table. Then truncate the file containing the table
at
> > point X and repeat until finished. Finally kill the old table and
make
> > the new one visible. I appreciate there may be a couple of issues
with
> > oids/foreign keys but it sounds like a great solution to me! Why
does
> > vacuum bother with reordering rows? I thought that was what the
CLUSTER
> > command was for? Any developers care to comment on this? I imagine
there
> > must be a technical reason (prob to do with MVCC) as to why this
hasn't
> > been done before?
>
> Well, One thing I can think of is the extra space required. The algo.
> looks
> good but it would be very difficult to make sure that it works all the
> time
> especially given that postgresql does not have sophisticated and/or
> tunable
> storage handling( think of tablespaces ).

In some ways, extra space isn't a problem as long as you know about it.
On our dev system, we've had several occasions where the disk has filled
and we've had to get in and recover it. When the system was designed, it
was planned to have a couple of 10s of GB spare to store additional
data, but we did not plan to need to have a second copy of our largest
table @ 40Gb a copy! So given that postgres falls over anyway when the
disk is full, I would not see this as a reason to NOT develop additional
functionality which would make use of more disk space as long as users
can be made aware of this need....

> It is always space-time trade-off. On one hand we have vacuum which
uses a
> constant and may be negiliible space but takes time proportional to
amount
> of
> work. On other hand we have drop/recreate table which takes double the
> space
> but is extremely fast i.e. proportinal to data size at max. I/O
bandwidth
> available..
>
> It would be good if there is in between. Of course it would not be
easy to
> do
> it. But it has to start, isn't it?..:-)

The situation here is that to do a vacuum full requires locking this
particular table so our system becomes practically unusable anyway. So
having the process take a day as opposed to 3-4 days has been a great
benefit to us.

> I recommend this strategy of "vacuuming" be documented in standard
> documentation and FAQ. Given that postgresql is routinely deployed for
> databases >10GB which is greater than small/medium by any definition
> today, I
> think this will be a good move.

Yes, that would be very useful if it could documented in which
situations a SELECT INTO would be dramatically more efficient than a
vacuum.

> Furthermore this strategy reduces the down time due to vacuum full
locks
> drastically. I would say it is worth buying a 80GB IDE disk for this
> purpose
> if you have this big database..
>
> Nice to see that my idea helped somebody..:-)

:) Well, thank YOU very much! As I stated in the previous email, I hope
this information goes some way to showing that the facility should be
considered more important as people move towards larger postgresql
databases.


Cheers,

Mark.

P.S. Have just received your other email while writing this, and the
procedure you described is pretty close to what we're doing. However, we
need to manually add back various constraints/default values into the
table columns which is a bit of a pain... wish it could be a little more
automatic.

---

Mark Cave-Ayland
Webbased Ltd.
Tamar Science Park
Derriford
Plymouth
PL6 8BX
England

Tel: +44 (0)1752 764445
Fax: +44 (0)1752 764446


This email and any attachments are confidential to the intended
recipient and may also be privileged. If you are not the intended
recipient please delete it from your system and notify the sender. You
should not copy it or use it for any purpose nor disclose or distribute
its contents to any other person.

pgsql-general by date:

Previous
From: "Shridhar Daithankar"
Date:
Subject: Re: 7.3.1 takes long time to vacuum table?
Next
From: "Philippe Kiener"
Date:
Subject: Changing text encoding rapidly