
From: Mark Cave-Ayland
Subject: Re: 7.3.1 takes long time to vacuum table?
Msg-id: C1379626F9C09A4C821D6977AA6A54570632A5@webbased8.wb8.webbased.co.uk
In response to: 7.3.1 takes long time to vacuum table? ("Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk>)
List: pgsql-general
Yes! That sounds like a good idea that could be used to implement
'live' table vacuuming...

Would it be possible to partition each Postgres table across two files?
You define a stripe size, say 1000, which is the number of pages
written to one file before switching to the other. I guess this would
be chosen so as not to penalise sequential table scans too much. So the
first 1000 pages would get written to file A, the next 1000 to file B,
the next 1000 to file A again... and so on.
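
To make the mapping concrete, here is a minimal C sketch of how a
logical page number might map to one of the two files. The stripe size
and the two-file layout are just the assumptions of this idea, not
anything in the real storage manager:

#include <stdio.h>

#define STRIPE_SIZE 1000        /* pages written to one file before switching */

typedef struct
{
    int  file;                  /* 0 = file A, 1 = file B */
    long offset;                /* page offset within that file */
} PageLocation;

static PageLocation
locate_page(long logical_page)
{
    long stripe = logical_page / STRIPE_SIZE;   /* which 1000-page stripe */
    PageLocation loc;

    loc.file = (int) (stripe % 2);              /* stripes alternate A,B,A,B,... */
    /* every second stripe lands in the same file, so halve the stripe index */
    loc.offset = (stripe / 2) * STRIPE_SIZE + logical_page % STRIPE_SIZE;
    return loc;
}

int
main(void)
{
    /* pages 0-999 -> file A, 1000-1999 -> file B, 2000-2999 -> file A, ... */
    long pages[] = {0, 999, 1000, 2500};

    for (int i = 0; i < 4; i++)
    {
        PageLocation loc = locate_page(pages[i]);

        printf("logical page %ld -> file %c, page %ld\n",
               pages[i], loc.file ? 'B' : 'A', loc.offset);
    }
    return 0;
}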

Any queries against the table must search both files to return the
required rows. The immediate downside is that, I guess, indexes would
have to be made aware that the data is stored across two different
files... but let's carry on for a moment...

So now, if you want to vacuum the whole table, you first lock file A
and begin vacuuming it using an appropriate method. While this is
running, if someone tries to delete a row from file A, the row is
simply marked as deleted and moved to the end of the file during the
vacuum; a deletion from file B is marked as deleted as normal. If
someone does an insert or an update on any row in the table, the
changed rows are written to file B while file A is locked, and vice
versa.
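
Something like this hypothetical routing rule is what I have in mind
for writes while a vacuum is running (the names and the bare-bones
structure are purely illustrative, not real backend code):

#include <stdio.h>

typedef enum { FILE_A = 0, FILE_B = 1 } TableFile;

static TableFile vacuum_target  = FILE_A;   /* file currently being vacuumed */
static int       vacuum_running = 1;

/*
 * Route a new row version: if its natural home is the file being
 * vacuumed, redirect it to the other file instead.
 */
static TableFile
choose_write_file(TableFile preferred)
{
    if (vacuum_running && preferred == vacuum_target)
        return (preferred == FILE_A) ? FILE_B : FILE_A;
    return preferred;
}

int
main(void)
{
    /* with file A locked for vacuum, writes aimed at A land in file B */
    printf("insert aimed at A lands in %c\n",
           choose_write_file(FILE_A) == FILE_A ? 'A' : 'B');
    printf("insert aimed at B lands in %c\n",
           choose_write_file(FILE_B) == FILE_A ? 'A' : 'B');
    return 0;
}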

Assuming that each file contains roughly the same number of rows, any
new updates/inserts to the table should be distributed evenly across
both files, since the vacuum time of each file should be roughly equal.
(If this is not the case, it may be necessary to perform some form of
'balancing' by taking some rows from the larger file and moving them to
the smaller one... I guess some details are missing here.)
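
Purely to illustrate, that balancing pass might look something like the
rough sketch below - the threshold and the row-moving step are complete
guesses on my part:

#include <stdio.h>
#include <stdlib.h>

#define BALANCE_THRESHOLD 2000   /* max tolerated page-count gap (assumed) */

/* Stub standing in for the hypothetical row-moving machinery. */
static void
move_pages(char from, char to, long n)
{
    printf("move %ld pages from file %c to file %c\n", n, from, to);
}

static void
rebalance(long pages_a, long pages_b)
{
    long diff = labs(pages_a - pages_b);

    if (diff > BALANCE_THRESHOLD)
    {
        long to_move = diff / 2;             /* split the difference */

        if (pages_a > pages_b)
            move_pages('A', 'B', to_move);
        else
            move_pages('B', 'A', to_move);
    }
}

int
main(void)
{
    rebalance(10000, 4000);   /* 6000-page gap -> move 3000 pages A to B */
    return 0;
}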

Another downside of this would be that the last <1000 pages of each
file wouldn't be vacuumed - but would people really mind if a table
wasn't fully vacuumed? I don't think they would. While there are
probably several flaws in the example I gave above, I think Jean-Luc's
idea of bringing partitioning into this thread could be used to
eliminate many of the problems of vacuuming that currently exist...


Food for thought,


Mark.

---

Mark Cave-Ayland
Webbased Ltd.
Tamar Science Park
Derriford
Plymouth
PL6 8BX
England

Tel: +44 (0)1752 764445
Fax: +44 (0)1752 764446


> -----Original Message-----
> From: Jean-Luc Lachance [mailto:jllachan@nsd.ca]
> Sent: 20 February 2003 16:43
> To: Shridhar Daithankar <shridhar_daithankar@persistent.co.in>
> Cc: PostgreSQL General
> Subject: Re: [GENERAL] 7.3.1 takes long time to vacuum table?
>
> Well, here is another case where partitioning would be useful.
>
> Let's all agree that vacuuming a small table is better done by
> copying to a new one.
> Now, if a larger table were partitioned, it would allow vacuuming
> one partition at a time.
>
>
> JLL
>
> P.S. Is there really a need to reorder the vacuumed table???

