Thread: vacuuming very large table problem

vacuuming very large table problem

From: if
Date: Feb 15, 2008, 4:56 AM
Hello list!

We use PostgreSQL as the backend to our email gateway, and keep all
emails in the database. We're running Postgres version 7.4.8 (yes, I know it's
old), with a rather specific table schema (the application was designed
that way) -- all emails are split into 2 kB parts and fed into
pg_largeobject. So, long story short, I now have a catch-22 situation
-- the database is using about 0.7 TB and we are running out of space ;-)
I can delete some old stuff, but I cannot run VACUUM FULL to reclaim
disk space (it takes way more than a full weekend), and I also cannot
dump/restore as there's no free space (that would need 2x the database size).
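
A quick way to see where the space is going, straight from the catalogs
(using "maildb" as a stand-in for the real database name; relpages assumes
the default 8 kB block size and is only as fresh as the last VACUUM/ANALYZE):

    # approximate on-disk size of pg_largeobject and its index, in MB
    psql -d maildb -c "SELECT relname, relpages, relpages * 8 / 1024 AS approx_mb
                         FROM pg_class
                        WHERE relname IN ('pg_largeobject',
                                          'pg_largeobject_loid_pn_index');"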

So, with these restrictions applied, I figured that I could somehow
zero out all the old entries in pg_largeobject, or even physically delete
the underlying files, and rebuild all the necessary indexes.

What is the best way to do this?
As I understand it, dd'ing /dev/zero over these files will cause Postgres to
reinitialize the empty blocks, and after that I will still need to
run VACUUM FULL over 0.7 TB, am I right?
And if I delete them and then start the postmaster, there will be lots of
complaining, but will the latest data survive?

How can I delete, for instance, the first 70% of the data reasonably fast?

P.S. Please CC me, as I'm not subscribed yet.
Thanks in advance!

regards,
if

Re: vacuuming very large table problem

From: Decibel!
Date: Feb 22, 2008, 2:30 AM
On Feb 15, 2008, at 4:56 AM, if wrote:
> We use PostgreSQL as the backend to our email gateway, and keep all
> emails in the database. We're running Postgres version 7.4.8 (yes, I know it's
> old), with a rather specific table schema (the application was designed
> that way) -- all emails are split into 2 kB parts and fed into
> pg_largeobject. So, long story short, I now have a catch-22 situation
> -- the database is using about 0.7 TB and we are running out of space ;-)
> I can delete some old stuff, but I cannot run VACUUM FULL to reclaim
> disk space (it takes way more than a full weekend), and I also cannot
> dump/restore as there's no free space (that would need 2x the database size).
>
> So, with these restrictions applied, I figured that I could somehow
> zero out all the old entries in pg_largeobject, or even physically delete
> the underlying files, and rebuild all the necessary indexes.
>
> What is the best way to do this?
> As I understand it, dd'ing /dev/zero over these files will cause Postgres to
> reinitialize the empty blocks, and after that I will still need to
> run VACUUM FULL over 0.7 TB, am I right?
> And if I delete them and then start the postmaster, there will be lots of
> complaining, but will the latest data survive?
>
> How can I delete, for instance, the first 70% of the data reasonably fast?

You're still inserting new email, right? If so, why are you worried
about reclaiming space? Just delete some stuff, let vacuum clean it
up, and make sure that your FSM is big enough (the easiest way to check
that is to run vacuumdb -av and look at the free space map summary at the end).
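
A minimal sketch of that check with the stock 7.4 client tools (the exact
wording of the FSM summary and the right settings depend on your data, so
treat the numbers below as illustrative):

    # vacuum every database, keeping the verbose output for inspection
    vacuumdb -a -v 2>&1 | tee vacuum.log

    # the free space map summary is printed near the end of the run
    grep -i "free space map" vacuum.log

    # if more page slots are needed than configured, raise these in
    # postgresql.conf and restart the postmaster (illustrative values):
    #   max_fsm_pages     = 2000000
    #   max_fsm_relations = 2000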

You'll also want to periodically reindex, especially in 7.4.
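
For the table that matters here, that could be as simple as the line below;
note that REINDEX takes an exclusive lock for the duration, and reindexing a
system catalog like pg_largeobject generally has to be done as a superuser
("maildb" is a placeholder):

    # rebuild the bloated large object index
    psql -d maildb -c "REINDEX TABLE pg_largeobject;"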

And yes, upgrade. At a minimum you need to get to the latest 7.4.x release,
which doesn't require anything special -- minor releases don't need a dump/reload.
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828




Re: vacuuming very large table problem

From: if
Date:
On Fri, Feb 22, 2008 at 2:30 AM, Decibel! <decibel@decibel.org> wrote:
>
> On Feb 15, 2008, at 4:56 AM, if wrote:
> > [...]
>
>  You're still inserting new email, right? If so, why are you worried
>  about reclaiming space? Just delete some stuff, let vacuum clean it
>  up, and make sure that your FSM is big enough (the easiest way to check
>  that is to run vacuumdb -av and look at the free space map summary at the end).
>
>  You'll also want to periodically reindex, especially in 7.4.
>
>  And yes, upgrade. At a minimum you need to get to the latest 7.4.x release,
>  which doesn't require anything special -- minor releases don't need a dump/reload.

Thank you for answering!

The catch-22 was that we had only about 20 GB of free space left (4-5
days until "X" hour, speaking in time), and vacuuming the whole of
pg_largeobject would take more than 5 days even under no other load --
and we have plenty of load during working hours, so we couldn't do that anyway.
Deleting stuff was inefficient too; it was just _too_ slow. I still don't
fully understand the schema, but there are lots of constraints, and a
simple "delete message" action was in fact rescanning the whole
database. The index on pg_largeobject alone was using more than 8
GB of space, so it was one big, unbelievable mess ;-)
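
For what it's worth, the usual culprit for that kind of delete cost is
foreign keys pointing at the message table whose referencing columns have no
index, so every deleted row triggers a sequential scan of the referencing
table. A hedged way to list them ("messages" and "maildb" are placeholders
for whatever the vendor schema actually uses):

    # list foreign keys that reference the messages table
    psql -d maildb -c "SELECT conname, conrelid::regclass AS referencing_table
                         FROM pg_constraint
                        WHERE contype = 'f'
                          AND confrelid = 'messages'::regclass;"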

Also, this is a vendor-patched Postgres instance, and improving anything
without access to the source is well beyond my skills.

Here's how I solved it, in case anyone is interested:
oid2name helped to locate the files on disk; I simply took the binary from
the Debian postgres-7.4 package.
I ran a quick SELECT and found which files I could simply remove (i.e.
relation pg_largeobject resides in 1 GB segment files $pgbase/base/16404.x,
where x is between 1 and 570), so I just deleted the first 400 or so
(having first checked that they held only old data), renamed the ".x"
suffixes of the remaining segments accordingly, and started the postmaster.
Then some reindexing/vacuuming, and voilà ;-)
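
Roughly, the shell side of it looked like the sketch below. The paths, the
16404 file node, and the segment counts are from this particular setup (on a
stock install the segments live under base/<database-oid>/ inside the data
directory), and this is obviously unsupported surgery -- stop the postmaster
and take whatever backup you can before trying it:

    # stop the server before touching anything on disk
    pg_ctl -D "$pgbase" stop

    # confirm which file node pg_largeobject maps to (oid2name is in contrib)
    oid2name -d maildb -t pg_largeobject

    # drop the oldest 1 GB segments -- here segments 1..400 of ~570 --
    # after verifying they hold only old data
    cd "$pgbase"/base
    for i in $(seq 1 400); do rm -f 16404.$i; done

    # renumber the surviving segments so the numbering is contiguous again
    n=1
    for i in $(seq 401 570); do mv 16404.$i 16404.$n; n=$((n+1)); done

    # start the server, then rebuild the index and vacuum what is left
    pg_ctl -D "$pgbase" start
    psql -d maildb -c "REINDEX TABLE pg_largeobject;"
    vacuumdb -d maildb -v -t pg_largeobject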

And yes, upgrading is a class "B" priority now. Comparing 8.3 to 7.4.8
is like comparing Postgres to MySQL, imo ;-)

regards,
if