Re: vacuuming very large table problem - Mailing list pgsql-admin

From if
Subject Re: vacuuming very large table problem
Date
Msg-id f37dba8b0802212357r8c311eby1aa780d42eb2ff1a@mail.gmail.com
In response to Re: vaccuming very large table problem  (Decibel! <decibel@decibel.org>)
List pgsql-admin
On Fri, Feb 22, 2008 at 2:30 AM, Decibel! <decibel@decibel.org> wrote:
>
> On Feb 15, 2008, at 4:56 AM, if wrote:
>  > We use PostgreSQL as a backend to our email gateway, and keep all
>  > emails in the database. We are running version 7.4.8 (yes, I know
>  > it's old) and a rather peculiar table schema (the application was
>  > designed that way) -- all emails are split into 2kB parts and fed
>  > into
>  > pg_largeobject. So, long story short, I now have a catch-22
>  > situation -- the database uses about 0.7TB and we are running out
>  > of space ;-)
>  > I can delete some old stuff, but I cannot run a full vacuum to
>  > reclaim the disk space (it takes way more than a full weekend), and
>  > I also cannot dump/restore, as there's no free space for it (that
>  > needs 2x the database size).
>  >
>  > So, with these restrictions in place, I figured that I could
>  > somehow zero out all the old entries in pg_largeobject, or even
>  > physically delete the underlying files, and then rebuild all the
>  > necessary indexes.
>  >
>  > What is the best way to do this?
>  > IMO, dd'ing /dev/zero over these files will make Postgres
>  > reinitialize the emptied blocks, but after that I will still need
>  > to run a full vacuum over 0.7TB, am I right?
>  > And if I just delete them and then start the postmaster, there will
>  > be lots of complaining, but will the latest data survive?
>  >
>  > How can I delete, for instance, the first 70% of the data
>  > reasonably fast?
>
>  You're still inserting new email, right? If so, why are you worried
>  about reclaiming space? Just delete some stuff, let vacuum clean it
>  up, and make sure that your FSM is big enough (easiest way to do that
>  is to run vacuumdb -av).
>
>  You'll also want to periodically reindex, especially in 7.4.
>
>  And yes, upgrade. At a minimum you need to get to the latest 7.4,
>  which doesn't require anything special.
>  --
>  Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
>  Give your computer some brain candy! www.distributed.net Team #1828

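(For anyone reading this in the archives: on a 7.4-era install, the
FSM check Decibel! describes boils down to roughly the following; the
numbers shown are only illustrative.)

    # verbose vacuum of everything; the last lines of output report how
    # many pages and relations the free space map would need to track
    vacuumdb -a -v

    # compare that against the running settings (these GUCs exist up to 8.3)
    psql -c "SHOW max_fsm_pages"
    psql -c "SHOW max_fsm_relations"

    # if vacuum reports more page slots than max_fsm_pages, raise the
    # settings in postgresql.conf and restart, e.g.:
    #   max_fsm_pages = 2000000
    #   max_fsm_relations = 10000
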
Thank you for answering!

The catch-22 was that we still had about 20GB of free space (and, in
terms of time, 4-5 days left until the "X" hour), while vacuuming the
whole of pg_largeobject would have taken more than 5 days even with no
other load -- and we have plenty of load during working hours, so we
couldn't do that anyway. Deleting stuff was inefficient too; it was
just _too_ slow. I still don't fully understand the schema, but there
are lots of constraints, and a simple "delete message" action was in
fact rescanning the whole database. The index on pg_largeobject alone
was using more than 8 gigs of space, so the whole thing was one big,
unbelievable mess ;-)
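
In hindsight, the usual cure for that kind of slow DELETE is an index
on every referencing foreign-key column, plus contrib/vacuumlo for
large objects nothing points to any more. A sketch -- the database,
table, and column names here are made up:

    # FK checks during a DELETE seq-scan any referencing table whose FK
    # column has no index; indexing that column is usually the fix
    psql -d mailgw -c "CREATE INDEX message_parts_msg_id_idx
                           ON message_parts (message_id);"

    # contrib/vacuumlo drops large objects that no row references anymore
    vacuumlo -v mailgw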

Also, this is a vendor-patched Postgres instance, and improving
anything in it without access to the source is beyond my skills.

Here's how I solved it, in case anyone is interested:
oid2name helped me locate the files on disk; I simply grabbed the
binary from the Debian postgres-7.4 package.
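
Roughly like this (the database name and the OID below are
illustrative):

    # oid2name (from contrib) maps database/relation names to the file
    # names used under $PGDATA
    oid2name                                # list databases and their OIDs
    oid2name -d mailgw -t pg_largeobject    # filenode of the table
    # -> e.g. 16404, so the data lives in 1GB segments named
    #    16404, 16404.1, 16404.2, ...
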
I did a quick select and found which files I could simply remove (the
relation pg_largeobject resides in 1GB segment files
$pgbase/base/16404.x, where x runs from 1 to 570), so I just deleted
the first 400 or so (having previously checked that they held only old
data), renamed the remaining ".x" suffixes accordingly, and started
the postmaster. Then some reindexing/vacuuming, and voila ;-)
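
For the archives, the whole dance looked roughly like this -- do NOT
copy it blindly: the segment numbers are illustrative, you want the
postmaster stopped and backups of everything you touch, and the exact
reindex/vacuum incantation depends on your version:

    # stop the database before touching any files
    pg_ctl -D $PGDATA stop

    cd $pgbase/base                # path as given above; a stock install
                                   # has a per-database OID directory here
    rm 16404.1 16404.2             # ... up through 16404.400 or so,
                                   # segments known to hold only old data
    mv 16404.401 16404.1           # renumber the survivors so the
    mv 16404.402 16404.2           # segment sequence stays contiguous
    # ...and so on for the remaining files

    pg_ctl -D $PGDATA start

    # the indexes now point at the wrong blocks, so rebuild them and
    # let vacuum tidy up the rest
    psql -c "REINDEX TABLE pg_largeobject;"
    vacuumdb -v -t pg_largeobject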

And yes, upgrading is a class "B" priority now. Comparing 8.3 to
7.4.8 is like comparing Postgres to MySQL, IMO ;-)

regards,
if
