Re: Massive delete from a live production DB - Mailing list pgsql-general

From Bill Moran
Subject Re: Massive delete from a live production DB
Date
Msg-id 20110512112423.8d132b1a.wmoran@potentialtech.com
In response to Massive delete from a live production DB  (Phoenix Kiula <phoenix.kiula@gmail.com>)
List pgsql-general
In response to Phoenix Kiula <phoenix.kiula@gmail.com>:

> Hi
>
> Been reading some old threads (pre-9.x) and it seems that the
> consensus is to avoid doing massive deletes from a table, as they
> leave so much dead space behind that a VACUUM FULL would be needed.
>
> Instead, we might as well do a dump/restore. Faster, cleaner.
>
> This is all well and good, but what about a situation where the
> database is in production and cannot be brought down for this
> operation, or even for a CLUSTER?
>
> Any ideas on what I could do without losing all the live updates? I
> need to get rid of about 11% of a 150-million-row table, with each
> row being roughly 1 to 5 KB in size...

Have you considered the following process:

1) SELECT the rows you want to keep into a new table (time-consuming)
2) Start outage
3) Pull over any new rows that were added between steps 1 and 2
4) Drop the old table
5) Rename the new table to the old name
6) Any other steps required to make the new table exactly like
   the old one (e.g. foreign keys, indexes, serials, etc.)
7) End outage window

Because steps 3 - 6 are very fast, your outage window is very short.
Not a perfect, zero-downtime solution, but possibly helpful.
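
For concreteness, a minimal SQL sketch of that process, assuming a
hypothetical table "mytable" with a monotonically increasing integer
primary key "id" and a hypothetical "status" column that marks the
rows to discard; the real table, predicate, and step-6 details will
differ:

-- Step 1: copy the rows to keep (slow, but the table stays live).
-- Note: CREATE TABLE ... AS copies no indexes, constraints, or defaults.
CREATE TABLE mytable_new AS
    SELECT * FROM mytable WHERE status <> 'dead';

-- Steps 2-7: the short outage window, all in one transaction
-- (PostgreSQL DDL is transactional, so a failure rolls everything back).
BEGIN;

-- Step 3: pull over rows inserted while step 1 was running
-- (assumes "id" only ever increases).
INSERT INTO mytable_new
    SELECT * FROM mytable
    WHERE id > (SELECT max(id) FROM mytable_new)
      AND status <> 'dead';

-- Steps 4-5: swap the tables.
DROP TABLE mytable;  -- add CASCADE if other objects depend on it
ALTER TABLE mytable_new RENAME TO mytable;

-- Step 6: rebuild whatever the old table had: primary key, other
-- indexes, foreign keys, sequence defaults, triggers, grants, etc.
ALTER TABLE mytable ADD PRIMARY KEY (id);

COMMIT;

Note that step 3 as written only captures newly inserted rows; any
updates or deletes applied to pre-existing rows while step 1 runs are
lost, which is part of why this isn't a perfect solution.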

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
