Pre-allocation of space: a business rationale - Mailing list pgsql-admin

From Bath, David
Subject Pre-allocation of space: a business rationale
Date
Msg-id 200511031427.10391.dave.bath@unix.net
Whole thread Raw
In response to Re: Vacuum Verbose output  (Robert Treat <xzilla@users.sourceforge.net>)
Responses Re: Pre-allocation of space: a business rationale
List pgsql-admin
There has been a discussion about allowing pre-allocation of space,
and want to highlight the issues from a business perspective, rather
than those affectings DBAs and OS admins.

On Thu, 03 Nov 2005 06:00, Robert Treat wrote:
> On Monday 31 October 2005 22:59, Tom Lane wrote:
> > Scott Marlowe <smarlowe@g2switchworks.com> writes:
> > > On Mon, 2005-10-31 at 16:34, Tomeh, Husam wrote:
> > >> Pre-allocating space will prevent extending the datafile during
> > >> loading massive data (batch processing) and would improve the overall
> > >> batch write performance.

My real concern is not performance, but the impact on business continuity
in the following scenario (although I can imagine others of similar import).
(Apologies if there is a well-known way of addressing this issue).

A) I have a tablespace holding data for important business processes.
   (Lets term this "24x7data")
B) I allow a user to have their own data, perhaps some non-critical summaries
   and/or scratchpad/development work.  (Lets term this "luser data")
C) I want to avoid the possibility of uncontrolled growth of luser data
   blowing disk leading to stoppage of 24x7 data.

As I see it, unless I am able to partition them on different disks, which
might be difficult if I've created one logical disk across a cluster, it
is possible that uncontrolled growth of luser data (they might have an
endless loop in buggy code, or just attempt to pig out on data space)
might chew up all available storage, leading to 24x7 processes crashing
when they need to insert more data.

This could happen even if luser and 24x7 stuff live in different databases.

One theoretical way of avoiding this is to allow a "maximum size" argument
to the CREATE TABLESPACE statement, with a preference for preallocation.
This is probably easier to implement than a quota system on either an
object-owner (e.g. CREATE USER) or object  (e.g. CREATE TABLE/INDEX)
basis.

I can't see such parameters for v8.0 statements of these types.

While it might be fine to monitor disk usage, even if you read reports
every day and can get the $ for a new disk quickly, a lazy programmer
or piggy user can blow away your disk in a blink of an eye.

(Aside: I once needed to get error reports from telecoms bearers into
postgresql's uncle (ingres5) and could get about 10,000 errors a
second coming through msgs and sems.  Even without an endless loop bug,
and with creation of an expandable circular buffer for non-volatile
caching, I STILL had to worry about crashing the telecoms monitoring
processes if I pigged out on disk.)

Note that I'm not suggesting the cloning of any particular Oracle feature,
I do believe it is important to have a well-known way of addressing
such scenarios.

--
David T. Bath
dave.bath@unix.net


pgsql-admin by date:

Previous
From: "Kevin Grittner"
Date:
Subject: Re: how do you automate database backups?
Next
From: Tom Lane
Date:
Subject: Re: Pre-allocation of space: a business rationale