Re: How to cope with low disk space - Mailing list pgsql-general

From Michiel Holtkamp
Subject Re: How to cope with low disk space
Date
Msg-id 47B46D8F.5030707@soundintel.com
Whole thread Raw
In response to Re: How to cope with low disk space  (Bill Moran <wmoran@potentialtech.com>)
Responses Re: How to cope with low disk space
Re: How to cope with low disk space
List pgsql-general
First off, thank you all for such a quick response. I will reply on
several emails at the same time, because the answers overlap.

Bill Moran wrote:
> In response to Michiel Holtkamp <michiel.holtkamp@soundintel.com>:
>
> Since you don't give any idea how much data is involved, let me iterate
> through your choices, given the unknowns:
>
> 1) Buy more disk space.  It's cheap.  Get an external SCSI unit and a
>    SCSI PCI card.

Also suggested by Vincent. Normally this would be an option, but it
doesn't really solve the problem. I didn't explain this properly enough,
but the low disk space problem is an exception to the rule. Normally we
have (more than) enough disk space to store the information, but
sometimes (due to hardware problems or on purpose) the database is
filled with much more data.

To give you an idea of the figures we are talking about: Say we have a
250 GB disk. Normally we would use about 4-8 GB of database. Sometimes
we want to have more information for research, so we lower the threshold
that trigger recording. Because we don't know in advance how much data
this will give us, the disk can run low on disk space. If it does run
out of disk space, that doesn't really matter, because we will not use
that data anyway (we will use only a small portion), as long as it
doesn't completely run out of disk space.

Running low on disk space can also happen when something goes wrong with
the recording trigger (this can be a hardware fault).

In both cases, we don't want to store even more data, but we will want
to make a selection based on age (newer data is more important).

Hopefully, I've explained this better now :)

> 2) Work on your vacuum schedule.  You don't need to vacuum full all the
>    time, and regular vacuum doesn't lock tables.  If you do regular
>    vacuum often enough, you won't see significant bloat.

Ok, this is good. Regular vacuum doesn't lock tables, as you've said and
this is what we want. However, vacuuming sometimes isn't enough. I've
explained in the previous point that my problem occurs as an exception.
Vacuuming reclaims space to be used again, but I want to detect when to
delete _extra_ data (that vacuum then can reclaim) so that we don't run
out of disk space.

> 3) Reduce the data size.  You say this is audio data, can you reduce
>    the bitrate or other storage factors so the data streams aren't so
>    huge?

This is not an option. I won't bore you with the details :)

> In general, if you're hitting the limits of your physical storage on
> a regular basis, you either underestimated your storage requirements
> (which means you should get more storage) or you're storing too much.

agreed.

> The _business_need_ needs to dictate how often you purge old data.
> If you don't have enough hardware to meet the business need, you need
> to add hardware.
>
> If the business need is to store X gigabytes with no regard for how
> old the data is, then you need to adjust your data storage methods
> to work with that.  Create a table to store the size of each LO, and
> run a regular maintenance job that purges old data when the used
> size gets too big.

This we could do, but this looks like we are storing internally how much
space we actually use (instead of the total space reserved). I was
hoping there was a way to ask how much space is reserved, but not claimed.

Peter Childs [reformatted] wrote:
 > I think you need to know depending on a mix of free disk space and
 > free space map usage. If you do a standard Vacuum Verbose it will tell
 > you how full the fsm is. You need to ensure that you have enough free
 > disk space and or a (relativly) full fsm. When the fsm is empty the
 > database has to use disk space,

Maybe this is what I'm looking for. Is there any other way to retrieve
information on the fsm usage?

Thanks again for all your inputs.
Michiel



pgsql-general by date:

Previous
From: Benjamin Arai
Date:
Subject: PostgreSQLDirect versus Npgsql
Next
From: Balázs Klein
Date:
Subject: Re: dynamic crosstab