Thread: Getting total and free disk space from paths in PGDATA

Getting total and free disk space from paths in PGDATA

From
Michael Paquier
Date:
Hi all,

There is currently no in-core function to query the amount of
available and free space in a path of PGDATA. I am thinking of
something like this, for example:
SELECT * FROM pg_get_diskspace_info('pg_xlog');
 total_space | free_space
-------------+------------
 4812 MB     | 3925 MB
(1 row)

This would definitely be useful for monitoring purposes, to keep an eye
on disk usage in PGDATA, pg_xlog, or even pg_log, which are often
located on different partitions. Some of my customers have requested
such a thing a couple of times, and even if you can already do it
with, for example, plpython and os.statvfs, or plperl parsing the
output of df, I am getting the feeling that we should have something
in-core that does not rely directly on a PL language. genfile.c also
has what is needed to restrict the path used, by blocking
absolute paths that are not part of log_directory or PGDATA.
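
For reference, the PL-based workaround mentioned above can be sketched in plain Python. This is only an illustration, not the proposed in-core function: resolve_allowed_path is a hypothetical helper that loosely imitates the kind of path restriction described for genfile.c, and diskspace_info, data_dir and log_dir are names assumed for the example. os.statvfs is only available on Unix-like systems.

```python
import os


def resolve_allowed_path(path, data_dir, log_dir):
    """Hypothetical check: accept relative paths without '..', or absolute
    paths located under data_dir or log_dir. Anything else is rejected."""
    if os.path.isabs(path):
        candidate = os.path.normpath(path)
        for base in (os.path.normpath(data_dir), os.path.normpath(log_dir)):
            if candidate == base or candidate.startswith(base + os.sep):
                return candidate
        raise PermissionError("absolute path not allowed: " + path)
    if ".." in path.split("/"):
        raise PermissionError("path must be in or below the data directory")
    # Relative paths are interpreted relative to the data directory.
    return os.path.join(data_dir, path)


def diskspace_info(path, data_dir, log_dir):
    """Return (total_bytes, free_bytes) for the filesystem holding path."""
    target = resolve_allowed_path(path, data_dir, log_dir)
    st = os.statvfs(target)
    total = st.f_blocks * st.f_frsize  # filesystem size in bytes
    free = st.f_bfree * st.f_frsize    # free bytes, root-reserved blocks included
    return total, free
```

Note that f_bfree counts blocks reserved for the superuser; f_bavail would give the space usable by an unprivileged user instead.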

statvfs is part of the POSIX spec and is "normally" present on modern
platforms (BSD, OSX, Linux and Solaris have it as far as I saw, still
there may be some strange platform without it). Windows does not have
it, though we could use GetDiskFreeSpaceEx to retrieve this
information (and actually this is the reason why I am only proposing
to get the available/free space from a path):
https://msdn.microsoft.com/en-us/library/windows/desktop/aa364937%28v=vs.85%29.aspx
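
As a point of comparison for the portability question, Python's standard library already hides this difference: to my knowledge shutil.disk_usage is built on statvfs on Unix and on GetDiskFreeSpaceEx on Windows. A minimal sketch, where total_and_free is a name made up for this example:

```python
import shutil


def total_and_free(path):
    """Return (total_bytes, free_bytes) for the filesystem containing path,
    using the portable wrapper from the standard library."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return usage.total, usage.free


if __name__ == "__main__":
    total, free = total_and_free(".")
    print("total_space = %d MB, free_space = %d MB"
          % (total // (1024 * 1024), free // (1024 * 1024)))
```

On Unix the free figure reported this way is, as far as I can tell, the space available to an unprivileged user, so it can be lower than the raw free block count.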

Another minor issue is that fsblkcnt_t is an unsigned long, and the
return values should be bigint for both the free and the available
space, so we could have incorrect results once we have values higher
than 2^63 - 1 bytes, that is disk space values higher than about 8 EiB
(roughly 8.39e6 TiB) if you prefer, but it's not like we're at this
scale yet :)
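
The bound is easy to check by hand: 2^63 bytes is exactly 8 EiB, or 8388608 TiB. A small sketch, assuming the result is computed as block count times block size; fits_in_bigint is a hypothetical name used only here:

```python
BIGINT_MAX = 2**63 - 1  # largest value a signed 64-bit bigint can hold


def fits_in_bigint(block_count, block_size):
    """True if block_count * block_size bytes can be returned as a bigint."""
    return 0 <= block_count * block_size <= BIGINT_MAX


print(BIGINT_MAX)                       # 9223372036854775807
print((BIGINT_MAX + 1) // 2**60)        # 8, the limit in EiB
print((BIGINT_MAX + 1) // 2**40)        # 8388608, the limit in TiB
print(fits_in_bigint(2**51 - 1, 4096))  # True: just under 2^63 bytes
print(fits_in_bigint(2**51, 4096))      # False: exactly 2^63 bytes
```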

Thoughts?
-- 
Michael



Re: Getting total and free disk space from paths in PGDATA

From
Tom Lane
Date:
Michael Paquier <michael.paquier@gmail.com> writes:
> statvfs is part of the POSIX spec and is "normally" present on modern
> platforms (BSD, OSX, Linux and Solaris have it as far as I saw, still
> there may be some strange platform without it).

There are considerably less strange platforms that have per-user
disk quotas.  I wonder what statvfs does with those.
        regards, tom lane



Re: Getting total and free disk space from paths in PGDATA

From
Kevin Grittner
Date:
Michael Paquier <michael.paquier@gmail.com> wrote:

> There is currently no in-core function to query the amount of
> available and free space in a path of PGDATA, with something like that
> for example:
> SELECT * FROM pg_get_diskspace_info('pg_xlog');
>   total_space | free_space
> -------------+------------
>        4812 MB    | 3925 MB
> (1 row)
>
> This would be definitely useful for monitoring purposes to have a look
> at the disk space bloat in PGDATA, pg_xlog, or even pg_log which are
> usually located on different partitions. Some of my customers have
> requested such a thing for a couple of times,

When I was working with Wisconsin Courts we needed something like
this, and wrote it.  It has been used on hundreds of clusters,
24/7, for years.  I see that the last publicly posted update was
in 2008, but it likely never needed changes after that.  We used it
on Windows and Linux.  At the time, the community rather actively
rejected incorporating it, but maybe in today's world of extensions
it could be put on pgxn.org.

http://www.postgresql.org/message-id/flat/43FDF6D0.EE98.0025.0@wicourts.gov#43FDF6D0.EE98.0025.0@wicourts.gov

http://pgfoundry.org/projects/fsutil/


The license is BSD, so there should be no problem grabbing the source
and using as much (or as little) as you find helpful.


--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Getting total and free disk space from paths in PGDATA

From
Michael Paquier
Date:
On Wed, Sep 9, 2015 at 6:13 AM, Kevin Grittner wrote:
> Michael Paquier wrote:
> When I was working with Wisconsin Courts we needed something like
> this, and wrote it.  It has been used on hundreds of clusters,
> 24/7, for years.  I see that the last publicly posted updated was
> in 2008, but it likely never needed changes after that.  We used it
> on Windows and Linux.  At the time, the community rather actively
> rejected incorporating it, but maybe in today's world of extensions
> it could be put on pgxn.org.

It's not surprising to see something similar; thanks for pointing it
out! I actually tried duckduckgoing/googling for something similar, but
even this thread did not show up. Something interesting about the
thread you pointed out is that statvfs is not that portable: OpenBSD
does not include it, and from what I can find recent versions do not
have it either. We could fall back to statfs though.

A major difference between what I have in mind and what this extension
does is that I would not provide any information about the file
system, just the free and available space, with the restrictions of
genfile.c on path strings applying as well. So this seems more
acceptable from the security POV. Currently, we can use pg_stat_file
combined with pg_ls_dir to estimate the total space used by a path,
but any credible use case would need to compare this size with the
output of a system command like df.
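
The estimate described above (listing entries and summing their sizes) has a rough equivalent outside the database. A sketch in Python, with hypothetical helper names path_usage and usage_report: it walks a directory much like a recursive listing plus per-file stat would, then puts the result next to the filesystem figures that only a system-level call can provide.

```python
import os
import shutil


def path_usage(path):
    """Sum the apparent sizes of all regular files below path."""
    used = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                used += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # The file may have been removed while walking; skip it.
                pass
    return used


def usage_report(path):
    """Combine the per-path estimate with filesystem-level numbers."""
    fs = shutil.disk_usage(path)
    return {
        "used_by_path": path_usage(path),
        "fs_total": fs.total,
        "fs_free": fs.free,
    }
```

The per-path sum uses apparent file sizes, so it can differ from the blocks actually allocated, which is one more reason to compare it against filesystem-level numbers.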
Regards,
-- 
Michael



Re: Getting total and free disk space from paths in PGDATA

From
Michael Paquier
Date:
On Mon, Sep 7, 2015 at 11:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> statvfs is part of the POSIX spec and is "normally" present on modern
>> platforms (BSD, OSX, Linux and Solaris have it as far as I saw, still
>> there may be some strange platform without it).
>
> There are considerably less strange platforms that have per-user
> disk quotas.  I wonder what statvfs does with those.

This does not look good on that front. On Linux, for example, statvfs
and statfs return information about the FS and not about the quotas of
the user calling them, which is what would actually be interesting in
our case when PG is run by a user subject to hard-limit quotas :(
-- 
Michael