Thread: Available disk space per tablespace

Available disk space per tablespace

From
Christoph Berg
Date:
Hi,

I'm picking up a 5 year old patch again:
https://www.postgresql.org/message-id/flat/20191108132419.GG8017%40msg.df7cb.de

Users will be interested in knowing how much extra data they can load
into a database, but PG currently does not expose that number. This
patch introduces a new function pg_tablespace_avail() that takes a
tablespace name or oid, and returns the number of bytes "available"
there. This is the number without any reserved blocks (Unix, f_avail)
or available to the current user (Windows).

(This is not meant to replace a full-fledged OS monitoring system that
has much more numbers about disks and everything, it is filling a UX
gap.)

Compared to the last patch, this just returns a single number so it's
easier to use - total space isn't all that interesting, we just return
the number the user wants.

The free space is included in \db+ output:

postgres =# \db+
                                     List of tablespaces
    Name    │ Owner │ Location │ Access privileges │ Options │  Size   │  Free  │ Description
────────────┼───────┼──────────┼───────────────────┼─────────┼─────────┼────────┼─────────────
 pg_default │ myon  │          │ ∅                 │ ∅       │ 23 MB   │ 538 GB │ ∅
 pg_global  │ myon  │          │ ∅                 │ ∅       │ 556 kB  │ 538 GB │ ∅
 spc        │ myon  │ /tmp/spc │ ∅                 │ ∅       │ 0 bytes │ 31 GB  │ ∅
(3 rows)

The patch has also been tested on Windows.

TODO: Figure out which systems need statfs() vs statvfs()

Christoph

Attachment

Re: Available disk space per tablespace

From
Quan Zongliang
Date:

On 2025/3/14 02:10, Christoph Berg wrote:
> Hi,
> 
> I'm picking up a 5 year old patch again:
> https://www.postgresql.org/message-id/flat/20191108132419.GG8017%40msg.df7cb.de
> 
> Users will be interested in knowing how much extra data they can load
> into a database, but PG currently does not expose that number. This
> patch introduces a new function pg_tablespace_avail() that takes a
> tablespace name or oid, and returns the number of bytes "available"
> there. This is the number without any reserved blocks (Unix, f_avail)
> or available to the current user (Windows).
> 
> (This is not meant to replace a full-fledged OS monitoring system that
> has much more numbers about disks and everything, it is filling a UX
> gap.)
> 
> Compared to the last patch, this just returns a single number so it's
> easier to use - total space isn't all that interesting, we just return
> the number the user wants.
> 
> The free space is included in \db+ output:
> 
> postgres =# \db+
>                                       List of tablespaces
>      Name    │ Owner │ Location │ Access privileges │ Options │  Size   │  Free  │ Description
> ────────────┼───────┼──────────┼───────────────────┼─────────┼─────────┼────────┼─────────────
>   pg_default │ myon  │          │ ∅                 │ ∅       │ 23 MB   │ 538 GB │ ∅
>   pg_global  │ myon  │          │ ∅                 │ ∅       │ 556 kB  │ 538 GB │ ∅
>   spc        │ myon  │ /tmp/spc │ ∅                 │ ∅       │ 0 bytes │ 31 GB  │ ∅
> (3 rows)
> 
> The patch has also been tested on Windows.
> 
> TODO: Figure out which systems need statfs() vs statvfs()
> 

I tested the patch under macos. Abnormal work:

                                      List of tablespaces
     Name    | Owner  | Location | Access privileges | Options |  Size 
| Free  | Description
------------+--------+----------+-------------------+---------+--------+-------+-------------
  pg_default | quanzl |          |                   |         | 23 MB 
|23 TB |
  pg_global  | quanzl |          |                   |         | 556 kB 
| 23 TB |
(2 rows)
Actually my disk is 1TB.

According to the statvfs documentation for macOS
f_frsize       The size in bytes of the minimum unit of allocation on 
this file system.
f_bsize        The preferred length of I/O requests for files on this 
file system.

I tweaked the code a little bit. See the attachment.
                                      List of tablespaces
     Name    | Owner  | Location | Access privileges | Options |  Size 
|  Free  | Description
------------+--------+----------+-------------------+---------+--------+--------+-------------
  pg_default | quanzl |          |                   |         | 22 MB 
| 116 GB |
  pg_global  | quanzl |          |                   |         | 556 kB 
| 116 GB |
(2 rows)

In addition, many systems use 1000 as 1k to represent the storage size. 
Shouldn't we consider this factor as well?

> Christoph

Attachment

Re: Available disk space per tablespace

From
Christoph Berg
Date:
Re: Quan Zongliang
> According to the statvfs documentation for macOS
> f_frsize       The size in bytes of the minimum unit of allocation on this
> file system.
> f_bsize        The preferred length of I/O requests for files on this file
> system.

Thanks for catching that. f_frsize is the correct field to use. The
statvfs(3) manpage on Linux has it as well, but it's less pronounced
there so I missed it:

          struct statvfs {
               unsigned long  f_bsize;    /* Filesystem block size */
               unsigned long  f_frsize;   /* Fragment size */
               fsblkcnt_t     f_blocks;   /* Size of fs in f_frsize units */
               fsblkcnt_t     f_bfree;    /* Number of free blocks */
               fsblkcnt_t     f_bavail;   /* Number of free blocks for
                                             unprivileged users */


> In addition, many systems use 1000 as 1k to represent the storage size.
> Shouldn't we consider this factor as well?

That would be a different pg_size_pretty() function, unrelated to this
patch.

I'm still unconvinced if we should use statfs() instead of statvfs()
on *BSD or if their manpage is just trolling us and statvfs is just
fine.

DESCRIPTION
       The statvfs() and fstatvfs() functions fill the structure pointed to by
       buf  with  garbage.  This garbage will occasionally bear    resemblance to
       file system statistics, but portable applications must  not  depend  on
       this.

Christoph

Attachment

Re: Available disk space per tablespace

From
Thomas Munro
Date:
On Sat, Mar 15, 2025 at 4:40 AM Christoph Berg <myon@debian.org> wrote:
> I'm still unconvinced if we should use statfs() instead of statvfs()
> on *BSD or if their manpage is just trolling us and statvfs is just
> fine.
>
> DESCRIPTION
>        The statvfs() and fstatvfs() functions fill the structure pointed to by
>        buf  with  garbage.  This garbage will occasionally bear resemblance to
>        file system statistics, but portable applications must  not  depend  on
>        this.

Hah, I see this in my local FreeBSD man page.  I guess this might be a
reference to POSIX's 100% get-out clause "it is unspecified whether
all members of the statvfs structure have meaningful values on all
file systems".  The statfs() man page doesn't say that (a nonstandard
syscall that originated in 4.4BSD, which POSIX decided to rename
because other systems sprouted incompatible statfs() interfaces?).
It's hard to imagine a system that doesn't track free space and report
it here, and if it doesn't, well so what, that's probably also a
system that can't report free space to the "df" command, so what are
we supposed to do?  We could perhaps add a note to the documentation
that this field relies on the OS providing meaningful "avail" field in
statvfs(), but it's hard to imagine.  Maybe just defer that until
someone shows up with a real report?  So +1 from me, go for it, call
statvfs() and don't worry.

I tried your v3 patch on my FreeBSD 14.2 battle station:

postgres=# \db+
                                     List of tablespaces
    Name    | Owner  | Location | Access privileges | Options |  Size
|  Free  | Description
------------+--------+----------+-------------------+---------+--------+--------+-------------
 pg_default | tmunro |          |                   |         | 22 MB
| 290 GB |
 pg_global  | tmunro |          |                   |         | 556 kB
| 290 GB |

That is the correct answer:

tmunro@build1:~/projects/postgresql/build $ df -h .
Filesystem        Size    Used   Avail Capacity  Mounted on
zroot/usr/home    331G     41G    290G    12%    /usr/home

I also pushed your patch to CI and triggered the NetBSD and OpenBSD
tasks and they passed your sanity test, though that only checks that
the reported some number >  1MB.

I looked at the source, and on FreeBSD statvfs[1] is just a libc
function that calls statfs() (as does df).  The statfs() man page has
no funny disclaimers.  OpenBSD's[2] too.  NetBSD seems to have a real
statvfs (or statvfs1) syscall but its man page has no funny
disclaimers.

+#ifdef WIN32
+       if (GetDiskFreeSpaceEx(tblspcPath, &lpFreeBytesAvailable,
NULL, NULL) == false)
+               return -1;
+
+       return lpFreeBytesAvailable.QuadPart; /* ULONGLONG part of
ULARGE_INTEGER */
+#else
+       if (statvfs(tblspcPath, &fst) < 0)
+               return -1;
+
+       return fst.f_bavail * fst.f_frsize; /* available blocks times
fragment size */
+#endif

What's the rationale for not raising an error if the system call
fails?  If someone complains that it's showing -1, doesn't that mean
we'll have to ask them to trace the system calls to figure out why, or
if it's Windows, likely abandon all hope of ever knowing why?  Should
statvfs() retry on EINTR?

Style nit: maybe ! instead of == false?

Nice feature.

[1] https://github.com/freebsd/freebsd-src/blob/36782aaba4f1a7d054aa405357a8fa2bc0f94eb0/lib/libc/gen/statvfs.c#L70
[2] https://github.com/openbsd/src/blob/70ab9842eb8b368612eb098db19dcf94c19d673d/lib/libc/gen/statvfs.c#L59



Re: Available disk space per tablespace

From
Christoph Berg
Date:
Re: Thomas Munro
> Hah, I see this in my local FreeBSD man page.  I guess this might be a
> reference to POSIX's 100% get-out clause "it is unspecified whether
> all members of the statvfs structure have meaningful values on all
> file systems".

Yeah I could hear someone being annoyed by POSIX echoed in that
paragraph.

> system that can't report free space to the "df" command, so what are
> we supposed to do?  We could perhaps add a note to the documentation
> that this field relies on the OS providing meaningful "avail" field in
> statvfs(), but it's hard to imagine.  Maybe just defer that until
> someone shows up with a real report?  So +1 from me, go for it, call
> statvfs() and don't worry.

I was reading looking into gnulib's wrapper around this - it's also
basically calling statvfs() except on assorted older systems.

https://github.com/coreutils/gnulib/blob/master/lib/fsusage.c#L114

Do we care about any of these?

AIX
OSF/1
2.6 < glibc/Linux < 2.6.36
glibc/Linux < 2.6, 4.3BSD, SunOS 4, \
        Mac OS X < 10.4, FreeBSD < 5.0, \
        NetBSD < 3.0, OpenBSD < 4.4k
SunOS 4.1.2, 4.1.3, and 4.1.3_U1
4.4BSD and older NetBSD
SVR3, old Irix

If not, then statvfs seems safe.

> I also pushed your patch to CI and triggered the NetBSD and OpenBSD
> tasks and they passed your sanity test, though that only checks that
> the reported some number >  1MB.

I thought about making that test "between 1MB and 10PB", but that
seemed silly - it's not testing much, and some day, someone will try
to run the test on a system where it will still fail.

> What's the rationale for not raising an error if the system call
> fails?

That's mirroring the behavior of calculate_tablespace_size() in the
same file. I thought that's to allow \db+ to succeed even if some of
the tablespaces are botched/missing/whatever. But now on closer
inspection, I see that db_dir_size() is erroring out on problems, it
just ignores the top-level directory missing. Fixed in the attached
patch.

\db+
FEHLER:  XX000: could not statvfs directory "pg_tblspc/16384/PG_18_202503111": Zu viele Ebenen aus symbolischen Links
LOCATION:  calculate_tablespace_avail, dbsize.c:373

But this is actually something I wanted to address in a follow-up
patch: Currently, non-superusers cannot run \db+ because they lack
CREATE on pg_global (but `\db+ pg_default` works). Should we rather
make pg_database_size and pg_database_avail return NULL for
insufficient permissions instead of throwing an error?

> If someone complains that it's showing -1, doesn't that mean

(-1 is translated to NULL for the SQL level.)

> we'll have to ask them to trace the system calls to figure out why, or
> if it's Windows, likely abandon all hope of ever knowing why?  Should
> statvfs() retry on EINTR?

Hmm. Is looping on EINTR worth the trouble?

> Style nit: maybe ! instead of == false?

Changed.

> Nice feature.

Thanks!

Christoph

Attachment

Re: Available disk space per tablespace

From
Laurenz Albe
Date:
On Sat, 2025-03-15 at 13:09 +0100, Christoph Berg wrote:
> Do we care about any of these?
>
> AIX

We dropped support for it, but there are efforts to change that.

Yours,
Laurenz Albe



Re: Available disk space per tablespace

From
Thomas Munro
Date:
On Sun, Mar 16, 2025 at 1:17 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Sat, 2025-03-15 at 13:09 +0100, Christoph Berg wrote:
> > Do we care about any of these?
> >
> > AIX
>
> We dropped support for it, but there are efforts to change that.

FWIW AIX does have it, according to its manual, in case it comes back.
The others in the list are defunct or obsolete versions.



Re: Available disk space per tablespace

From
Thomas Munro
Date:
On Sun, Mar 16, 2025 at 1:09 AM Christoph Berg <myon@debian.org> wrote:
> Hmm. Is looping on EINTR worth the trouble?

I was just wondering if it might be one of those oddballs that ignores
SA_RESTART, but I guess that doesn't seem too likely (I mean, first
you'd probably have to have a reason to sleep or some other special
reason, and who knows what some unusual file systems might do).  It
certainly doesn't on the systems I tried.  So I guess not until we
have other evidence.



Re: Available disk space per tablespace

From
Christoph Berg
Date:
Re: Thomas Munro
> > Hmm. Is looping on EINTR worth the trouble?
> 
> I was just wondering if it might be one of those oddballs that ignores
> SA_RESTART, but I guess that doesn't seem too likely (I mean, first
> you'd probably have to have a reason to sleep or some other special
> reason, and who knows what some unusual file systems might do).  It
> certainly doesn't on the systems I tried.  So I guess not until we
> have other evidence.

Gnulib's get_fs_usage() (which is what GNU coreutil's df uses)
does not handle EINTR either.

There is some code that does int width expansion, but I believe we
don't need that since the `fst.f_bavail * fst.f_frsize` multiplication
takes care of converting that to int64 (if it wasn't already 64bits
before).

Christoph