Available disk space per tablespace

Started by Christoph Bergabout 1 year ago10 messageshackers
Jump to latest
#1Christoph Berg
myon@debian.org

Hi,

I'm picking up a 5 year old patch again:
/messages/by-id/20191108132419.GG8017@msg.df7cb.de

Users will be interested in knowing how much extra data they can load
into a database, but PG currently does not expose that number. This
patch introduces a new function pg_tablespace_avail() that takes a
tablespace name or oid, and returns the number of bytes "available"
there. This is the number without any reserved blocks (Unix, f_avail)
or available to the current user (Windows).

(This is not meant to replace a full-fledged OS monitoring system that
has much more numbers about disks and everything, it is filling a UX
gap.)

Compared to the last patch, this just returns a single number so it's
easier to use - total space isn't all that interesting, we just return
the number the user wants.

The free space is included in \db+ output:

postgres =# \db+
List of tablespaces
Name │ Owner │ Location │ Access privileges │ Options │ Size │ Free │ Description
────────────┼───────┼──────────┼───────────────────┼─────────┼─────────┼────────┼─────────────
pg_default │ myon │ │ ∅ │ ∅ │ 23 MB │ 538 GB │ ∅
pg_global │ myon │ │ ∅ │ ∅ │ 556 kB │ 538 GB │ ∅
spc │ myon │ /tmp/spc │ ∅ │ ∅ │ 0 bytes │ 31 GB │ ∅
(3 rows)

The patch has also been tested on Windows.

TODO: Figure out which systems need statfs() vs statvfs()

Christoph

Attachments:

0001-Add-pg_tablespace_avail-functions.patchtext/x-diff; charset=us-asciiDownload+132-5
#2Quan Zongliang
quanzongliang@yeah.net
In reply to: Christoph Berg (#1)
Re: Available disk space per tablespace

On 2025/3/14 02:10, Christoph Berg wrote:

Hi,

I'm picking up a 5 year old patch again:
/messages/by-id/20191108132419.GG8017@msg.df7cb.de

Users will be interested in knowing how much extra data they can load
into a database, but PG currently does not expose that number. This
patch introduces a new function pg_tablespace_avail() that takes a
tablespace name or oid, and returns the number of bytes "available"
there. This is the number without any reserved blocks (Unix, f_avail)
or available to the current user (Windows).

(This is not meant to replace a full-fledged OS monitoring system that
has much more numbers about disks and everything, it is filling a UX
gap.)

Compared to the last patch, this just returns a single number so it's
easier to use - total space isn't all that interesting, we just return
the number the user wants.

The free space is included in \db+ output:

postgres =# \db+
List of tablespaces
Name │ Owner │ Location │ Access privileges │ Options │ Size │ Free │ Description
────────────┼───────┼──────────┼───────────────────┼─────────┼─────────┼────────┼─────────────
pg_default │ myon │ │ ∅ │ ∅ │ 23 MB │ 538 GB │ ∅
pg_global │ myon │ │ ∅ │ ∅ │ 556 kB │ 538 GB │ ∅
spc │ myon │ /tmp/spc │ ∅ │ ∅ │ 0 bytes │ 31 GB │ ∅
(3 rows)

The patch has also been tested on Windows.

TODO: Figure out which systems need statfs() vs statvfs()

I tested the patch under macos. Abnormal work:

List of tablespaces
Name | Owner | Location | Access privileges | Options | Size
| Free | Description
------------+--------+----------+-------------------+---------+--------+-------+-------------
pg_default | quanzl | | | | 23 MB
|23 TB |
pg_global | quanzl | | | | 556 kB
| 23 TB |
(2 rows)
Actually my disk is 1TB.

According to the statvfs documentation for macOS
f_frsize The size in bytes of the minimum unit of allocation on
this file system.
f_bsize The preferred length of I/O requests for files on this
file system.

I tweaked the code a little bit. See the attachment.
List of tablespaces
Name | Owner | Location | Access privileges | Options | Size
| Free | Description
------------+--------+----------+-------------------+---------+--------+--------+-------------
pg_default | quanzl | | | | 22 MB
| 116 GB |
pg_global | quanzl | | | | 556 kB
| 116 GB |
(2 rows)

In addition, many systems use 1000 as 1k to represent the storage size.
Shouldn't we consider this factor as well?

Show quoted text

Christoph

Attachments:

0002-Add-pg_tablespace_avail-functions.patchtext/plain; charset=UTF-8; name=0002-Add-pg_tablespace_avail-functions.patchDownload+136-4
#3Christoph Berg
myon@debian.org
In reply to: Quan Zongliang (#2)
Re: Available disk space per tablespace

Re: Quan Zongliang

According to the statvfs documentation for macOS
f_frsize The size in bytes of the minimum unit of allocation on this
file system.
f_bsize The preferred length of I/O requests for files on this file
system.

Thanks for catching that. f_frsize is the correct field to use. The
statvfs(3) manpage on Linux has it as well, but it's less pronounced
there so I missed it:

struct statvfs {
unsigned long f_bsize; /* Filesystem block size */
unsigned long f_frsize; /* Fragment size */
fsblkcnt_t f_blocks; /* Size of fs in f_frsize units */
fsblkcnt_t f_bfree; /* Number of free blocks */
fsblkcnt_t f_bavail; /* Number of free blocks for
unprivileged users */

In addition, many systems use 1000 as 1k to represent the storage size.
Shouldn't we consider this factor as well?

That would be a different pg_size_pretty() function, unrelated to this
patch.

I'm still unconvinced if we should use statfs() instead of statvfs()
on *BSD or if their manpage is just trolling us and statvfs is just
fine.

DESCRIPTION
The statvfs() and fstatvfs() functions fill the structure pointed to by
buf with garbage. This garbage will occasionally bear resemblance to
file system statistics, but portable applications must not depend on
this.

Christoph

Attachments:

v3-0001-Add-pg_tablespace_avail-functions.patchtext/x-diff; charset=us-asciiDownload+163-5
#4Thomas Munro
thomas.munro@gmail.com
In reply to: Christoph Berg (#3)
Re: Available disk space per tablespace

On Sat, Mar 15, 2025 at 4:40 AM Christoph Berg <myon@debian.org> wrote:

I'm still unconvinced if we should use statfs() instead of statvfs()
on *BSD or if their manpage is just trolling us and statvfs is just
fine.

DESCRIPTION
The statvfs() and fstatvfs() functions fill the structure pointed to by
buf with garbage. This garbage will occasionally bear resemblance to
file system statistics, but portable applications must not depend on
this.

Hah, I see this in my local FreeBSD man page. I guess this might be a
reference to POSIX's 100% get-out clause "it is unspecified whether
all members of the statvfs structure have meaningful values on all
file systems". The statfs() man page doesn't say that (a nonstandard
syscall that originated in 4.4BSD, which POSIX decided to rename
because other systems sprouted incompatible statfs() interfaces?).
It's hard to imagine a system that doesn't track free space and report
it here, and if it doesn't, well so what, that's probably also a
system that can't report free space to the "df" command, so what are
we supposed to do? We could perhaps add a note to the documentation
that this field relies on the OS providing meaningful "avail" field in
statvfs(), but it's hard to imagine. Maybe just defer that until
someone shows up with a real report? So +1 from me, go for it, call
statvfs() and don't worry.

I tried your v3 patch on my FreeBSD 14.2 battle station:

postgres=# \db+
List of tablespaces
Name | Owner | Location | Access privileges | Options | Size
| Free | Description
------------+--------+----------+-------------------+---------+--------+--------+-------------
pg_default | tmunro | | | | 22 MB
| 290 GB |
pg_global | tmunro | | | | 556 kB
| 290 GB |

That is the correct answer:

tmunro@build1:~/projects/postgresql/build $ df -h .
Filesystem Size Used Avail Capacity Mounted on
zroot/usr/home 331G 41G 290G 12% /usr/home

I also pushed your patch to CI and triggered the NetBSD and OpenBSD
tasks and they passed your sanity test, though that only checks that
the reported some number > 1MB.

I looked at the source, and on FreeBSD statvfs[1]https://github.com/freebsd/freebsd-src/blob/36782aaba4f1a7d054aa405357a8fa2bc0f94eb0/lib/libc/gen/statvfs.c#L70 is just a libc
function that calls statfs() (as does df). The statfs() man page has
no funny disclaimers. OpenBSD's[2]https://github.com/openbsd/src/blob/70ab9842eb8b368612eb098db19dcf94c19d673d/lib/libc/gen/statvfs.c#L59 too. NetBSD seems to have a real
statvfs (or statvfs1) syscall but its man page has no funny
disclaimers.

+#ifdef WIN32
+       if (GetDiskFreeSpaceEx(tblspcPath, &lpFreeBytesAvailable,
NULL, NULL) == false)
+               return -1;
+
+       return lpFreeBytesAvailable.QuadPart; /* ULONGLONG part of
ULARGE_INTEGER */
+#else
+       if (statvfs(tblspcPath, &fst) < 0)
+               return -1;
+
+       return fst.f_bavail * fst.f_frsize; /* available blocks times
fragment size */
+#endif

What's the rationale for not raising an error if the system call
fails? If someone complains that it's showing -1, doesn't that mean
we'll have to ask them to trace the system calls to figure out why, or
if it's Windows, likely abandon all hope of ever knowing why? Should
statvfs() retry on EINTR?

Style nit: maybe ! instead of == false?

Nice feature.

[1]: https://github.com/freebsd/freebsd-src/blob/36782aaba4f1a7d054aa405357a8fa2bc0f94eb0/lib/libc/gen/statvfs.c#L70
[2]: https://github.com/openbsd/src/blob/70ab9842eb8b368612eb098db19dcf94c19d673d/lib/libc/gen/statvfs.c#L59

#5Christoph Berg
myon@debian.org
In reply to: Thomas Munro (#4)
Re: Available disk space per tablespace

Re: Thomas Munro

Hah, I see this in my local FreeBSD man page. I guess this might be a
reference to POSIX's 100% get-out clause "it is unspecified whether
all members of the statvfs structure have meaningful values on all
file systems".

Yeah I could hear someone being annoyed by POSIX echoed in that
paragraph.

system that can't report free space to the "df" command, so what are
we supposed to do? We could perhaps add a note to the documentation
that this field relies on the OS providing meaningful "avail" field in
statvfs(), but it's hard to imagine. Maybe just defer that until
someone shows up with a real report? So +1 from me, go for it, call
statvfs() and don't worry.

I was reading looking into gnulib's wrapper around this - it's also
basically calling statvfs() except on assorted older systems.

https://github.com/coreutils/gnulib/blob/master/lib/fsusage.c#L114

Do we care about any of these?

AIX
OSF/1
2.6 < glibc/Linux < 2.6.36
glibc/Linux < 2.6, 4.3BSD, SunOS 4, \
Mac OS X < 10.4, FreeBSD < 5.0, \
NetBSD < 3.0, OpenBSD < 4.4k
SunOS 4.1.2, 4.1.3, and 4.1.3_U1
4.4BSD and older NetBSD
SVR3, old Irix

If not, then statvfs seems safe.

I also pushed your patch to CI and triggered the NetBSD and OpenBSD
tasks and they passed your sanity test, though that only checks that
the reported some number > 1MB.

I thought about making that test "between 1MB and 10PB", but that
seemed silly - it's not testing much, and some day, someone will try
to run the test on a system where it will still fail.

What's the rationale for not raising an error if the system call
fails?

That's mirroring the behavior of calculate_tablespace_size() in the
same file. I thought that's to allow \db+ to succeed even if some of
the tablespaces are botched/missing/whatever. But now on closer
inspection, I see that db_dir_size() is erroring out on problems, it
just ignores the top-level directory missing. Fixed in the attached
patch.

\db+
FEHLER: XX000: could not statvfs directory "pg_tblspc/16384/PG_18_202503111": Zu viele Ebenen aus symbolischen Links
LOCATION: calculate_tablespace_avail, dbsize.c:373

But this is actually something I wanted to address in a follow-up
patch: Currently, non-superusers cannot run \db+ because they lack
CREATE on pg_global (but `\db+ pg_default` works). Should we rather
make pg_database_size and pg_database_avail return NULL for
insufficient permissions instead of throwing an error?

If someone complains that it's showing -1, doesn't that mean

(-1 is translated to NULL for the SQL level.)

we'll have to ask them to trace the system calls to figure out why, or
if it's Windows, likely abandon all hope of ever knowing why? Should
statvfs() retry on EINTR?

Hmm. Is looping on EINTR worth the trouble?

Style nit: maybe ! instead of == false?

Changed.

Nice feature.

Thanks!

Christoph

Attachments:

v4-0001-Add-pg_tablespace_avail-functions.patchtext/x-diff; charset=us-asciiDownload+171-5
#6Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Christoph Berg (#5)
Re: Available disk space per tablespace

On Sat, 2025-03-15 at 13:09 +0100, Christoph Berg wrote:

Do we care about any of these?

AIX

We dropped support for it, but there are efforts to change that.

Yours,
Laurenz Albe

#7Thomas Munro
thomas.munro@gmail.com
In reply to: Laurenz Albe (#6)
Re: Available disk space per tablespace

On Sun, Mar 16, 2025 at 1:17 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

On Sat, 2025-03-15 at 13:09 +0100, Christoph Berg wrote:

Do we care about any of these?

AIX

We dropped support for it, but there are efforts to change that.

FWIW AIX does have it, according to its manual, in case it comes back.
The others in the list are defunct or obsolete versions.

#8Thomas Munro
thomas.munro@gmail.com
In reply to: Christoph Berg (#5)
Re: Available disk space per tablespace

On Sun, Mar 16, 2025 at 1:09 AM Christoph Berg <myon@debian.org> wrote:

Hmm. Is looping on EINTR worth the trouble?

I was just wondering if it might be one of those oddballs that ignores
SA_RESTART, but I guess that doesn't seem too likely (I mean, first
you'd probably have to have a reason to sleep or some other special
reason, and who knows what some unusual file systems might do). It
certainly doesn't on the systems I tried. So I guess not until we
have other evidence.

#9Christoph Berg
myon@debian.org
In reply to: Thomas Munro (#8)
Re: Available disk space per tablespace

Re: Thomas Munro

Hmm. Is looping on EINTR worth the trouble?

I was just wondering if it might be one of those oddballs that ignores
SA_RESTART, but I guess that doesn't seem too likely (I mean, first
you'd probably have to have a reason to sleep or some other special
reason, and who knows what some unusual file systems might do). It
certainly doesn't on the systems I tried. So I guess not until we
have other evidence.

Gnulib's get_fs_usage() (which is what GNU coreutil's df uses)
does not handle EINTR either.

There is some code that does int width expansion, but I believe we
don't need that since the `fst.f_bavail * fst.f_frsize` multiplication
takes care of converting that to int64 (if it wasn't already 64bits
before).

Christoph

#10oyoun
oyoun@gmx.com
In reply to: Christoph Berg (#1)
Re: Available disk space per tablespace

Hi,

I also tested the patch on Linux mint 22.1 with the btrfs and ext4
partitions. I generated some data and the outcome looks good:

postgres=# \db+
                                                   List of tablespaces
       Name       |  Owner   |         Location          | Access
privileges | Options |  Size   |  Free   | Description
------------------+----------+---------------------------+-------------------+---------+---------+---------+-------------
 pg_default       | postgres | |                   |         | 1972 MB
| 29 GB   |
 pg_global        | postgres | |                   |         | 556 kB 
| 29 GB   |
 tablespace_test2 | postgres | /media/said/queryme/pgsql
|                   |         | 3147 MB | 1736 GB |

Numbers are the same as if I were executing the command: df -h

tablespace_test2 was the ext4 partition on usb stick.

Numbers are correct.

Said

Show quoted text

On 2025-03-13 14 h 10, Christoph Berg wrote:

Hi,

I'm picking up a 5 year old patch again:
/messages/by-id/20191108132419.GG8017@msg.df7cb.de

Users will be interested in knowing how much extra data they can load
into a database, but PG currently does not expose that number. This
patch introduces a new function pg_tablespace_avail() that takes a
tablespace name or oid, and returns the number of bytes "available"
there. This is the number without any reserved blocks (Unix, f_avail)
or available to the current user (Windows).

(This is not meant to replace a full-fledged OS monitoring system that
has much more numbers about disks and everything, it is filling a UX
gap.)

Compared to the last patch, this just returns a single number so it's
easier to use - total space isn't all that interesting, we just return
the number the user wants.

The free space is included in \db+ output:

postgres =# \db+
List of tablespaces
Name │ Owner │ Location │ Access privileges │ Options │ Size │ Free │ Description
────────────┼───────┼──────────┼───────────────────┼─────────┼─────────┼────────┼─────────────
pg_default │ myon │ │ ∅ │ ∅ │ 23 MB │ 538 GB │ ∅
pg_global │ myon │ │ ∅ │ ∅ │ 556 kB │ 538 GB │ ∅
spc │ myon │ /tmp/spc │ ∅ │ ∅ │ 0 bytes │ 31 GB │ ∅
(3 rows)

The patch has also been tested on Windows.

TODO: Figure out which systems need statfs() vs statvfs()

Christoph