Re: Discussion: Fast DB/Schema/Table disk size check in Postgresql - Mailing list pgsql-hackers

From Hubert Zhang
Subject Re: Discussion: Fast DB/Schema/Table disk size check in Postgresql
Date
Msg-id CAB0yre=bMiqFCWPpaNQE7TDS9ubhTTbp1hKN_u8f6cXVjHi6mA@mail.gmail.com
Whole thread Raw
In response to Re: Discussion: Fast DB/Schema/Table disk size check in Postgresql  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
Thanks Stephen.
In the largest portion of the system, when it comes to tables, indexes,
and such, if there's a file 'X.1', you can be pretty darn sure that 'X'
is 1GB in size.  If there's a file 'X.245' then you can know that
there's 245 files (X, X.1, X.2, X.3 ... X.244) that are 1GB in size.
Good idea. We could just add your logic in db_dir_size(). to teach both pg_database_size_name() and pg_database_size_oid().
'X' maybe larger than 1GB with custom configuration.
So the size of the table could be calculated with two stats: size(X.245) + 245*size(X)

On Sun, Jan 6, 2019 at 3:52 AM Stephen Frost <sfrost@snowman.net> wrote:
Greetings,

* Hubert Zhang (hzhang@pivotal.io) wrote:
> For very large databases, the dbsize function `pg_database_size_name()`
> etc. could be quite slow, since it needs to call `stat()` system call on
> every file in the target database.

I agree, it'd be nice to improve this.

> We proposed a new solution to accelerate these dbsize functions which check
> the disk usage of database/schema/table. The new functions avoid to call
> `stat()` system call, and instead store the disk usage of database objects
> in user tables(called diskquota.xxx_size, xxx could be db/schema/table).
> Checking the size of database 'postgres' could be converted to the SQL
> query `select size from diskquota.db_size where name = `postgres``.

This seems like an awful lot of work though.

I'd ask a different question- why do we need to stat() every file?

In the largest portion of the system, when it comes to tables, indexes,
and such, if there's a file 'X.1', you can be pretty darn sure that 'X'
is 1GB in size.  If there's a file 'X.245' then you can know that
there's 245 files (X, X.1, X.2, X.3 ... X.244) that are 1GB in size.

Maybe we should teach pg_database_size_name() about that?

Thanks!

Stephen


--
Thanks

Hubert Zhang

pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: Query with high planning time at version 11.1 compared versions10.5 and 11.0
Next
From: Amit Khandekar
Date:
Subject: Re: Displaying and dumping of table access methods