Thread: Table size does not include toast size

Table size does not include toast size

From
Rafael Martinez
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello

I wonder why the function pg_relation_size(text) does not take into
account the space used by toast data in a table when returning the space
used by the table.

As an administrator I would expect pg_total_relation_size() to return
data+toast+indexes and pg_relation_size() to return data+toast.

Is this a deliberate decision? Could we change this behavior in the future?

We are using a 8.3 database.

Thanks in advance.
regards,
- --Rafael Martinez, <r.m.guerrero@usit.uio.no>Center for Information Technology ServicesUniversity of Oslo, Norway
PGP Public Key: http://folk.uio.no/rafael/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.7 (GNU/Linux)

iD8DBQFLL3m+BhuKQurGihQRAgBdAKCV5ZIBJyDOzGWh/En4sTvWSW67ZwCfYoYx
iUYIMJCbk6li2BhYcR7JB5M=
=l2YF
-----END PGP SIGNATURE-----


Re: Table size does not include toast size

From
Tom Lane
Date:
Rafael Martinez <r.m.guerrero@usit.uio.no> writes:
> I wonder why the function pg_relation_size(text) does not take into
> account the space used by toast data in a table when returning the space
> used by the table.

It's not supposed to.  Use pg_total_relation_size if you want a number
that includes index and toast space.
        regards, tom lane


Re: Table size does not include toast size

From
Rafael Martinez
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Tom Lane wrote:
> Rafael Martinez <r.m.guerrero@usit.uio.no> writes:
>> I wonder why the function pg_relation_size(text) does not take into
>> account the space used by toast data in a table when returning the space
>> used by the table.
> 
> It's not supposed to.  Use pg_total_relation_size if you want a number
> that includes index and toast space.
> 

I am probably missing the point here, why is it not supposed to show the
size of the table(data) *without* indexes?

My question was because I can not understand the use and usefulness of
pg_relation_size() (as it works today) in a table that use toast.

- From an administrator point of view, there are two numbers that are
interesting, the total size of a table (indexes included) and the size
of the table without taking into account the space used by its indexes.

At least, if there is a logic in this behavior, it should be documented
in "9.23. System Administration Functions". The documentation only says
"Disk space used by the table or index with ...."

It is not the first time confused users have asked me why
pg_relation_size() does not show the space used by the table without
indexes. Many do not know what 'toast' is, and most probably they do not
need to know about this either.

regards,
- --Rafael Martinez, <r.m.guerrero@usit.uio.no>Center for Information Technology ServicesUniversity of Oslo, Norway
PGP Public Key: http://folk.uio.no/rafael/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.7 (GNU/Linux)

iD8DBQFLL5NCBhuKQurGihQRAmtZAJ99wJPKbS1u2RUGxO4G++X7nbqt2gCeJubn
b+328nrEICsXPS7kgD4bq68=
=bBO8
-----END PGP SIGNATURE-----


Re: Table size does not include toast size

From
Bernd Helmle
Date:

--On 21. Dezember 2009 10:01:37 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote:

> It's not supposed to.  Use pg_total_relation_size if you want a number
> that includes index and toast space.

I've created a C-Function a while ago that extracts the TOAST size for a 
given relation. This gave me the opportunity to do a pg_relation_size(oid) 
+ pg_relation_toast_size(oid) for a given table oid to calculate on disk 
data size required by a table. Maybe we should include such a function in 
core?

-- 
Thanks
Bernd


Re: Table size does not include toast size

From
Tom Lane
Date:
Rafael Martinez <r.m.guerrero@usit.uio.no> writes:
> I am probably missing the point here, why is it not supposed to show the
> size of the table(data) *without* indexes?

Because pg_relation_size is defined at the "physical" level of showing
one relation, where relation means a pg_class entry.  If you want
agglomerations of multiple relations, you can use
pg_total_relation_size, or build your own total if you have some other
usage in mind.  The one you propose seems fairly arbitrary --- for
example, if it includes the toast relation, why not the toast relation's
index too?  It's not like either one is optional from the user's
standpoint.
        regards, tom lane


Re: Table size does not include toast size

From
Rafael Martinez
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bernd Helmle wrote:
> 
> 
> --On 21. Dezember 2009 10:01:37 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
>> It's not supposed to.  Use pg_total_relation_size if you want a number
>> that includes index and toast space.
> 
> I've created a C-Function a while ago that extracts the TOAST size for a
> given relation. This gave me the opportunity to do a
> pg_relation_size(oid) + pg_relation_toast_size(oid) for a given table
> oid to calculate on disk data size required by a table. Maybe we should
> include such a function in core?
> 

It is a possibility. But I really think that pg_relation_size() not
reporting the total size of the table (without indexes) is useless.

toast is an internal way of organizing/saving data for tuples larger
than the page size used by PostgreSQL. It is a mechanism transparent to
the user and therefore pg_relation_size() should not differentiate
between data saved via toast or not.

The size of the table without the indexes should be reported regardless
the technique used to save the data on the disk.

regards,
- --Rafael Martinez, <r.m.guerrero@usit.uio.no>Center for Information Technology ServicesUniversity of Oslo, Norway
PGP Public Key: http://folk.uio.no/rafael/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.7 (GNU/Linux)

iD8DBQFLL5ZHBhuKQurGihQRAoR8AJ97RoST3VHGCmcIOhkdRbJIWb3mnwCeN7Mm
7Oja4kmyrQfM6/RxyUE4K2A=
=kxO9
-----END PGP SIGNATURE-----


Re: Table size does not include toast size

From
Rafael Martinez
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Tom Lane wrote:
> Rafael Martinez <r.m.guerrero@usit.uio.no> writes:
>> I am probably missing the point here, why is it not supposed to show the
>> size of the table(data) *without* indexes?
> 
> Because pg_relation_size is defined at the "physical" level of showing
> one relation, where relation means a pg_class entry.  If you want
> agglomerations of multiple relations, you can use
> pg_total_relation_size, 


Ok, thanks for the clarification :-)

The 'problem' is that as a developer with advanced knowledge of the
postgres internals, you see a table as a group of relations (toast,
indexes, toast relation's index, etc)

A 'normal' user only sees a table and its indexes and this user
misinterpret the use of the function "pg_relation_size() when it reads
in the documentation "pg_relation_size(): Disk space used by the table
or index ... "

regards,
- --Rafael Martinez, <r.m.guerrero@usit.uio.no>Center for Information Technology ServicesUniversity of Oslo, Norway
PGP Public Key: http://folk.uio.no/rafael/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.7 (GNU/Linux)

iD8DBQFLL5lwBhuKQurGihQRApt1AJ4wQS9+WSiUSAB6sSV6i/z0y0gZhwCfWq1Y
BnnbddNedMMGCUGJ+X4eMMY=
=yUsa
-----END PGP SIGNATURE-----


Re: Table size does not include toast size

From
Greg Smith
Date:
Bernd Helmle wrote:
> I've created a C-Function a while ago that extracts the TOAST size for 
> a given relation. This gave me the opportunity to do a 
> pg_relation_size(oid) + pg_relation_toast_size(oid) for a given table 
> oid to calculate on disk data size required by a table. Maybe we 
> should include such a function in core?

Writing such a thing is already on my to-do list; it's absolutely a 
missing piece of the puzzle here.  If you've got such a patch, by all 
means submit that.  I just ran into my first heavily TOASTy database 
recently and the way I'm computing sizes on the relations there is too 
complicated for my tastes, so it's completely unreasonable to expect 
regular users to do that.

To answer Rafael's concerns directly:  you're right that this is 
confusing.  pg_relation_size is always going to do what it does right 
now just because of how that fits into the design of the database.  
However, the documentation should be updated to warn against the issue 
with TOAST here.  And it should be easier to get the total you're like 
to see here:  main relation + toasted parts, since that's what most DBAs 
want in this area.

-- 
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com



Re: Table size does not include toast size

From
Tom Lane
Date:
Greg Smith <greg@2ndquadrant.com> writes:
> To answer Rafael's concerns directly:  you're right that this is 
> confusing.  pg_relation_size is always going to do what it does right 
> now just because of how that fits into the design of the database.  
> However, the documentation should be updated to warn against the issue 
> with TOAST here.  And it should be easier to get the total you're like 
> to see here:  main relation + toasted parts, since that's what most DBAs 
> want in this area.

Perhaps invent    pg_table_size() = base table + toast table + toast index
and        pg_indexes_size() = all other indexes for table
giving us the property pg_table_size + pg_indexes_size =
pg_total_relation_size

I think the 8.4 documentation already makes it apparent that
pg_relation_size is a pretty low-level number.  If we invent other
functions with obvious names, that should be sufficient.
        regards, tom lane


Re: Table size does not include toast size

From
Greg Smith
Date:
Tom Lane wrote:
> Perhaps invent    pg_table_size() = base table + toast table + toast index
> and        pg_indexes_size() = all other indexes for table
> giving us the property pg_table_size + pg_indexes_size =
> pg_total_relation_size
>   
Right; that's exactly the way I'm computing things now, I just have to 
crawl way too much catalog data to do it.  I also agree that if we 
provide pg_table_size, the issue of "pg_relation_size doesn't do what I 
want" goes away without needing to even change the existing 
documentation--people don't come to that section looking for "relation", 
they're looking for "table".

Bernd, there's a basic spec if you have time to work on this. 

-- 
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com



Re: Table size does not include toast size

From
Greg Stark
Date:
On Mon, Dec 21, 2009 at 5:02 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Tom Lane wrote:
>>
>> Perhaps invent  pg_table_size() = base table + toast table + toast index
>> and             pg_indexes_size() = all other indexes for table
>> giving us the property pg_table_size + pg_indexes_size =
>> pg_total_relation_size
>>
>
> Right; that's exactly the way I'm computing things now, I just have to crawl
> way too much catalog data to do it.  I also agree that if we provide
> pg_table_size, the issue of "pg_relation_size doesn't do what I want" goes
> away without needing to even change the existing documentation--people don't
> come to that section looking for "relation", they're looking for "table".
>
> Bernd, there's a basic spec if you have time to work on this.

What about, the visibility maps and free space maps?


--
greg


Re: Table size does not include toast size

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> On Mon, Dec 21, 2009 at 5:02 PM, Greg Smith <greg@2ndquadrant.com> wrote:
>> Right; that's exactly the way I'm computing things now, I just have to crawl
>> way too much catalog data to do it. �I also agree that if we provide
>> pg_table_size, the issue of "pg_relation_size doesn't do what I want" goes
>> away without needing to even change the existing documentation--people don't
>> come to that section looking for "relation", they're looking for "table".
>> 
>> Bernd, there's a basic spec if you have time to work on this.

> What about, the visibility maps and free space maps?

Those would be included for each relation, I should think.  The
objective here is not to break things down even more finely than
pg_relation_size does, but to aggregate into terms that are meaningful
to the user --- which is to say, "the table" and "its indexes".
Anything you can't get rid of by dropping indexes/constraints is
part of "the table" at this level of detail.
        regards, tom lane


Re: Table size does not include toast size

From
Cédric Villemain
Date:
2009/12/21 Tom Lane <tgl@sss.pgh.pa.us>:
> Greg Smith <greg@2ndquadrant.com> writes:
>> To answer Rafael's concerns directly:  you're right that this is
>> confusing.  pg_relation_size is always going to do what it does right
>> now just because of how that fits into the design of the database.
>> However, the documentation should be updated to warn against the issue
>> with TOAST here.  And it should be easier to get the total you're like
>> to see here:  main relation + toasted parts, since that's what most DBAs
>> want in this area.
>
> Perhaps invent  pg_table_size() = base table + toast table + toast index
> and             pg_indexes_size() = all other indexes for table
> giving us the property pg_table_size + pg_indexes_size =
> pg_total_relation_size

Did you mean :pg_table_size() = base table + toast tablepg_indexes_size() = base indexes + toast indexes
?

>
> I think the 8.4 documentation already makes it apparent that
> pg_relation_size is a pretty low-level number.  If we invent other
> functions with obvious names, that should be sufficient.
>
>                        regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: Table size does not include toast size

From
Bernd Helmle
Date:

--On 22. Dezember 2009 11:46:32 +0100 Cédric Villemain
<cedric.villemain.debian@gmail.com> wrote:

> Did you mean :
>  pg_table_size() = base table + toast table
>  pg_indexes_size() = base indexes + toast indexes
> ?

Since you always have a toast index automatically it makes sense to include
them in pg_table_size().

--
Thanks
Bernd


Re: Table size does not include toast size

From
Bernd Helmle
Date:

--On 21. Dezember 2009 12:02:02 -0500 Greg Smith <greg@2ndquadrant.com> 
wrote:

> Tom Lane wrote:
>> Perhaps invent    pg_table_size() = base table + toast table + toast index
>> and        pg_indexes_size() = all other indexes for table
>> giving us the property pg_table_size + pg_indexes_size =
>> pg_total_relation_size
>>
> Right; that's exactly the way I'm computing things now, I just have to
> crawl way too much catalog data to do it.  I also agree that if we
> provide pg_table_size, the issue of "pg_relation_size doesn't do what I
> want" goes away without needing to even change the existing
> documentation--people don't come to that section looking for "relation",
> they're looking for "table".
>
> Bernd, there's a basic spec if you have time to work on this.

I see if i can get some time for it during christmas vacation (its on my 
radar for a longer period of time). I'm still working on this NOT NULL 
pg_constraint representation and would like to propose a patch fairly soon 
for this.

-- 
Thanks
Bernd


Re: Table size does not include toast size

From
Tom Lane
Date:
Cédric Villemain <cedric.villemain.debian@gmail.com> writes:
> 2009/12/21 Tom Lane <tgl@sss.pgh.pa.us>:
>> Perhaps invent �pg_table_size() = base table + toast table + toast index
>> and � � � � � � pg_indexes_size() = all other indexes for table
>> giving us the property pg_table_size + pg_indexes_size =
>> pg_total_relation_size

> Did you mean :
>  pg_table_size() = base table + toast table
>  pg_indexes_size() = base indexes + toast indexes
> ?

No.
        regards, tom lane


Re: Table size does not include toast size

From
Bernd Helmle
Date:

--On 22. Dezember 2009 15:11:40 +0100 Bernd Helmle <mailings@oopsware.de>
wrote:

>> Bernd, there's a basic spec if you have time to work on this.
>
> I see if i can get some time for it during christmas vacation (its on my
> radar for a longer period of time). I'm still working on this NOT NULL
> pg_constraint representation and would like to propose a patch fairly
> soon for this.

Since i'm not able to finish those other things in time, i wrapped up my
existing code for this issue and came up with the attached patch, which
should implement the behavior Tom proposed. These are two new functions
pg_table_size() and pg_indexes_size(). This patch also changes
pg_total_relation_size() to be a shorthand for pg_table_size() +
pg_indexes_size().

Barring any objections i'm adding this to the CF.

--
Thanks

    Bernd
Attachment

Re: Table size does not include toast size

From
Greg Smith
Date:
Bernd Helmle wrote:
> These are two new functions pg_table_size() and pg_indexes_size().
> This patch also changes pg_total_relation_size() to be a shorthand for
> pg_table_size() + pg_indexes_size().

Attached is a test program to exercise these new functions.  I
thoroughly abuse generate_series and arrays to create a table with a few
megabytes of both regular and TOAST-ed text, and with two indexes on
it.  Here's the results from a sample run (it's random data so each run
will be a bit different):

pg_relation_size       | 11,755,520
pages_size             | 11,755,520
toast_and_fsm          | 22,159,360
pg_table_size          | 33,914,880
pg_indexes_size        |    524,288
pkey                   |    262,144
i                      |    262,144
pg_total_relation_size | 34,439,168
computed_total         | 34,439,168

This seems to work as expected.  You can see that pg_relation_size gives
a really misleading value for this table, whereas the new pg_table_size
does what DBAs were asking for here.  Having pg_indexes_size around is
handy too.  I looked over the code a bit, everything in the patch looks
clean too.

The only question I'm left with after browsing the patch and staring at
the above results is whether it makes sense to expose a pg_toast_size
function.  That would make the set available here capable of handling
almost every situation somebody might want to know about, making this
area completely done as I see it.  In addition to being a useful
shorthand on its own, that would then allow you to indirectly compute
just the FSM size, which seems like an interesting number to know as
feedback on what VACUUM is up to.  It's easy enough to add, too:  the
calculate_toast_table_size code needed is already in the patch, just
have to add another external function to expose it.

I don't think there's any useful case for further exposing the two
component parts of the toast size.  If you're enough of a hacker to know
what to do with those, you can certainly break them down yourself.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com

\x
DROP TABLE test;
CREATE TABLE test(s SERIAL PRIMARY KEY,d TEXT);
CREATE INDEX i on TEST(s);

INSERT INTO test(d) SELECT
  array_to_string(array(
    SELECT
      chr(ascii('A') + (random() * 64)::integer)
    FROM generate_series(1,20000)),'')  -- Size
  FROM generate_series(1,1000);  -- Rows

INSERT INTO test(d) SELECT
  array_to_string(array(
    SELECT
      chr(ascii('A') + (random() * 64)::integer)
    FROM generate_series(1,1000)),'')  -- Size
  FROM generate_series(1,10000);  -- Rows

--insert into test (d) SELECT repeat('xyz123'::text,(1+random() * 1000)::integer) FROM generate_series(1,100000);

ANALYZE test;

SELECT pg_relation_size(relname::regclass),relpages * 8192 AS pages_size FROM pg_class where relname='test';
SELECT pg_table_size('test'::regclass) - pg_relation_size('test'::regclass) AS toast_and_fsm;
SELECT pg_table_size('test'::regclass);
SELECT pg_indexes_size('test'::regclass),pg_relation_size('test_pkey'::regclass) as
pkey,pg_relation_size('i'::regclass)AS i; 
SELECT pg_total_relation_size('test'::regclass),pg_table_size('test'::regclass)+pg_indexes_size('test'::regclass) AS
computed_total;

Re: Table size does not include toast size

From
Tom Lane
Date:
Greg Smith <greg@2ndquadrant.com> writes:
> The only question I'm left with after browsing the patch and staring at 
> the above results is whether it makes sense to expose a pg_toast_size 
> function.  That would make the set available here capable of handling 
> almost every situation somebody might want to know about, making this 
> area completely done as I see it.  In addition to being a useful 
> shorthand on its own, that would then allow you to indirectly compute 
> just the FSM size, which seems like an interesting number to know as 
> feedback on what VACUUM is up to.  It's easy enough to add, too:  the 
> calculate_toast_table_size code needed is already in the patch, just 
> have to add another external function to expose it.

> I don't think there's any useful case for further exposing the two 
> component parts of the toast size.  If you're enough of a hacker to know 
> what to do with those, you can certainly break them down yourself.

Hmm ... those opinions seem a bit contradictory.  If you're enough of
a hacker to know what FSM is, you can subtract off the toast size for
yourself no?

I'm inclined to think that table vs. index is the right level of
abstraction for these functions, and that breaking it down further than
that isn't all that helpful.  We have the bottom-level information
(per-fork relation size) available for those who really want the
details.
        regards, tom lane


Re: Table size does not include toast size

From
Greg Smith
Date:
Tom Lane wrote:
> I'm inclined to think that table vs. index is the right level of
> abstraction for these functions, and that breaking it down further than
> that isn't all that helpful.  We have the bottom-level information
> (per-fork relation size) available for those who really want the
> details.
>   

Fair enough; this certainly knocks off all the important stuff already, 
just wanted final sanity check opinion.  This one is ready for a 
committer to look at now.  My test case seems to work fine with a 
moderately complex set of things to navigate.  The main think I'm not 
familiar enough with to have looked at deeply is exactly how the FSM and 
toast computations are done, to check if there's any corner cases in how 
it navigates forks and such that aren't considered.

-- 
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com



Re: Table size does not include toast size

From
Tom Lane
Date:
Bernd Helmle <mailings@oopsware.de> writes:
> Since i'm not able to finish those other things in time, i wrapped up my 
> existing code for this issue and came up with the attached patch, which 
> should implement the behavior Tom proposed. These are two new functions 
> pg_table_size() and pg_indexes_size(). This patch also changes 
> pg_total_relation_size() to be a shorthand for pg_table_size() + 
> pg_indexes_size().

Applied with minor corrections.
        regards, tom lane