Thread: pg_(total_)relation_size and partitioned tables

pg_(total_)relation_size and partitioned tables

From
Amit Langote
Date:
Hi.

You may have guessed from $subject that the two don't work together.

create table p (a int) partition by list (a);
create table p1 partition of p for values in (1, 2) partition by list (a);
create table p11 partition of p1 for values in (1);
create table p12 partition of p1 for values in (2);
insert into p select i % 2 + 1 from generate_series(1, 1000) i;

On HEAD:

select pg_relation_size('p');
 pg_relation_size
------------------
                0
(1 row)

select pg_total_relation_size('p');
 pg_total_relation_size
------------------------
                      0
(1 row)

After applying the attached patch:

select pg_relation_size('p');
 pg_relation_size
------------------
            49152
(1 row)

select pg_total_relation_size('p');
 pg_total_relation_size
------------------------
                  98304
(1 row)

The patch basically accumulates and returns the sizes of all leaf
partitions when passed a partitioned table.  Will add it to next CF.

Thanks,
Amit

Attachment

Re: pg_(total_)relation_size and partitioned tables

From
Robert Haas
Date:
On Thu, Dec 14, 2017 at 12:23 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> You may have guessed from $subject that the two don't work together.

It works exactly as documented:

pg_total_relation_size(regclass) - Total disk space used by the
specified table, including all indexes and TOAST data

It says nothing about including partitions.  If we change this, then
we certainly need to update the documentation (that might be a good
idea if we decide not to update this).

Personally, I'm -1 on including partitions, because then you can no
longer expect that the sum of pg_total_relation_size(regclass) across
all relations in the database will equal the size of the database
itself.  Partitions will be counted a number of times equal to their
depth in the partitioning hierarchy.  However, I understand that I
might get outvoted.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pg_(total_)relation_size and partitioned tables

From
David Rowley
Date:
On 17 December 2017 at 16:24, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Dec 14, 2017 at 12:23 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> You may have guessed from $subject that the two don't work together.
>
> It works exactly as documented:
>
> pg_total_relation_size(regclass) - Total disk space used by the
> specified table, including all indexes and TOAST data
>
> It says nothing about including partitions.  If we change this, then
> we certainly need to update the documentation (that might be a good
> idea if we decide not to update this).
>
> Personally, I'm -1 on including partitions, because then you can no
> longer expect that the sum of pg_total_relation_size(regclass) across
> all relations in the database will equal the size of the database
> itself.  Partitions will be counted a number of times equal to their
> depth in the partitioning hierarchy.  However, I understand that I
> might get outvoted.

I'd also vote to leave the relation_size functions alone.

Perhaps it's worth thinking of changing pg_table_size() instead. We
have taken measures to try and hide the fact that a table is made up
of a bunch of partitions from the user in some cases, e.g DROP TABLE
works without CASCADE for a partitioned table. I'm sure there are
arguments in both directions here too though.

I generally think of the difference between a relation and a table
similarly to the difference between a tuple and a row. A relation is
just what we use to implement tables, and there can be multiple
relations per table (in the partitioning case), similar to how there
can be multiple tuple versions for a single row. So that might back up
that pg_table_size should include all relations that make up that
table.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: pg_(total_)relation_size and partitioned tables

From
Robert Haas
Date:
On Sun, Dec 17, 2017 at 9:54 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> I'd also vote to leave the relation_size functions alone.
>
> Perhaps it's worth thinking of changing pg_table_size() instead. We
> have taken measures to try and hide the fact that a table is made up
> of a bunch of partitions from the user in some cases, e.g DROP TABLE
> works without CASCADE for a partitioned table. I'm sure there are
> arguments in both directions here too though.

Yeah, I don't really understand why changing pg_table_size() is any
more desirable than changing pg_relation_size().

I mean, we could have a table-size function that takes an array of
things you want to include (indexes, toast, partitions, etc), but
changing the semantics of existing functions seems like it's going to
be more painful than helpful (aside from the arguments I brought up
before).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pg_(total_)relation_size and partitioned tables

From
Michael Paquier
Date:
On Sun, Dec 17, 2017 at 11:54 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> I'd also vote to leave the relation_size functions alone.

Count me in that bucket as well.

> Perhaps it's worth thinking of changing pg_table_size() instead. We
> have taken measures to try and hide the fact that a table is made up
> of a bunch of partitions from the user in some cases, e.g DROP TABLE
> works without CASCADE for a partitioned table. I'm sure there are
> arguments in both directions here too though.
>
> I generally think of the difference between a relation and a table
> similarly to the difference between a tuple and a row. A relation is
> just what we use to implement tables, and there can be multiple
> relations per table (in the partitioning case), similar to how there
> can be multiple tuple versions for a single row. So that might back up
> that pg_table_size should include all relations that make up that
> table.

The barrier here is thin. What's proposed here is already doable with
a WITH RECURSIVE query. So why not just documenting this query and be
done with it instead of complicating the code? It seems to me that the
performance in calling pg_relation_size() in a cascading times fashion
would not matter much. Or one could invent an additional cascading
option which scans inheritance and/or partition chains, or simply have
a new function.
-- 
Michael


Re: pg_(total_)relation_size and partitioned tables

From
Michael Paquier
Date:
On Mon, Dec 18, 2017 at 9:29 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> The barrier here is thin. What's proposed here is already doable with
> a WITH RECURSIVE query. So why not just documenting this query and be
> done with it instead of complicating the code? It seems to me that the
> performance in calling pg_relation_size() in a cascading times fashion
> would not matter much. Or one could invent an additional cascading
> option which scans inheritance and/or partition chains, or simply have
> a new function.

I just blogged on the matter, and here is one possibility here
compatible with v10:
WITH RECURSIVE partition_info
          (relid,
           relname,
           relsize,
           relispartition,
           relkind) AS (
        SELECT oid AS relid,
               relname,
               pg_relation_size(oid) AS relsize,
               relispartition,
               relkind
        FROM pg_catalog.pg_class
        WHERE relname = 'parent_name' AND
              relkind = 'p'
      UNION ALL
        SELECT
             c.oid AS relid,
             c.relname AS relname,
             pg_relation_size(c.oid) AS relsize,
             c.relispartition AS relispartition,
             c.relkind AS relkind
        FROM partition_info AS p,
             pg_catalog.pg_inherits AS i,
             pg_catalog.pg_class AS c
        WHERE p.relid = i.inhparent AND
             c.oid = i.inhrelid AND
             c.relispartition
      )
    SELECT * FROM partition_info;

This is not really straight-forward. You could as well have the
pg_relation_size call in the outer query.
-- 
Michael


Re: pg_(total_)relation_size and partitioned tables

From
Amit Langote
Date:
Thanks all for your thoughts.

I agree with the Robert's point which both David and Michael seem to agree
with that we shouldn't really be changing what pg_relation_size() is doing
under the covers.  And I guess the same for pg_table_size(), too.  Both of
those functions and their siblings work with relations that possess
on-disk structures and have associated relations (TOAST, indexes) that in
turn possess on-disk structures.  It seems quite clearly documented as
such.  Partitioned tables are different in that they neither possess
on-disk structures nor have any relations (TOAST, indexes) associated
directly with them.  Instead, they have partitions that are the relations
that aforementioned dbsize.c functions are familiar with.

So, I withdraw the patch I originally posted in favor of some other approach.

Reply continues below...

On 2017/12/18 11:51, Michael Paquier wrote:
> On Mon, Dec 18, 2017 at 9:29 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> The barrier here is thin. What's proposed here is already doable with
>> a WITH RECURSIVE query. So why not just documenting this query and be
>> done with it instead of complicating the code? It seems to me that the
>> performance in calling pg_relation_size() in a cascading times fashion
>> would not matter much. Or one could invent an additional cascading
>> option which scans inheritance and/or partition chains, or simply have
>> a new function.
> 
> I just blogged on the matter, and here is one possibility here
> compatible with v10:
> WITH RECURSIVE partition_info
>           (relid,
>            relname,
>            relsize,
>            relispartition,
>            relkind) AS (
>         SELECT oid AS relid,
>                relname,
>                pg_relation_size(oid) AS relsize,
>                relispartition,
>                relkind
>         FROM pg_catalog.pg_class
>         WHERE relname = 'parent_name' AND
>               relkind = 'p'
>       UNION ALL
>         SELECT
>              c.oid AS relid,
>              c.relname AS relname,
>              pg_relation_size(c.oid) AS relsize,
>              c.relispartition AS relispartition,
>              c.relkind AS relkind
>         FROM partition_info AS p,
>              pg_catalog.pg_inherits AS i,
>              pg_catalog.pg_class AS c
>         WHERE p.relid = i.inhparent AND
>              c.oid = i.inhrelid AND
>              c.relispartition
>       )
>     SELECT * FROM partition_info;
> 
> This is not really straight-forward. You could as well have the
> pg_relation_size call in the outer query.

Thanks Michael for coming up with that.

Do you (and/or others) think that's something that we can wrap inside a
built-in function(s), that is, one defined in system_views.sql?  Or if we
decide to have new functions, say, pg_get_partitions() and/or
pg_get_partition_sizes(), we might as well implement them as C functions
inside dbsize.c.  If so, do we have want to have "partition" variants of
all *_size() functions viz. pg_relation_size(), pg_total_relation_size(),
pg_indexes_size(), and pg_table_size()?

Thanks,
Amit



Re: pg_(total_)relation_size and partitioned tables

From
Michael Paquier
Date:
On Mon, Dec 18, 2017 at 2:17 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> Do you (and/or others) think that's something that we can wrap inside a
> built-in function(s), that is, one defined in system_views.sql?  Or if we
> decide to have new functions, say, pg_get_partitions() and/or
> pg_get_partition_sizes(), we might as well implement them as C functions
> inside dbsize.c.  If so, do we have want to have "partition" variants of
> all *_size() functions viz. pg_relation_size(), pg_total_relation_size(),
> pg_indexes_size(), and pg_table_size()?

I can see value in something like Robert is proposing upthread by
having a function one is able to customize to decide what to include
in the calculation. There could be options for at least:
- partitions, or relation cascading.
- index.
- toast tables.
- visibility maps.
- FSM.
The first three ones is what Robert are mentioned, still I would see
the last two ones are interesting things if optional. Or we could have
just a SRF that returns one row per object scanned.
-- 
Michael


Re: pg_(total_)relation_size and partitioned tables

From
Peter Eisentraut
Date:
On 12/18/17 00:17, Amit Langote wrote:
> I agree with the Robert's point which both David and Michael seem to agree
> with that we shouldn't really be changing what pg_relation_size() is doing
> under the covers.  And I guess the same for pg_table_size(), too.  Both of
> those functions and their siblings work with relations that possess
> on-disk structures and have associated relations (TOAST, indexes) that in
> turn possess on-disk structures.  It seems quite clearly documented as
> such.  Partitioned tables are different in that they neither possess
> on-disk structures nor have any relations (TOAST, indexes) associated
> directly with them.  Instead, they have partitions that are the relations
> that aforementioned dbsize.c functions are familiar with.

Here is another idea.  If we had a function

pg_partition_root(regclass) returns regclass

(returning itself for non-partitioned relations), then users can easily
construct queries to get the results they want in different shapes, e.g.,

select pg_partition_root(c.oid), c.relname, pg_table_size(c.oid)
  from pg_class c
  order by 1

select pg_partition_root(c.oid), sum(pg_table_size(c.oid))
  from pg_class c
  group by 1

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: pg_(total_)relation_size and partitioned tables

From
David Rowley
Date:
On 29 December 2017 at 08:03, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> Here is another idea.  If we had a function
>
> pg_partition_root(regclass) returns regclass
>
> (returning itself for non-partitioned relations), then users can easily
> construct queries to get the results they want in different shapes, e.g.,
>
> select pg_partition_root(c.oid), c.relname, pg_table_size(c.oid)
>   from pg_class c
>   order by 1
>
> select pg_partition_root(c.oid), sum(pg_table_size(c.oid))
>   from pg_class c
>   group by 1

That seems much nicer. I assume "root" would mean the top level
partitioned table. If so, would we also want
pg_partition_parent(regclass)? Or maybe something to control the
number of "levels-up" the function would run for. If we had that then
maybe -1 could mean "go until you find a table with no parent".

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: pg_(total_)relation_size and partitioned tables

From
Peter Eisentraut
Date:
On 12/28/17 16:24, David Rowley wrote:
>> select pg_partition_root(c.oid), c.relname, pg_table_size(c.oid)
>>   from pg_class c
>>   order by 1
>>
>> select pg_partition_root(c.oid), sum(pg_table_size(c.oid))
>>   from pg_class c
>>   group by 1
> 
> That seems much nicer. I assume "root" would mean the top level
> partitioned table. If so, would we also want
> pg_partition_parent(regclass)? Or maybe something to control the
> number of "levels-up" the function would run for. If we had that then
> maybe -1 could mean "go until you find a table with no parent".

Hmm, we need to think through some scenarios for what one would really
want to do with this functionality.

Clearly, the existing behavior is useful for management tasks like bloat
and vacuum monitoring.

And on the other hand you might want to have a logical view of, how big
is this partitioned table altogether.

But what are the uses for dealing with partial partition hierarchies?
How easy do we need to make that?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: pg_(total_)relation_size and partitioned tables

From
Alvaro Herrera
Date:
Peter Eisentraut wrote:

> But what are the uses for dealing with partial partition hierarchies?
> How easy do we need to make that?

If you have multilevel partitioning, say partitions per year per site.
What is the volume of 2017 compared to 2016, on each site?  I don't
think it needs to be super easy, but it should be reasonable.

I think pg_partition_parent() should be a simple function doing one
catalog lookup (already implemented as get_partition_parent(),
but needs a "missing_ok" case), and pg_partition_root() an iterative
version of that.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: pg_(total_)relation_size and partitioned tables

From
Peter Eisentraut
Date:
I'm setting this patch to Returned with feedback.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [Sender Address Forgery]Re: pg_(total_)relation_size andpartitioned tables

From
Amit Langote
Date:
On 2018/01/17 22:51, Peter Eisentraut wrote:
> I'm setting this patch to Returned with feedback.

OK.  Sorry that I couldn't reply here since the CF started.  I have
written some code to implement the pg_partition_root() idea you mentioned
upthread earlier this week, but hasn't been able to share it on the list
yet.  I will post it soon and create a new entry in the next commit fest.

Thanks,
Amit



Re: [Sender Address Forgery]Re: pg_(total_)relation_size andpartitioned tables

From
Amit Langote
Date:
On 2018/01/02 22:45, Peter Eisentraut wrote:
> On 12/28/17 16:24, David Rowley wrote:
>>> select pg_partition_root(c.oid), c.relname, pg_table_size(c.oid)
>>>   from pg_class c
>>>   order by 1
>>>
>>> select pg_partition_root(c.oid), sum(pg_table_size(c.oid))
>>>   from pg_class c
>>>   group by 1
>>
>> That seems much nicer. I assume "root" would mean the top level
>> partitioned table. If so, would we also want
>> pg_partition_parent(regclass)? Or maybe something to control the
>> number of "levels-up" the function would run for. If we had that then
>> maybe -1 could mean "go until you find a table with no parent".
> 
> Hmm, we need to think through some scenarios for what one would really
> want to do with this functionality.
> 
> Clearly, the existing behavior is useful for management tasks like bloat
> and vacuum monitoring.
> 
> And on the other hand you might want to have a logical view of, how big
> is this partitioned table altogether.
> 
> But what are the uses for dealing with partial partition hierarchies?
> How easy do we need to make that?

I think having pg_partition_root() and pg_partition_parent() will give
users enough to get useful views as follows:

drop table p;
create table p (a int) partition by list (a);
create table p123 partition of p for values in (1, 2, 3) partition by list
(a);
create table p12 partition of p1 for values in (1, 2) partition by list (a);
create table p12 partition of p123 for values in (1, 2) partition by list (a);
create table p1 partition of p12 for values in (1);
create table p2 partition of p12 for values in (2);
create table p3 partition of p123 for values in (3);

insert into p select 1 from generate_series(1, 100);
insert into p select 2 from generate_series(1, 100);
insert into p select 3 from generate_series(1, 100);

select  pg_partition_root(oid) as root_parent,
        pg_partition_parent(oid) as parent,
        relname as relname,
        pg_total_relation_size(oid) as size
from    pg_class
where   relnamespace = 'public'::regnamespace
order by 4;
 root_parent | parent | relname | size
-------------+--------+---------+------
 p           |        | p       |    0
 p           | p      | p123    |    0
 p           | p123   | p12     |    0
 p           | p12    | p1      | 8192
 p           | p12    | p2      | 8192
 p           | p123   | p3      | 8192
(6 rows)

select  pg_partition_root(oid) as root_parent,
        sum(pg_total_relation_size(oid)) as size
from    pg_class
where   relnamespace = 'public'::regnamespace
group by 1
order by 1;
 root_parent | size
-------------+-------
 p           | 24576
(1 row)

Attached a WIP patch.

Thanks,
Amit

Attachment

Re: [Sender Address Forgery]Re: pg_(total_)relation_size andpartitioned tables

From
Michael Paquier
Date:
On Thu, Jan 18, 2018 at 06:54:18PM +0900, Amit Langote wrote:
> I think having pg_partition_root() and pg_partition_parent() will give
> users enough to get useful views as follows:

So... pg_partition_root() gives you access to the highest relation in
the hierarchy, and pg_partition_parent() gives you access to the direct
parent.

> drop table p;
> create table p (a int) partition by list (a);
> create table p123 partition of p for values in (1, 2, 3) partition by list
(a);
> create table p12 partition of p1 for values in (1, 2) partition by list (a);
> create table p12 partition of p123 for values in (1, 2) partition by list (a);
> create table p1 partition of p12 for values in (1);
> create table p2 partition of p12 for values in (2);
> create table p3 partition of p123 for values in (3);

You need to reorder those queries, the creation of the first p12 would
fail as p1 does not exist at this point. Wouldn't also a
pg_partition_tree() be useful? You could shape it as a function which
returns all regclass partitions in the tree as unique entries. Combined
with pg_partition_parent() it can be powerful as it returns NULL for the
partition at the top of the tree. So I think that we could live without
pg_partition_root(). At the end, let's design something which makes
unnecessary the use of WITH RECURSIVE when looking at a full partition
tree to ease the user's life.

Documentation, as well as regression tests, would be welcome :)
--
Michael

Attachment

Re: [Sender Address Forgery]Re: pg_(total_)relation_size andpartitioned tables

From
Amit Langote
Date:
Thanks for taking a look.

On 2018/01/19 14:39, Michael Paquier wrote:
> On Thu, Jan 18, 2018 at 06:54:18PM +0900, Amit Langote wrote:
>> I think having pg_partition_root() and pg_partition_parent() will give
>> users enough to get useful views as follows:
> 
> So... pg_partition_root() gives you access to the highest relation in
> the hierarchy, and pg_partition_parent() gives you access to the direct
> parent.

Right.

>> drop table p;
>> create table p (a int) partition by list (a);
>> create table p123 partition of p for values in (1, 2, 3) partition by list
> (a);
>> create table p12 partition of p1 for values in (1, 2) partition by list (a);
>> create table p12 partition of p123 for values in (1, 2) partition by list (a);
>> create table p1 partition of p12 for values in (1);
>> create table p2 partition of p12 for values in (2);
>> create table p3 partition of p123 for values in (3);
> 
> You need to reorder those queries, the creation of the first p12 would
> fail as p1 does not exist at this point.

Oops.  I had copy-pasted above commands from the psql's \s output and
ended up copying the command I didn't intend to.  Here it is again, but
without the mistake I made in my last email:

drop table p;
create table p (a int) partition by list (a);
create table p123 partition of p for values in (1, 2, 3) partition by list
(a);
create table p12 partition of p123 for values in (1, 2) partition by list (a);
create table p1 partition of p12 for values in (1);
create table p2 partition of p12 for values in (2);
create table p3 partition of p123 for values in (3);

> Wouldn't also a
> pg_partition_tree() be useful? You could shape it as a function which
> returns all regclass partitions in the tree as unique entries. Combined
> with pg_partition_parent() it can be powerful as it returns NULL for the
> partition at the top of the tree. So I think that we could live without
> pg_partition_root(). At the end, let's design something which makes
> unnecessary the use of WITH RECURSIVE when looking at a full partition
> tree to ease the user's life.

Do you mean pg_partition_tree(regclass), that returns all partitions in
the partition tree whose root is passed as the parameter?

Perhaps, like the following (roughly implemented in the attached)?

select  pg_partition_root(p) as root_parent,
        pg_partition_parent(p) as parent,
        p as relname,
        pg_total_relation_size(p) as size
from    pg_partition_tree_tables('p') p
order by 4;
 root_parent | parent | relname |  size
-------------+--------+---------+---------
 p           |        | p       |       0
 p           | p      | p123    |       0
 p           | p123   | p12     |       0
 p           | p123   | p3      | 3653632
 p           | p12    | p1      | 3653632
 p           | p12    | p2      | 3653632
(6 rows)

> Documentation, as well as regression tests, would be welcome :)

OK, I will add those things in the next version.

Thanks,
Amit

Attachment

Re: [Sender Address Forgery]Re: pg_(total_)relation_size andpartitioned tables

From
Michael Paquier
Date:
On Fri, Jan 19, 2018 at 06:28:41PM +0900, Amit Langote wrote:
> Do you mean pg_partition_tree(regclass), that returns all partitions in
> the partition tree whose root is passed as the parameter?
>
> Perhaps, like the following (roughly implemented in the attached)?
>
> select  pg_partition_root(p) as root_parent,
>         pg_partition_parent(p) as parent,
>         p as relname,
>         pg_total_relation_size(p) as size
> from    pg_partition_tree_tables('p') p
> order by 4;
>  root_parent | parent | relname |  size
> -------------+--------+---------+---------
>  p           |        | p       |       0
>  p           | p      | p123    |       0
>  p           | p123   | p12     |       0
>  p           | p123   | p3      | 3653632
>  p           | p12    | p1      | 3653632
>  p           | p12    | p2      | 3653632
> (6 rows)

It seems that you caught the idea. As long as each leaf and root is
listed uniquely, that would be acceptable to me, and useful for users.
If pg_partition_tree_tables() uses the top of the partition as input,
all the tree should show up. If you use a leaf, anything under the leaf
should show up. If a leaf is defined and it has no underlying leaves,
then only this outmost leaf should be listed.

+/*
+ * Returns Oids of tables in a publication.
+ */
Close enough, but that's not a publication.

>> Documentation, as well as regression tests, would be welcome :)
>
> OK, I will add those things in the next version.

Thanks.
--
Michael

Attachment

Re: [Sender Address Forgery]Re: pg_(total_)relation_size andpartitioned tables

From
David Rowley
Date:
On 20 January 2018 at 23:14, Michael Paquier <michael.paquier@gmail.com> wrote:
> If pg_partition_tree_tables() uses the top of the partition as input,
> all the tree should show up. If you use a leaf, anything under the leaf
> should show up. If a leaf is defined and it has no underlying leaves,
> then only this outmost leaf should be listed.

hmm, that's thoroughly confusing. Just in case anyone else is stuck on
that, I just need to mention that a leaf is the does not have
branches, in nature or computer science.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [Sender Address Forgery]Re: pg_(total_)relation_size andpartitioned tables

From
Michael Paquier
Date:
On Sun, Jan 21, 2018 at 07:16:38PM +1300, David Rowley wrote:
> On 20 January 2018 at 23:14, Michael Paquier <michael.paquier@gmail.com> wrote:
>> If pg_partition_tree_tables() uses the top of the partition as input,
>> all the tree should show up. If you use a leaf, anything under the leaf
>> should show up. If a leaf is defined and it has no underlying leaves,
>> then only this outmost leaf should be listed.
>
> hmm, that's thoroughly confusing. Just in case anyone else is stuck on
> that, I just need to mention that a leaf is the does not have
> branches, in nature or computer science.

OK, sorry if my words are confusing. Imagine that you have the following
partition set:

       p
      / \
     /   \
    p1   p2
        /  \
       /    \
      p21   p22

If pg_partition_tree_tables() uses 'p' as input argument, then I would
imagine that it should return p, p1, p2, p21 and p22. If 'p2' is used,
then p2, p21 and p22 are the results. If either one of p1, p21 or p22 is
used as input, then the result is respectively p1, p21 or p22.

Amit's patch seems to be doing that kind of logic by using
find_all_inheritors, which is good. We need to make the difference
between relations that are part of a partition set and the other ones
which are part of an INHERIT link, and, at least it seems to me, the
patch is not careful with that. I haven't tested what is proposed
though, but this portion likely needs more thoughts.
--
Michael

Attachment

Re: [Sender Address Forgery]Re: pg_(total_)relation_size andpartitioned tables

From
Amit Langote
Date:
On 2018/01/22 11:44, Michael Paquier wrote:
> On Sun, Jan 21, 2018 at 07:16:38PM +1300, David Rowley wrote:
>> On 20 January 2018 at 23:14, Michael Paquier <michael.paquier@gmail.com> wrote:
>>> If pg_partition_tree_tables() uses the top of the partition as input,
>>> all the tree should show up. If you use a leaf, anything under the leaf
>>> should show up. If a leaf is defined and it has no underlying leaves,
>>> then only this outmost leaf should be listed.
>>
>> hmm, that's thoroughly confusing. Just in case anyone else is stuck on
>> that, I just need to mention that a leaf is the does not have
>> branches, in nature or computer science.
> 
> OK, sorry if my words are confusing. Imagine that you have the following
> partition set:
> 
>        p
>       / \
>      /   \
>     p1   p2
>         /  \
>        /    \
>       p21   p22
> 
> If pg_partition_tree_tables() uses 'p' as input argument, then I would
> imagine that it should return p, p1, p2, p21 and p22. If 'p2' is used,
> then p2, p21 and p22 are the results. If either one of p1, p21 or p22 is
> used as input, then the result is respectively p1, p21 or p22.
> 
> Amit's patch seems to be doing that kind of logic by using
> find_all_inheritors, which is good. We need to make the difference
> between relations that are part of a partition set and the other ones
> which are part of an INHERIT link, and, at least it seems to me, the
> patch is not careful with that. I haven't tested what is proposed
> though, but this portion likely needs more thoughts.

Yeah, I think I completely missed that part.

I wonder what pg_partition_tree_tables() should return when passed a table
that doesn't have partitions under it?  Return a 1-member set containing
itself?  I also mean for tables that may inheritance children established
through plain old inheritance.

Thanks,
Amit



Re: [Sender Address Forgery]Re: pg_(total_)relation_size andpartitioned tables

From
Michael Paquier
Date:
On Fri, Jan 26, 2018 at 07:00:43PM +0900, Amit Langote wrote:
> I wonder what pg_partition_tree_tables() should return when passed a table
> that doesn't have partitions under it?  Return a 1-member set containing
> itself?

Yes.  A table alone is itself part of a partition set, so the result
should be made of only itself.

> I also mean for tables that may inheritance children established
> through plain old inheritance.

There could be value in having a version dedicated to inheritance trees
as well, true enough.  As well as value in having something that shows
both.  Still let's not forget that partition sets are structured so as
the parents have no data, so I see more value in having only partitions
listed, without the INHERIT part.  Opinions from others are of course
welcome.
--
Michael

Attachment

Re: [Sender Address Forgery]Re: pg_(total_)relation_size andpartitioned tables

From
Robert Haas
Date:
On Fri, Jan 26, 2018 at 7:45 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> There could be value in having a version dedicated to inheritance trees
> as well, true enough.  As well as value in having something that shows
> both.  Still let's not forget that partition sets are structured so as
> the parents have no data, so I see more value in having only partitions
> listed, without the INHERIT part.  Opinions from others are of course
> welcome.

Well, partitioning and inheritance can't be mixed.  A given table has
either partitions OR inheritance children OR neither.  If it has
either partitions or inheritance children, find_all_inheritors will
return them.  Otherwise, I think it'll just return the input OID
itself.  So I don't quite see, if we're going to add a convenience
function here, why wouldn't just define it to return the same results
as find_all_inheritors does?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [Sender Address Forgery]Re: pg_(total_)relation_size andpartitioned tables

From
Amit Langote
Date:
On 2018/01/27 3:32, Robert Haas wrote:
> On Fri, Jan 26, 2018 at 7:45 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> There could be value in having a version dedicated to inheritance trees
>> as well, true enough.  As well as value in having something that shows
>> both.  Still let's not forget that partition sets are structured so as
>> the parents have no data, so I see more value in having only partitions
>> listed, without the INHERIT part.  Opinions from others are of course
>> welcome.
> 
> Well, partitioning and inheritance can't be mixed.  A given table has
> either partitions OR inheritance children OR neither.

That's true.

> If it has
> either partitions or inheritance children, find_all_inheritors will
> return them.  Otherwise, I think it'll just return the input OID
> itself.  So I don't quite see, if we're going to add a convenience
> function here, why wouldn't just define it to return the same results
> as find_all_inheritors does?

So if all we're doing is trying to make find_all_inheritors() accessible
to users through SQL, it makes sense to call it
pg_get_inheritance_tables() rather than pg_partition_tree_tables().

Thanks,
Amit



Re: [Sender Address Forgery]Re: pg_(total_)relation_size andpartitioned tables

From
Michael Paquier
Date:
On Mon, Jan 29, 2018 at 10:08:51AM +0900, Amit Langote wrote:
> On 2018/01/27 3:32, Robert Haas wrote:
>> If it has
>> either partitions or inheritance children, find_all_inheritors will
>> return them.  Otherwise, I think it'll just return the input OID
>> itself.  So I don't quite see, if we're going to add a convenience
>> function here, why wouldn't just define it to return the same results
>> as find_all_inheritors does?
>
> So if all we're doing is trying to make find_all_inheritors() accessible
> to users through SQL, it makes sense to call it
> pg_get_inheritance_tables() rather than pg_partition_tree_tables().

I was looking again at this stuff this morning, noticing that
find_all_inheritors() is particularly used in
check_default_allows_bound() for partitions, so complaint withdrawn.
The renaming as you propose here looks sensible as well.
--
Michael

Attachment