Thread: [HACKERS] Relpartbound, toasting and pg_class

[HACKERS] Relpartbound, toasting and pg_class

From
Andres Freund
Date:
Hi,

Just noticed that pg_class now has several varlena fields:
#ifdef CATALOG_VARLEN            /* variable-length fields start here *//* NOTE: These fields are not present in a
relcacheentry's rd_rel field. */aclitem        relacl[1];        /* access permissions */text        reloptions[1];
/*access-method-specific options */pg_node_tree relpartbound;    /* partition bound node tree */
 
#endif

of those relpartbound is fairly new. And pretty much unbounded in
size. Aren't we going to run into issues because pg_class doesn't have a
toast table? It's quite reasonable to use a multi-field composite type
as a partition boundary...

Greetings,

Andres Freund



Re: [HACKERS] Relpartbound, toasting and pg_class

From
Peter Eisentraut
Date:
On 6/12/17 15:38, Andres Freund wrote:
> Just noticed that pg_class now has several varlena fields:
> #ifdef CATALOG_VARLEN            /* variable-length fields start here */
>     /* NOTE: These fields are not present in a relcache entry's rd_rel field. */
>     aclitem        relacl[1];        /* access permissions */
>     text        reloptions[1];    /* access-method-specific options */
>     pg_node_tree relpartbound;    /* partition bound node tree */
> #endif
> 
> of those relpartbound is fairly new. And pretty much unbounded in
> size. Aren't we going to run into issues because pg_class doesn't have a
> toast table? It's quite reasonable to use a multi-field composite type
> as a partition boundary...

Cases where relacl became too large have been known to exist.  I'm not
sure whether relpartbound can really be that large to change the
scenario significantly.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Relpartbound, toasting and pg_class

From
Andres Freund
Date:
On 2017-06-12 17:10:28 -0400, Peter Eisentraut wrote:
> On 6/12/17 15:38, Andres Freund wrote:
> > Just noticed that pg_class now has several varlena fields:
> > #ifdef CATALOG_VARLEN            /* variable-length fields start here */
> >     /* NOTE: These fields are not present in a relcache entry's rd_rel field. */
> >     aclitem        relacl[1];        /* access permissions */
> >     text        reloptions[1];    /* access-method-specific options */
> >     pg_node_tree relpartbound;    /* partition bound node tree */
> > #endif
> > 
> > of those relpartbound is fairly new. And pretty much unbounded in
> > size. Aren't we going to run into issues because pg_class doesn't have a
> > toast table? It's quite reasonable to use a multi-field composite type
> > as a partition boundary...
> 
> Cases where relacl became too large have been known to exist.  I'm not
> sure whether relpartbound can really be that large to change the
> scenario significantly.

Because it's further increasing the size by something unbounded in size,
which'll not uncommonly be large? It makes a fair amount of sense to
partition by multiple columns at once (using the expression syntax).

- Andres



Re: [HACKERS] Relpartbound, toasting and pg_class

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2017-06-12 17:10:28 -0400, Peter Eisentraut wrote:
>> Cases where relacl became too large have been known to exist.  I'm not
>> sure whether relpartbound can really be that large to change the
>> scenario significantly.

> Because it's further increasing the size by something unbounded in size,
> which'll not uncommonly be large? It makes a fair amount of sense to
> partition by multiple columns at once (using the expression syntax).

How about gathering some actual evidence on the point --- ie, how big
a partition expression do you need to make it fall over?
        regards, tom lane



Re: [HACKERS] Relpartbound, toasting and pg_class

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2017-06-12 17:10:28 -0400, Peter Eisentraut wrote:
> >> Cases where relacl became too large have been known to exist.  I'm not
> >> sure whether relpartbound can really be that large to change the
> >> scenario significantly.
> 
> > Because it's further increasing the size by something unbounded in size,
> > which'll not uncommonly be large? It makes a fair amount of sense to
> > partition by multiple columns at once (using the expression syntax).
> 
> How about gathering some actual evidence on the point --- ie, how big
> a partition expression do you need to make it fall over?

You'd need a 2kB expression (after compression) in
relpartbound before you hit a problem here.  I wouldn't worry about it
at this stage ...

Not on point, but this conversation reminded me of
https://www.commandprompt.com/blog/grant_schema_usage_to_2500_users_no_can_do/
wherein you needed 2500 roles in an ACL column before it became a
problem -- and the project's stance is not to bother supporting that
case.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Relpartbound, toasting and pg_class

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> How about gathering some actual evidence on the point --- ie, how big
>> a partition expression do you need to make it fall over?

> You'd need a 2kB expression (after compression) in
> relpartbound before you hit a problem here.  I wouldn't worry about it
> at this stage ...

Actually, as long as the expression was less than ~8KB after compression,
it'd work.  But I don't have a clear idea of complex an expression that
really is --- we've never made much of an effort to make the outfuncs.c
representation compact, so maybe there's an issue here?  As I said,
it'd be worthwhile checking some actual examples rather than guessing.

> Not on point, but this conversation reminded me of
> https://www.commandprompt.com/blog/grant_schema_usage_to_2500_users_no_can_do/
> wherein you needed 2500 roles in an ACL column before it became a
> problem -- and the project's stance is not to bother supporting that
> case.

Quite on point really.  But there we knew how many entries it took to
break it, and we also knew that good practice wouldn't hit the problem
because you'd use groups instead of a lot of individual ACL entries.
I don't think we're in a position yet to just dismiss this question.
        regards, tom lane



Re: [HACKERS] Relpartbound, toasting and pg_class

From
Andres Freund
Date:
On 2017-06-12 18:10:52 -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Tom Lane wrote:
> >> How about gathering some actual evidence on the point --- ie, how big
> >> a partition expression do you need to make it fall over?
> 
> > You'd need a 2kB expression (after compression) in
> > relpartbound before you hit a problem here.  I wouldn't worry about it
> > at this stage ...
> 
> Actually, as long as the expression was less than ~8KB after compression,
> it'd work.  But I don't have a clear idea of complex an expression that
> really is --- we've never made much of an effort to make the outfuncs.c
> representation compact, so maybe there's an issue here?  As I said,
> it'd be worthwhile checking some actual examples rather than guessing.

It's indeed not very compact.  E.g a simple example with small types:

CREATE TABLE partitioned(a int, b int, c timestamptz, data text) PARTITION BY RANGE (a, b, c);
CREATE TABLE partitioned_child1 PARTITION OF partitioned FOR VALUES FROM (1, 1, '2017-01-01') TO (1, 1, '2017-02-01');

And with LIST style partitioning it'd be quite reasonable to have
significantly longer IN() lists, no?  Compression will save us to some
degree here, but it's not going super far, especially with pglz.  I
think we have some hope of compressing out some of the serialization
overhead, but not more.

postgres[6961][1]=# SELECT relname, pg_column_size(relpartbound), length(relpartbound), pg_column_size(pg_class) FROM
pg_classWHERE relpartbound IS NOT NULL;
 
┌───────────────────────────────┬────────────────┬────────┬────────────────┐
│            relname            │ pg_column_size │ length │ pg_column_size │
├───────────────────────────────┼────────────────┼────────┼────────────────┤
│ partitioned_child1            │           1355 │   1351 │           1523 │
│ partitioneded_list_committers │           1130 │   8049 │           1298 │
└───────────────────────────────┴────────────────┴────────┴────────────────┘

We can see in the latter, which just is a LIST partition with every
committers name & username, that compression helps, but in the earlier
example from above it doesn't.


> > Not on point, but this conversation reminded me of
> > https://www.commandprompt.com/blog/grant_schema_usage_to_2500_users_no_can_do/
> > wherein you needed 2500 roles in an ACL column before it became a
> > problem -- and the project's stance is not to bother supporting that
> > case.
> 
> Quite on point really.  But there we knew how many entries it took to
> break it, and we also knew that good practice wouldn't hit the problem
> because you'd use groups instead of a lot of individual ACL entries.
> I don't think we're in a position yet to just dismiss this question.

Yea, I don't think those are entirely comparable.  I'm also not sure
it was actually the right decision back then.

Greetings,

Andres Freund



Re: [HACKERS] Relpartbound, toasting and pg_class

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2017-06-12 18:10:52 -0400, Tom Lane wrote:
>> it'd be worthwhile checking some actual examples rather than guessing.

> It's indeed not very compact.  E.g a simple example with small types:

> CREATE TABLE partitioned(a int, b int, c timestamptz, data text) PARTITION BY RANGE (a, b, c);
> CREATE TABLE partitioned_child1 PARTITION OF partitioned FOR VALUES FROM (1, 1, '2017-01-01') TO (1, 1,
'2017-02-01');

> postgres[6961][1]=# SELECT relname, pg_column_size(relpartbound), length(relpartbound), pg_column_size(pg_class) FROM
pg_classWHERE relpartbound IS NOT NULL; 
> ┌───────────────────────────────┬────────────────┬────────┬────────────────┐
> │            relname            │ pg_column_size │ length │ pg_column_size │
> ├───────────────────────────────┼────────────────┼────────┼────────────────┤
> │ partitioned_child1            │           1355 │   1351 │           1523 │
> │ partitioneded_list_committers │           1130 │   8049 │           1298 │
> └───────────────────────────────┴────────────────┴────────┴────────────────┘

So, counting on my fingers, you'd need something like twenty partitioning
columns before you hit trouble with the RANGE syntax.  I'm willing to live
with that, especially since that's *before* compression.  (Your example
does not show that compression was ineffective; more likely it wasn't
tried, since the pg_class tuple was under 2K.)

The LIST case might be more of a problem, but I'm not sure.  It looks like
that eats circa 150 bytes per value in outfuncs.c format, but they're
*very* repetitive and compress really well.  I get about 16 stored bytes
per value with a long list of integer keys, so it looks like you could
approach 500 values in the LIST before hitting trouble.  Maybe a few less
with wider datatypes.

On the whole, I'm inclined to agree with Peter and Alvaro that this is
fine, at least for the short term.  Even in the long term, I doubt we
need toastability, just a more compact representation than an expression
tree.  bytea storage of an array, perhaps?  Or maybe better, use anyarray
like we do in pg_statistic, so that it prints legibly.
        regards, tom lane



Re: [HACKERS] Relpartbound, toasting and pg_class

From
Andres Freund
Date:
On 2017-06-12 19:00:02 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2017-06-12 18:10:52 -0400, Tom Lane wrote:
> >> it'd be worthwhile checking some actual examples rather than guessing.
> 
> > It's indeed not very compact.  E.g a simple example with small types:
> 
> > CREATE TABLE partitioned(a int, b int, c timestamptz, data text) PARTITION BY RANGE (a, b, c);
> > CREATE TABLE partitioned_child1 PARTITION OF partitioned FOR VALUES FROM (1, 1, '2017-01-01') TO (1, 1,
'2017-02-01');
> 
> > postgres[6961][1]=# SELECT relname, pg_column_size(relpartbound), length(relpartbound), pg_column_size(pg_class)
FROMpg_class WHERE relpartbound IS NOT NULL;
 
> > ┌───────────────────────────────┬────────────────┬────────┬────────────────┐
> > │            relname            │ pg_column_size │ length │ pg_column_size │
> > ├───────────────────────────────┼────────────────┼────────┼────────────────┤
> > │ partitioned_child1            │           1355 │   1351 │           1523 │
> > │ partitioneded_list_committers │           1130 │   8049 │           1298 │
> > └───────────────────────────────┴────────────────┴────────┴────────────────┘
> 
> So, counting on my fingers, you'd need something like twenty partitioning
> columns before you hit trouble with the RANGE syntax.

Well, that's with 4/8 byte wide types.  I'd be surprised if people only
ever used those.  I'd bet quite a bit that people will start using
jsonb, postgis' geometry and such as partition types, even if it makes
most of us cringe.


> On the whole, I'm inclined to agree with Peter and Alvaro that this is
> fine, at least for the short term.  Even in the long term, I doubt we
> need toastability, just a more compact representation than an expression
> tree.  bytea storage of an array, perhaps?  Or maybe better, use anyarray
> like we do in pg_statistic, so that it prints legibly.

I'm not quite sure where the aversion to adding a toast table to
pg_class is coming from?  Why are we ok with arbitrary and hard to
understand restrictions here, and not elsewhere?

Greetings,

Andres Freund



Re: [HACKERS] Relpartbound, toasting and pg_class

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> I'm not quite sure where the aversion to adding a toast table to
> pg_class is coming from?

I'm not at all sure it would work, and would rather not introduce
risks of infinite recursion if they're not necessary.
        regards, tom lane