Thread: table inheritance versus column compression and storage settings

table inheritance versus column compression and storage settings

From
Peter Eisentraut
Date:
I found the way that MergeAttributes() handles column compression and 
storage settings very weird.

For example, I don't understand the purpose of this error:

create table t1 (a int, b text compression pglz);

create table t1a (b text compression lz4) inherits (t1);
...
ERROR:  42804: column "b" has a compression method conflict
DETAIL:  pglz versus lz4

or this:

create table t2 (a int, b text compression lz4);

create table t12 () inherits (t1, t2);
...
ERROR:  column "b" has a compression method conflict
DETAIL:  pglz versus lz4

And we can't override it in the child, per the first example.

But this works:

create table t1a (a int, b text compression lz4);
alter table t1a inherit t1;

Also, you can change the settings in the child using ALTER TABLE ... SET 
COMPRESSION (which is also how pg_dump will represent the above 
constructions), so the restrictions at CREATE TABLE time don't seem to 
make much sense.

Looking at the code, I suspect these rules were just sort of 
copy-and-pasted from the nearby rules for types and collations.  The 
latter are needed so that a table with inheritance children can present 
a logically consistent view of the data.  But compression and storage 
are physical properties that are not logically visible, so every table 
in an inheritance hierarchy can have their own setting.

I think the rules should be approximately like this (both for 
compression and storage):

- A newly created child inherits the settings from the parent.
- A newly created child can override the settings.
- Attaching a child changes nothing.
- When inheriting from multiple parents with different settings, an 
explicit setting in the child is required.

Thoughts?



Re: table inheritance versus column compression and storage settings

From
Ashutosh Bapat
Date:
On Mon, Dec 4, 2023 at 4:23 PM Peter Eisentraut <peter@eisentraut.org> wrote:

>
> Looking at the code, I suspect these rules were just sort of
> copy-and-pasted from the nearby rules for types and collations.  The
> latter are needed so that a table with inheritance children can present
> a logically consistent view of the data.  But compression and storage
> are physical properties that are not logically visible, so every table
> in an inheritance hierarchy can have their own setting.

Incidentally I was looking at that code yesterday and had the same thoughts.

>
> I think the rules should be approximately like this (both for
> compression and storage):
>
> - A newly created child inherits the settings from the parent.
> - A newly created child can override the settings.
> - Attaching a child changes nothing.

Looks fine to me.

> - When inheriting from multiple parents with different settings, an
> explicit setting in the child is required.

When no explicit setting for child is specified, it will throw an
error as it does today. Right?

--
Best Wishes,
Ashutosh Bapat



Re: table inheritance versus column compression and storage settings

From
Peter Eisentraut
Date:
On 05.12.23 05:26, Ashutosh Bapat wrote:
>> - When inheriting from multiple parents with different settings, an
>> explicit setting in the child is required.
> When no explicit setting for child is specified, it will throw an
> error as it does today. Right?

Yes, it would throw an error, but a different error than today, saying 
something like "the settings in the parents conflict, so you need to 
specify one here to override the conflict".




Re: table inheritance versus column compression and storage settings

From
Ashutosh Bapat
Date:
On Tue, Dec 5, 2023 at 6:28 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 05.12.23 05:26, Ashutosh Bapat wrote:
> >> - When inheriting from multiple parents with different settings, an
> >> explicit setting in the child is required.
> > When no explicit setting for child is specified, it will throw an
> > error as it does today. Right?
>
> Yes, it would throw an error, but a different error than today, saying
> something like "the settings in the parents conflict, so you need to
> specify one here to override the conflict".
>

PFA patch fixing inheritance and compression. It also fixes a crash
reported in [1].

The storage looks more involved. The way it has been coded, the child
always inherits the parent's storage properties.
#create table t1 (a text storage plain);
CREATE TABLE
#create table c1 (b text storage main) inherits(t1);
CREATE TABLE
#create table c1 (a text storage main) inherits(t1);
NOTICE:  merging column "a" with inherited definition
CREATE TABLE
#\d+ t1
                                          Table "public.t1"
 Column | Type | Collation | Nullable | Default | Storage |
Compression | Stats target | Description
--------+------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | text |           |          |         | plain   |
 |              |
Child tables: c1
Access method: heap
#\d+ c1
                                          Table "public.c1"
 Column | Type | Collation | Nullable | Default | Storage |
Compression | Stats target | Description
--------+------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | text |           |          |         | plain   |
 |              |
Inherits: t1
Access method: heap

Observe that c1.a did not have storage "main" but instead inherits
"plain" from t1.

According to the code at

https://github.com/postgres/postgres/blob/6a1ea02c491d16474a6214603dce40b5b122d4d1/src/backend/commands/tablecmds.c#L3253,
there is supposed to be a conflict error. But that does not happen
since child's storage specification is in ColumnDef::storage_name
which is never consulted. The ColumnDef::storage_name is converted to
ColumnDef::storage only in BuildDescForRelation(), after
MergeAttribute() has been finished. There are two ways to fix this
1. In MergeChildAttribute() resolve ColumnDef::storage_name to
ColumnDef::storage before comparing it against inherited property. I
don't like this approach since a. we duplicate the conversion logic in
MergeChildAttribute() and BuildDescForRelation(), b. the conversion
happens only for the attributes which are inherited.

2. Deal with it the same way as compression. Get rid of
ColumnDef::storage altogether. Instead set ColumnDef::storage_name at

https://github.com/postgres/postgres/blob/6a1ea02c491d16474a6214603dce40b5b122d4d1/src/backend/commands/tablecmds.c#L2723.

I am inclined to take the second approach. Let me know if you feel otherwise.

[1] https://www.postgresql.org/message-id/b22a6834-aacb-7b18-0424-a3f5fe889667@gmail.com

--
Best Wishes,
Ashutosh Bapat



Re: table inheritance versus column compression and storage settings

From
Ashutosh Bapat
Date:
On Wed, Jan 31, 2024 at 1:29 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Tue, Dec 5, 2023 at 6:28 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> >
> > On 05.12.23 05:26, Ashutosh Bapat wrote:
> > >> - When inheriting from multiple parents with different settings, an
> > >> explicit setting in the child is required.
> > > When no explicit setting for child is specified, it will throw an
> > > error as it does today. Right?
> >
> > Yes, it would throw an error, but a different error than today, saying
> > something like "the settings in the parents conflict, so you need to
> > specify one here to override the conflict".
> >
>
> PFA patch fixing inheritance and compression. It also fixes a crash
> reported in [1].
>
> The storage looks more involved. The way it has been coded, the child
> always inherits the parent's storage properties.
> #create table t1 (a text storage plain);
> CREATE TABLE
> #create table c1 (b text storage main) inherits(t1);
> CREATE TABLE
> #create table c1 (a text storage main) inherits(t1);
> NOTICE:  merging column "a" with inherited definition
> CREATE TABLE
> #\d+ t1
>                                           Table "public.t1"
>  Column | Type | Collation | Nullable | Default | Storage |
> Compression | Stats target | Description
> --------+------+-----------+----------+---------+---------+-------------+--------------+-------------
>  a      | text |           |          |         | plain   |
>  |              |
> Child tables: c1
> Access method: heap
> #\d+ c1
>                                           Table "public.c1"
>  Column | Type | Collation | Nullable | Default | Storage |
> Compression | Stats target | Description
> --------+------+-----------+----------+---------+---------+-------------+--------------+-------------
>  a      | text |           |          |         | plain   |
>  |              |
> Inherits: t1
> Access method: heap
>
> Observe that c1.a did not have storage "main" but instead inherits
> "plain" from t1.
>
> According to the code at
>
https://github.com/postgres/postgres/blob/6a1ea02c491d16474a6214603dce40b5b122d4d1/src/backend/commands/tablecmds.c#L3253,
> there is supposed to be a conflict error. But that does not happen
> since child's storage specification is in ColumnDef::storage_name
> which is never consulted. The ColumnDef::storage_name is converted to
> ColumnDef::storage only in BuildDescForRelation(), after
> MergeAttribute() has been finished. There are two ways to fix this
> 1. In MergeChildAttribute() resolve ColumnDef::storage_name to
> ColumnDef::storage before comparing it against inherited property. I
> don't like this approach since a. we duplicate the conversion logic in
> MergeChildAttribute() and BuildDescForRelation(), b. the conversion
> happens only for the attributes which are inherited.
>
> 2. Deal with it the same way as compression. Get rid of
> ColumnDef::storage altogether. Instead set ColumnDef::storage_name at
>
https://github.com/postgres/postgres/blob/6a1ea02c491d16474a6214603dce40b5b122d4d1/src/backend/commands/tablecmds.c#L2723.
>
> I am inclined to take the second approach. Let me know if you feel otherwise.

Took the second approach. PFA patches
0001 fixes compression inheritance
0002 fixes storage inheritance

The patches may be committed separately or as a single patch. Keeping
them separate in case we decide to commit one but not the other.

We always set storage even if it's not specified, in which case the
column type's default storage is used. This is slightly different from
compression which defaults to the GUC's value if not set. This has led
to slight difference in the tests.

--
Best Wishes,
Ashutosh Bapat

Attachment

Re: table inheritance versus column compression and storage settings

From
Ashutosh Bapat
Date:
On Wed, Feb 7, 2024 at 12:47 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

> 0001 fixes compression inheritance
> 0002 fixes storage inheritance
>

The first patch does not update compression_1.out which makes CI
unhappy. Here's patchset fixing that.


--
Best Wishes,
Ashutosh Bapat

Attachment

Re: table inheritance versus column compression and storage settings

From
Peter Eisentraut
Date:
On 08.02.24 08:20, Ashutosh Bapat wrote:
> On Wed, Feb 7, 2024 at 12:47 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> 
>> 0001 fixes compression inheritance
>> 0002 fixes storage inheritance
>>
> 
> The first patch does not update compression_1.out which makes CI
> unhappy. Here's patchset fixing that.

The changed behavior looks good to me.  The tests are good, the code 
changes are pretty straightforward.

Did you by any change check that pg_dump dumps the resulting structures 
correctly?  I notice in tablecmds.c that ALTER COLUMN SET STORAGE 
recurses but ALTER COLUMN SET COMPRESSION does not.  I don't understand 
why that is, and I wonder whether it affects pg_dump.




Re: table inheritance versus column compression and storage settings

From
Ashutosh Bapat
Date:
On Mon, Feb 12, 2024 at 8:48 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 08.02.24 08:20, Ashutosh Bapat wrote:
> > On Wed, Feb 7, 2024 at 12:47 PM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> >
> >> 0001 fixes compression inheritance
> >> 0002 fixes storage inheritance
> >>
> >
> > The first patch does not update compression_1.out which makes CI
> > unhappy. Here's patchset fixing that.
>
> The changed behavior looks good to me.  The tests are good, the code
> changes are pretty straightforward.
>
> Did you by any change check that pg_dump dumps the resulting structures
> correctly?  I notice in tablecmds.c that ALTER COLUMN SET STORAGE
> recurses but ALTER COLUMN SET COMPRESSION does not.  I don't understand
> why that is, and I wonder whether it affects pg_dump.
>

I used src/bin/pg_upgrade/t/002_pg_upgrade.pl to test dump and restore
by leaving back the new objects created in compression.sql and
inherit.sql.

COMPRESSION is set using ALTER TABLE ONLY so it affects only the
parent and should not propagate to children. A child inherits the
parent first and then changes compression property. For example
```
CREATE TABLE public.cmparent1 (
    f1 text
);
ALTER TABLE ONLY public.cmparent1 ALTER COLUMN f1 SET COMPRESSION pglz;

CREATE TABLE public.cminh1 (
    f1 text
)
INHERITS (public.cmparent1);
ALTER TABLE ONLY public.cminh1 ALTER COLUMN f1 SET COMPRESSION lz4;
```

Same is true with the STORAGE parameter. Example
```
CREATE TABLE public.stparent1 (
    a text
);
ALTER TABLE ONLY public.stparent1 ALTER COLUMN a SET STORAGE PLAIN;

CREATE TABLE public.stchild1 (
    a text
)
INHERITS (public.stparent1);
ALTER TABLE ONLY public.stchild1 ALTER COLUMN a SET STORAGE PLAIN;
```

I don't think pg_dump would be affected by the difference you noted.

--
Best Wishes,
Ashutosh Bapat



Re: table inheritance versus column compression and storage settings

From
Peter Eisentraut
Date:
I have committed this.  It is great to get this behavior fixed and also 
to get the internals more consistent.  Thanks.



Peter Eisentraut <peter@eisentraut.org> writes:
> I have committed this.  It is great to get this behavior fixed and also 
> to get the internals more consistent.  Thanks.

I find it surprising that the committed patch does not touch
pg_dump.  Is it really true that pg_dump dumps situations with
differing compression/storage settings accurately already?

(Note that it proves little that the pg_upgrade test passes,
since if pg_dump were blind to the settings applicable to a
child table, the second dump would still be blind to them.)

            regards, tom lane



I wrote:
> I find it surprising that the committed patch does not touch
> pg_dump.  Is it really true that pg_dump dumps situations with
> differing compression/storage settings accurately already?

It's worse than I thought.  Run "make installcheck" with
today's HEAD, then:

$ pg_dump -Fc regression >r.dump
$ createdb r2
$ pg_restore -d r2 r.dump
pg_restore: error: could not execute query: ERROR:  column "a" inherits conflicting storage methods
HINT:  To resolve the conflict, specify a storage method explicitly.
Command was: CREATE TABLE public.stchild4 (
    a text
)
INHERITS (public.stparent1, public.stparent2);
ALTER TABLE ONLY public.stchild4 ALTER COLUMN a SET STORAGE MAIN;


pg_restore: error: could not execute query: ERROR:  relation "public.stchild4" does not exist
Command was: ALTER TABLE public.stchild4 OWNER TO postgres;

pg_restore: error: could not execute query: ERROR:  relation "public.stchild4" does not exist
Command was: COPY public.stchild4 (a) FROM stdin;
pg_restore: warning: errors ignored on restore: 3


What I'd intended to compare was the results of the query added to the
regression tests:

regression=# SELECT attrelid::regclass, attname, attstorage FROM pg_attribute
WHERE (attrelid::regclass::name like 'stparent%'
OR attrelid::regclass::name like 'stchild%')
and attname = 'a'
ORDER BY 1, 2;
 attrelid  | attname | attstorage
-----------+---------+------------
 stparent1 | a       | p
 stparent2 | a       | x
 stchild1  | a       | p
 stchild3  | a       | m
 stchild4  | a       | m
 stchild5  | a       | x
 stchild6  | a       | m
(7 rows)

r2=# SELECT attrelid::regclass, attname, attstorage FROM pg_attribute
WHERE (attrelid::regclass::name like 'stparent%'
OR attrelid::regclass::name like 'stchild%')
and attname = 'a'
ORDER BY 1, 2;
 attrelid  | attname | attstorage
-----------+---------+------------
 stparent1 | a       | p
 stchild1  | a       | p
 stchild3  | a       | m
 stparent2 | a       | x
 stchild5  | a       | p
 stchild6  | a       | m
(6 rows)

So not only does stchild4 fail to restore altogether, but stchild5
ends with the wrong attstorage.

This patch definitely needs more work.

            regards, tom lane



Re: table inheritance versus column compression and storage settings

From
Ashutosh Bapat
Date:
On Fri, Feb 16, 2024 at 11:54 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
> > I find it surprising that the committed patch does not touch
> > pg_dump.  Is it really true that pg_dump dumps situations with
> > differing compression/storage settings accurately already?
>
> It's worse than I thought.  Run "make installcheck" with
> today's HEAD, then:
>
> $ pg_dump -Fc regression >r.dump
> $ createdb r2
> $ pg_restore -d r2 r.dump
> pg_restore: error: could not execute query: ERROR:  column "a" inherits conflicting storage methods
> HINT:  To resolve the conflict, specify a storage method explicitly.
> Command was: CREATE TABLE public.stchild4 (
>     a text
> )
> INHERITS (public.stparent1, public.stparent2);
> ALTER TABLE ONLY public.stchild4 ALTER COLUMN a SET STORAGE MAIN;
>
>
> pg_restore: error: could not execute query: ERROR:  relation "public.stchild4" does not exist
> Command was: ALTER TABLE public.stchild4 OWNER TO postgres;
>
> pg_restore: error: could not execute query: ERROR:  relation "public.stchild4" does not exist
> Command was: COPY public.stchild4 (a) FROM stdin;
> pg_restore: warning: errors ignored on restore: 3

Thanks for the test. Let's call this Problem1. I expected
src/bin/pg_upgrade/t/002_pg_upgrade.pl to fail in this case since it
will execute similar steps as you did. And it actually does, except
that it uses binary-upgrade mode. In that mode, INHERITed tables are
dumped in a different manner
-- For binary upgrade, set up inheritance this way.
ALTER TABLE ONLY "public"."stchild4" INHERIT "public"."stparent1";
ALTER TABLE ONLY "public"."stchild4" INHERIT "public"."stparent2";
... snip ...
ALTER TABLE ONLY "public"."stchild4" ALTER COLUMN "a" SET STORAGE MAIN;

that does not lead to the conflict and pg_upgrade does not fail.

>
>
> What I'd intended to compare was the results of the query added to the
> regression tests:
>
> regression=# SELECT attrelid::regclass, attname, attstorage FROM pg_attribute
> WHERE (attrelid::regclass::name like 'stparent%'
> OR attrelid::regclass::name like 'stchild%')
> and attname = 'a'
> ORDER BY 1, 2;
>  attrelid  | attname | attstorage
> -----------+---------+------------
>  stparent1 | a       | p
>  stparent2 | a       | x
>  stchild1  | a       | p
>  stchild3  | a       | m
>  stchild4  | a       | m
>  stchild5  | a       | x
>  stchild6  | a       | m
> (7 rows)
>
> r2=# SELECT attrelid::regclass, attname, attstorage FROM pg_attribute
> WHERE (attrelid::regclass::name like 'stparent%'
> OR attrelid::regclass::name like 'stchild%')
> and attname = 'a'
> ORDER BY 1, 2;
>  attrelid  | attname | attstorage
> -----------+---------+------------
>  stparent1 | a       | p
>  stchild1  | a       | p
>  stchild3  | a       | m
>  stparent2 | a       | x
>  stchild5  | a       | p
>  stchild6  | a       | m
> (6 rows)
>
> So not only does stchild4 fail to restore altogether, but stchild5
> ends with the wrong attstorage.

With binary-upgrade dump and restore stchild5 gets the correct storage value.

Looks like we need a test which pg_dump s regression database and
restores it without going through pg_upgrade.

I think the fix is easy one. Dump the STORAGE and COMPRESSION clauses
with CREATE TABLE for local attributes. Those for inherited attributes
will be dumped separately.

But that will not fix an existing problem described below. Let's call
it Problem2. With HEAD  at commit
57f59396bb51953bb7b957780c7f1b7f67602125 (almost a month back)
$ createdb regression
$ psql -d regression
#create table par1 (a text storage plain);
#create table par2 (a text storage plain);
#create table chld (a text) inherits (par1, par2);
NOTICE:  merging multiple inherited definitions of column "a"
NOTICE:  merging column "a" with inherited definition
-- parent storages conflict after child creation
#alter table par1 alter column a set storage extended;
#SELECT attrelid::regclass, attname, attstorage FROM pg_attribute
 WHERE (attrelid::regclass::name like 'par%'
 OR attrelid::regclass::name like 'chld%')
 and attname = 'a'
 ORDER BY 1, 2;
 attrelid | attname | attstorage
----------+---------+------------
 par1     | a       | x
 par2     | a       | p
 chld     | a       | x
(3 rows)

$ createdb r2
$ pg_dump -Fc regression > /tmp/r.dump
$ pg_restore -d r2 /tmp/r.dump
pg_restore: error: could not execute query: ERROR:  inherited column
"a" has a storage parameter conflict
DETAIL:  EXTENDED versus PLAIN
Command was: CREATE TABLE public.chld (
    a text
)
INHERITS (public.par1, public.par2);

pg_restore: error: could not execute query: ERROR:  relation
"public.chld" does not exist
Command was: ALTER TABLE public.chld OWNER TO ashutosh;

pg_restore: error: could not execute query: ERROR:  relation
"public.chld" does not exist
Command was: COPY public.chld (a) FROM stdin;
pg_restore: warning: errors ignored on restore: 3

Fixing this requires that we dump ALTER TABLE ... ALTER COLUMN SET
STORAGE and COMPRESSION commands after all the tables (at least
children) have been created. That seems to break the way we dump the
whole table together right now. OR dump inherited tables like binary
upgrade mode.

--
Best Wishes,
Ashutosh Bapat



Re: table inheritance versus column compression and storage settings

From
Peter Eisentraut
Date:
I have reverted the patch for now (and re-opened the commitfest entry). 
We should continue to work on this and see if we can at least try to get 
the pg_dump test coverage suitable.


On 19.02.24 12:34, Ashutosh Bapat wrote:
> On Fri, Feb 16, 2024 at 11:54 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> I wrote:
>>> I find it surprising that the committed patch does not touch
>>> pg_dump.  Is it really true that pg_dump dumps situations with
>>> differing compression/storage settings accurately already?
>>
>> It's worse than I thought.  Run "make installcheck" with
>> today's HEAD, then:
>>
>> $ pg_dump -Fc regression >r.dump
>> $ createdb r2
>> $ pg_restore -d r2 r.dump
>> pg_restore: error: could not execute query: ERROR:  column "a" inherits conflicting storage methods
>> HINT:  To resolve the conflict, specify a storage method explicitly.
>> Command was: CREATE TABLE public.stchild4 (
>>      a text
>> )
>> INHERITS (public.stparent1, public.stparent2);
>> ALTER TABLE ONLY public.stchild4 ALTER COLUMN a SET STORAGE MAIN;
>>
>>
>> pg_restore: error: could not execute query: ERROR:  relation "public.stchild4" does not exist
>> Command was: ALTER TABLE public.stchild4 OWNER TO postgres;
>>
>> pg_restore: error: could not execute query: ERROR:  relation "public.stchild4" does not exist
>> Command was: COPY public.stchild4 (a) FROM stdin;
>> pg_restore: warning: errors ignored on restore: 3
> 
> Thanks for the test. Let's call this Problem1. I expected
> src/bin/pg_upgrade/t/002_pg_upgrade.pl to fail in this case since it
> will execute similar steps as you did. And it actually does, except
> that it uses binary-upgrade mode. In that mode, INHERITed tables are
> dumped in a different manner
> -- For binary upgrade, set up inheritance this way.
> ALTER TABLE ONLY "public"."stchild4" INHERIT "public"."stparent1";
> ALTER TABLE ONLY "public"."stchild4" INHERIT "public"."stparent2";
> ... snip ...
> ALTER TABLE ONLY "public"."stchild4" ALTER COLUMN "a" SET STORAGE MAIN;
> 
> that does not lead to the conflict and pg_upgrade does not fail.
> 
>>
>>
>> What I'd intended to compare was the results of the query added to the
>> regression tests:
>>
>> regression=# SELECT attrelid::regclass, attname, attstorage FROM pg_attribute
>> WHERE (attrelid::regclass::name like 'stparent%'
>> OR attrelid::regclass::name like 'stchild%')
>> and attname = 'a'
>> ORDER BY 1, 2;
>>   attrelid  | attname | attstorage
>> -----------+---------+------------
>>   stparent1 | a       | p
>>   stparent2 | a       | x
>>   stchild1  | a       | p
>>   stchild3  | a       | m
>>   stchild4  | a       | m
>>   stchild5  | a       | x
>>   stchild6  | a       | m
>> (7 rows)
>>
>> r2=# SELECT attrelid::regclass, attname, attstorage FROM pg_attribute
>> WHERE (attrelid::regclass::name like 'stparent%'
>> OR attrelid::regclass::name like 'stchild%')
>> and attname = 'a'
>> ORDER BY 1, 2;
>>   attrelid  | attname | attstorage
>> -----------+---------+------------
>>   stparent1 | a       | p
>>   stchild1  | a       | p
>>   stchild3  | a       | m
>>   stparent2 | a       | x
>>   stchild5  | a       | p
>>   stchild6  | a       | m
>> (6 rows)
>>
>> So not only does stchild4 fail to restore altogether, but stchild5
>> ends with the wrong attstorage.
> 
> With binary-upgrade dump and restore stchild5 gets the correct storage value.
> 
> Looks like we need a test which pg_dump s regression database and
> restores it without going through pg_upgrade.
> 
> I think the fix is easy one. Dump the STORAGE and COMPRESSION clauses
> with CREATE TABLE for local attributes. Those for inherited attributes
> will be dumped separately.
> 
> But that will not fix an existing problem described below. Let's call
> it Problem2. With HEAD  at commit
> 57f59396bb51953bb7b957780c7f1b7f67602125 (almost a month back)
> $ createdb regression
> $ psql -d regression
> #create table par1 (a text storage plain);
> #create table par2 (a text storage plain);
> #create table chld (a text) inherits (par1, par2);
> NOTICE:  merging multiple inherited definitions of column "a"
> NOTICE:  merging column "a" with inherited definition
> -- parent storages conflict after child creation
> #alter table par1 alter column a set storage extended;
> #SELECT attrelid::regclass, attname, attstorage FROM pg_attribute
>   WHERE (attrelid::regclass::name like 'par%'
>   OR attrelid::regclass::name like 'chld%')
>   and attname = 'a'
>   ORDER BY 1, 2;
>   attrelid | attname | attstorage
> ----------+---------+------------
>   par1     | a       | x
>   par2     | a       | p
>   chld     | a       | x
> (3 rows)
> 
> $ createdb r2
> $ pg_dump -Fc regression > /tmp/r.dump
> $ pg_restore -d r2 /tmp/r.dump
> pg_restore: error: could not execute query: ERROR:  inherited column
> "a" has a storage parameter conflict
> DETAIL:  EXTENDED versus PLAIN
> Command was: CREATE TABLE public.chld (
>      a text
> )
> INHERITS (public.par1, public.par2);
> 
> pg_restore: error: could not execute query: ERROR:  relation
> "public.chld" does not exist
> Command was: ALTER TABLE public.chld OWNER TO ashutosh;
> 
> pg_restore: error: could not execute query: ERROR:  relation
> "public.chld" does not exist
> Command was: COPY public.chld (a) FROM stdin;
> pg_restore: warning: errors ignored on restore: 3
> 
> Fixing this requires that we dump ALTER TABLE ... ALTER COLUMN SET
> STORAGE and COMPRESSION commands after all the tables (at least
> children) have been created. That seems to break the way we dump the
> whole table together right now. OR dump inherited tables like binary
> upgrade mode.
> 




Re: table inheritance versus column compression and storage settings

From
Ashutosh Bapat
Date:
On Tue, Feb 20, 2024 at 3:51 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> I have reverted the patch for now (and re-opened the commitfest entry).
> We should continue to work on this and see if we can at least try to get
> the pg_dump test coverage suitable.
>

I have started a separate thread for dump/restore test. [1].

Using that test, I found an existing bug:
Consider
CREATE TABLE cminh6 (f1 TEXT);
ALTER TABLE cminh6 INHERIT cmparent1;
f1 remains without compression even after inherit per the current code.
But pg_dump dumps it out as
CREATE TABLE cminh6 (f1 TEXT) INHERIT(cmparent1)
Because of this after restoring cminh6::f1 inherits compression of
cmparent1. So before dump cminh6::f1 has no compression and after
restore it has compression.

I am not sure how to fix this. We want inheritance children to have
their on compression. So ALTER TABLE ... INHERIT ... no passing a
compression onto child is fine. CREATE TABLE .... INHERIT ... passing
compression onto the child being created also looks fine since that's
what we do with other attributes. Only solution I see is to introduce
"none" as a special compression method to indicate "no compression"
and store that instead of NULL in attcompression column. That looks
ugly.

Similar is the case with storage.

[1] https://www.postgresql.org/message-id/CAExHW5uF5V=Cjecx3_Z=7xfh4rg2Wf61PT+hfquzjBqouRzQJQ@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat



Re: table inheritance versus column compression and storage settings

From
Ashutosh Bapat
Date:
On Fri, Feb 23, 2024 at 6:05 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Tue, Feb 20, 2024 at 3:51 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> >
> > I have reverted the patch for now (and re-opened the commitfest entry).
> > We should continue to work on this and see if we can at least try to get
> > the pg_dump test coverage suitable.
> >
>
> I have started a separate thread for dump/restore test. [1].
>
> Using that test, I found an existing bug:
> Consider
> CREATE TABLE cminh6 (f1 TEXT);
> ALTER TABLE cminh6 INHERIT cmparent1;
> f1 remains without compression even after inherit per the current code.
> But pg_dump dumps it out as
> CREATE TABLE cminh6 (f1 TEXT) INHERIT(cmparent1)
> Because of this after restoring cminh6::f1 inherits compression of
> cmparent1. So before dump cminh6::f1 has no compression and after
> restore it has compression.
>
> I am not sure how to fix this. We want inheritance children to have
> their on compression. So ALTER TABLE ... INHERIT ... no passing a
> compression onto child is fine. CREATE TABLE .... INHERIT ... passing
> compression onto the child being created also looks fine since that's
> what we do with other attributes. Only solution I see is to introduce
> "none" as a special compression method to indicate "no compression"
> and store that instead of NULL in attcompression column. That looks
> ugly.

Specifying DEFAULT as COMPRESSION method instead of inventing "none"
works. We should do it only for INHERITed tables.

>
> Similar is the case with storage.
>

Similar to compression, for inherited tables we have to output STORAGE
clause even if it's default.

--
Best Wishes,
Ashutosh Bapat



Re: table inheritance versus column compression and storage settings

From
Ashutosh Bapat
Date:
Hi Peter and Tom,

> On Tue, Feb 20, 2024 at 3:51 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> >
> > I have reverted the patch for now (and re-opened the commitfest entry).
> > We should continue to work on this and see if we can at least try to get
> > the pg_dump test coverage suitable.
> >
>

The pg_dump problems arise because we throw an error when parents have conflicting compression and storage properties. The patch that got reverted, changed this slightly by allowing a child to override parent's properties even when they conflict. It still threw an error when child didn't override and parents conflicted. I guess, MergeAttributes() raises error when it encounters parents with conflicting properties because it can not decide which of the conflicting properties the child should inherit. Instead it could just set the DEFAULT properties when parent properties conflict but child doesn't override. Thus when compression conflicts, child's compression would be set to default and when storage conflicts it will be set to the type's default storage. Child's properties when specified explicitly would override always. This will solve all the pg_dump bugs we saw with the reverted patch and also existing bug I reported earlier.

This change would break backward compatibility but I don't think anybody would rely on error being thrown when parent properties conflict.

What do you think?

--
Best Wishes,
Ashutosh Bapat

Re: table inheritance versus column compression and storage settings

From
Peter Eisentraut
Date:
On 07.03.24 17:54, Ashutosh Bapat wrote:
> The pg_dump problems arise because we throw an error when parents have 
> conflicting compression and storage properties. The patch that got 
> reverted, changed this slightly by allowing a child to override parent's 
> properties even when they conflict. It still threw an error when child 
> didn't override and parents conflicted. I guess, MergeAttributes() 
> raises error when it encounters parents with conflicting properties 
> because it can not decide which of the conflicting properties the child 
> should inherit. Instead it could just set the DEFAULT properties when 
> parent properties conflict but child doesn't override. Thus when 
> compression conflicts, child's compression would be set to default and 
> when storage conflicts it will be set to the type's default storage. 
> Child's properties when specified explicitly would override always. This 
> will solve all the pg_dump bugs we saw with the reverted patch and also 
> existing bug I reported earlier.
> 
> This change would break backward compatibility but I don't think anybody 
> would rely on error being thrown when parent properties conflict.
> 
> What do you think?

At this point in the development cycle, I would rather not undertake 
such changes.  We have already discovered with the previous attempt that 
there are unexpected pitfalls and lacking test coverage.  Also, there 
isn't even a patch yet.  I suggest we drop this for now, or reconsider 
it for PG18, as you wish.




Re: table inheritance versus column compression and storage settings

From
Ashutosh Bapat
Date:


On Thu, Mar 21, 2024 at 3:19 PM Peter Eisentraut <peter@eisentraut.org> wrote:
On 07.03.24 17:54, Ashutosh Bapat wrote:
> The pg_dump problems arise because we throw an error when parents have
> conflicting compression and storage properties. The patch that got
> reverted, changed this slightly by allowing a child to override parent's
> properties even when they conflict. It still threw an error when child
> didn't override and parents conflicted. I guess, MergeAttributes()
> raises error when it encounters parents with conflicting properties
> because it can not decide which of the conflicting properties the child
> should inherit. Instead it could just set the DEFAULT properties when
> parent properties conflict but child doesn't override. Thus when
> compression conflicts, child's compression would be set to default and
> when storage conflicts it will be set to the type's default storage.
> Child's properties when specified explicitly would override always. This
> will solve all the pg_dump bugs we saw with the reverted patch and also
> existing bug I reported earlier.
>
> This change would break backward compatibility but I don't think anybody
> would rely on error being thrown when parent properties conflict.
>
> What do you think?

At this point in the development cycle, I would rather not undertake
such changes.  We have already discovered with the previous attempt that
there are unexpected pitfalls and lacking test coverage.  Also, there
isn't even a patch yet.  I suggest we drop this for now, or reconsider
it for PG18, as you wish.


I am fine with this. Should I mark the CF entry as RWF?

--
Best Wishes,
Ashutosh Bapat