Thread: table inheritance versus column compression and storage settings
I found the way that MergeAttributes() handles column compression and storage settings very weird. For example, I don't understand the purpose of this error: create table t1 (a int, b text compression pglz); create table t1a (b text compression lz4) inherits (t1); ... ERROR: 42804: column "b" has a compression method conflict DETAIL: pglz versus lz4 or this: create table t2 (a int, b text compression lz4); create table t12 () inherits (t1, t2); ... ERROR: column "b" has a compression method conflict DETAIL: pglz versus lz4 And we can't override it in the child, per the first example. But this works: create table t1a (a int, b text compression lz4); alter table t1a inherit t1; Also, you can change the settings in the child using ALTER TABLE ... SET COMPRESSION (which is also how pg_dump will represent the above constructions), so the restrictions at CREATE TABLE time don't seem to make much sense. Looking at the code, I suspect these rules were just sort of copy-and-pasted from the nearby rules for types and collations. The latter are needed so that a table with inheritance children can present a logically consistent view of the data. But compression and storage are physical properties that are not logically visible, so every table in an inheritance hierarchy can have their own setting. I think the rules should be approximately like this (both for compression and storage): - A newly created child inherits the settings from the parent. - A newly created child can override the settings. - Attaching a child changes nothing. - When inheriting from multiple parents with different settings, an explicit setting in the child is required. Thoughts?
On Mon, Dec 4, 2023 at 4:23 PM Peter Eisentraut <peter@eisentraut.org> wrote: > > Looking at the code, I suspect these rules were just sort of > copy-and-pasted from the nearby rules for types and collations. The > latter are needed so that a table with inheritance children can present > a logically consistent view of the data. But compression and storage > are physical properties that are not logically visible, so every table > in an inheritance hierarchy can have their own setting. Incidentally I was looking at that code yesterday and had the same thoughts. > > I think the rules should be approximately like this (both for > compression and storage): > > - A newly created child inherits the settings from the parent. > - A newly created child can override the settings. > - Attaching a child changes nothing. Looks fine to me. > - When inheriting from multiple parents with different settings, an > explicit setting in the child is required. When no explicit setting for child is specified, it will throw an error as it does today. Right? -- Best Wishes, Ashutosh Bapat
On 05.12.23 05:26, Ashutosh Bapat wrote: >> - When inheriting from multiple parents with different settings, an >> explicit setting in the child is required. > When no explicit setting for child is specified, it will throw an > error as it does today. Right? Yes, it would throw an error, but a different error than today, saying something like "the settings in the parents conflict, so you need to specify one here to override the conflict".
On Tue, Dec 5, 2023 at 6:28 PM Peter Eisentraut <peter@eisentraut.org> wrote: > > On 05.12.23 05:26, Ashutosh Bapat wrote: > >> - When inheriting from multiple parents with different settings, an > >> explicit setting in the child is required. > > When no explicit setting for child is specified, it will throw an > > error as it does today. Right? > > Yes, it would throw an error, but a different error than today, saying > something like "the settings in the parents conflict, so you need to > specify one here to override the conflict". > PFA patch fixing inheritance and compression. It also fixes a crash reported in [1]. The storage looks more involved. The way it has been coded, the child always inherits the parent's storage properties. #create table t1 (a text storage plain); CREATE TABLE #create table c1 (b text storage main) inherits(t1); CREATE TABLE #create table c1 (a text storage main) inherits(t1); NOTICE: merging column "a" with inherited definition CREATE TABLE #\d+ t1 Table "public.t1" Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description --------+------+-----------+----------+---------+---------+-------------+--------------+------------- a | text | | | | plain | | | Child tables: c1 Access method: heap #\d+ c1 Table "public.c1" Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description --------+------+-----------+----------+---------+---------+-------------+--------------+------------- a | text | | | | plain | | | Inherits: t1 Access method: heap Observe that c1.a did not have storage "main" but instead inherits "plain" from t1. According to the code at https://github.com/postgres/postgres/blob/6a1ea02c491d16474a6214603dce40b5b122d4d1/src/backend/commands/tablecmds.c#L3253, there is supposed to be a conflict error. But that does not happen since child's storage specification is in ColumnDef::storage_name which is never consulted. The ColumnDef::storage_name is converted to ColumnDef::storage only in BuildDescForRelation(), after MergeAttribute() has been finished. There are two ways to fix this 1. In MergeChildAttribute() resolve ColumnDef::storage_name to ColumnDef::storage before comparing it against inherited property. I don't like this approach since a. we duplicate the conversion logic in MergeChildAttribute() and BuildDescForRelation(), b. the conversion happens only for the attributes which are inherited. 2. Deal with it the same way as compression. Get rid of ColumnDef::storage altogether. Instead set ColumnDef::storage_name at https://github.com/postgres/postgres/blob/6a1ea02c491d16474a6214603dce40b5b122d4d1/src/backend/commands/tablecmds.c#L2723. I am inclined to take the second approach. Let me know if you feel otherwise. [1] https://www.postgresql.org/message-id/b22a6834-aacb-7b18-0424-a3f5fe889667@gmail.com -- Best Wishes, Ashutosh Bapat
On Wed, Jan 31, 2024 at 1:29 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Tue, Dec 5, 2023 at 6:28 PM Peter Eisentraut <peter@eisentraut.org> wrote: > > > > On 05.12.23 05:26, Ashutosh Bapat wrote: > > >> - When inheriting from multiple parents with different settings, an > > >> explicit setting in the child is required. > > > When no explicit setting for child is specified, it will throw an > > > error as it does today. Right? > > > > Yes, it would throw an error, but a different error than today, saying > > something like "the settings in the parents conflict, so you need to > > specify one here to override the conflict". > > > > PFA patch fixing inheritance and compression. It also fixes a crash > reported in [1]. > > The storage looks more involved. The way it has been coded, the child > always inherits the parent's storage properties. > #create table t1 (a text storage plain); > CREATE TABLE > #create table c1 (b text storage main) inherits(t1); > CREATE TABLE > #create table c1 (a text storage main) inherits(t1); > NOTICE: merging column "a" with inherited definition > CREATE TABLE > #\d+ t1 > Table "public.t1" > Column | Type | Collation | Nullable | Default | Storage | > Compression | Stats target | Description > --------+------+-----------+----------+---------+---------+-------------+--------------+------------- > a | text | | | | plain | > | | > Child tables: c1 > Access method: heap > #\d+ c1 > Table "public.c1" > Column | Type | Collation | Nullable | Default | Storage | > Compression | Stats target | Description > --------+------+-----------+----------+---------+---------+-------------+--------------+------------- > a | text | | | | plain | > | | > Inherits: t1 > Access method: heap > > Observe that c1.a did not have storage "main" but instead inherits > "plain" from t1. > > According to the code at > https://github.com/postgres/postgres/blob/6a1ea02c491d16474a6214603dce40b5b122d4d1/src/backend/commands/tablecmds.c#L3253, > there is supposed to be a conflict error. But that does not happen > since child's storage specification is in ColumnDef::storage_name > which is never consulted. The ColumnDef::storage_name is converted to > ColumnDef::storage only in BuildDescForRelation(), after > MergeAttribute() has been finished. There are two ways to fix this > 1. In MergeChildAttribute() resolve ColumnDef::storage_name to > ColumnDef::storage before comparing it against inherited property. I > don't like this approach since a. we duplicate the conversion logic in > MergeChildAttribute() and BuildDescForRelation(), b. the conversion > happens only for the attributes which are inherited. > > 2. Deal with it the same way as compression. Get rid of > ColumnDef::storage altogether. Instead set ColumnDef::storage_name at > https://github.com/postgres/postgres/blob/6a1ea02c491d16474a6214603dce40b5b122d4d1/src/backend/commands/tablecmds.c#L2723. > > I am inclined to take the second approach. Let me know if you feel otherwise. Took the second approach. PFA patches 0001 fixes compression inheritance 0002 fixes storage inheritance The patches may be committed separately or as a single patch. Keeping them separate in case we decide to commit one but not the other. We always set storage even if it's not specified, in which case the column type's default storage is used. This is slightly different from compression which defaults to the GUC's value if not set. This has led to slight difference in the tests. -- Best Wishes, Ashutosh Bapat
Attachment
On Wed, Feb 7, 2024 at 12:47 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > 0001 fixes compression inheritance > 0002 fixes storage inheritance > The first patch does not update compression_1.out which makes CI unhappy. Here's patchset fixing that. -- Best Wishes, Ashutosh Bapat
Attachment
On 08.02.24 08:20, Ashutosh Bapat wrote: > On Wed, Feb 7, 2024 at 12:47 PM Ashutosh Bapat > <ashutosh.bapat.oss@gmail.com> wrote: > >> 0001 fixes compression inheritance >> 0002 fixes storage inheritance >> > > The first patch does not update compression_1.out which makes CI > unhappy. Here's patchset fixing that. The changed behavior looks good to me. The tests are good, the code changes are pretty straightforward. Did you by any change check that pg_dump dumps the resulting structures correctly? I notice in tablecmds.c that ALTER COLUMN SET STORAGE recurses but ALTER COLUMN SET COMPRESSION does not. I don't understand why that is, and I wonder whether it affects pg_dump.
On Mon, Feb 12, 2024 at 8:48 PM Peter Eisentraut <peter@eisentraut.org> wrote: > > On 08.02.24 08:20, Ashutosh Bapat wrote: > > On Wed, Feb 7, 2024 at 12:47 PM Ashutosh Bapat > > <ashutosh.bapat.oss@gmail.com> wrote: > > > >> 0001 fixes compression inheritance > >> 0002 fixes storage inheritance > >> > > > > The first patch does not update compression_1.out which makes CI > > unhappy. Here's patchset fixing that. > > The changed behavior looks good to me. The tests are good, the code > changes are pretty straightforward. > > Did you by any change check that pg_dump dumps the resulting structures > correctly? I notice in tablecmds.c that ALTER COLUMN SET STORAGE > recurses but ALTER COLUMN SET COMPRESSION does not. I don't understand > why that is, and I wonder whether it affects pg_dump. > I used src/bin/pg_upgrade/t/002_pg_upgrade.pl to test dump and restore by leaving back the new objects created in compression.sql and inherit.sql. COMPRESSION is set using ALTER TABLE ONLY so it affects only the parent and should not propagate to children. A child inherits the parent first and then changes compression property. For example ``` CREATE TABLE public.cmparent1 ( f1 text ); ALTER TABLE ONLY public.cmparent1 ALTER COLUMN f1 SET COMPRESSION pglz; CREATE TABLE public.cminh1 ( f1 text ) INHERITS (public.cmparent1); ALTER TABLE ONLY public.cminh1 ALTER COLUMN f1 SET COMPRESSION lz4; ``` Same is true with the STORAGE parameter. Example ``` CREATE TABLE public.stparent1 ( a text ); ALTER TABLE ONLY public.stparent1 ALTER COLUMN a SET STORAGE PLAIN; CREATE TABLE public.stchild1 ( a text ) INHERITS (public.stparent1); ALTER TABLE ONLY public.stchild1 ALTER COLUMN a SET STORAGE PLAIN; ``` I don't think pg_dump would be affected by the difference you noted. -- Best Wishes, Ashutosh Bapat
I have committed this. It is great to get this behavior fixed and also to get the internals more consistent. Thanks.
Peter Eisentraut <peter@eisentraut.org> writes: > I have committed this. It is great to get this behavior fixed and also > to get the internals more consistent. Thanks. I find it surprising that the committed patch does not touch pg_dump. Is it really true that pg_dump dumps situations with differing compression/storage settings accurately already? (Note that it proves little that the pg_upgrade test passes, since if pg_dump were blind to the settings applicable to a child table, the second dump would still be blind to them.) regards, tom lane
I wrote: > I find it surprising that the committed patch does not touch > pg_dump. Is it really true that pg_dump dumps situations with > differing compression/storage settings accurately already? It's worse than I thought. Run "make installcheck" with today's HEAD, then: $ pg_dump -Fc regression >r.dump $ createdb r2 $ pg_restore -d r2 r.dump pg_restore: error: could not execute query: ERROR: column "a" inherits conflicting storage methods HINT: To resolve the conflict, specify a storage method explicitly. Command was: CREATE TABLE public.stchild4 ( a text ) INHERITS (public.stparent1, public.stparent2); ALTER TABLE ONLY public.stchild4 ALTER COLUMN a SET STORAGE MAIN; pg_restore: error: could not execute query: ERROR: relation "public.stchild4" does not exist Command was: ALTER TABLE public.stchild4 OWNER TO postgres; pg_restore: error: could not execute query: ERROR: relation "public.stchild4" does not exist Command was: COPY public.stchild4 (a) FROM stdin; pg_restore: warning: errors ignored on restore: 3 What I'd intended to compare was the results of the query added to the regression tests: regression=# SELECT attrelid::regclass, attname, attstorage FROM pg_attribute WHERE (attrelid::regclass::name like 'stparent%' OR attrelid::regclass::name like 'stchild%') and attname = 'a' ORDER BY 1, 2; attrelid | attname | attstorage -----------+---------+------------ stparent1 | a | p stparent2 | a | x stchild1 | a | p stchild3 | a | m stchild4 | a | m stchild5 | a | x stchild6 | a | m (7 rows) r2=# SELECT attrelid::regclass, attname, attstorage FROM pg_attribute WHERE (attrelid::regclass::name like 'stparent%' OR attrelid::regclass::name like 'stchild%') and attname = 'a' ORDER BY 1, 2; attrelid | attname | attstorage -----------+---------+------------ stparent1 | a | p stchild1 | a | p stchild3 | a | m stparent2 | a | x stchild5 | a | p stchild6 | a | m (6 rows) So not only does stchild4 fail to restore altogether, but stchild5 ends with the wrong attstorage. This patch definitely needs more work. regards, tom lane
On Fri, Feb 16, 2024 at 11:54 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I wrote: > > I find it surprising that the committed patch does not touch > > pg_dump. Is it really true that pg_dump dumps situations with > > differing compression/storage settings accurately already? > > It's worse than I thought. Run "make installcheck" with > today's HEAD, then: > > $ pg_dump -Fc regression >r.dump > $ createdb r2 > $ pg_restore -d r2 r.dump > pg_restore: error: could not execute query: ERROR: column "a" inherits conflicting storage methods > HINT: To resolve the conflict, specify a storage method explicitly. > Command was: CREATE TABLE public.stchild4 ( > a text > ) > INHERITS (public.stparent1, public.stparent2); > ALTER TABLE ONLY public.stchild4 ALTER COLUMN a SET STORAGE MAIN; > > > pg_restore: error: could not execute query: ERROR: relation "public.stchild4" does not exist > Command was: ALTER TABLE public.stchild4 OWNER TO postgres; > > pg_restore: error: could not execute query: ERROR: relation "public.stchild4" does not exist > Command was: COPY public.stchild4 (a) FROM stdin; > pg_restore: warning: errors ignored on restore: 3 Thanks for the test. Let's call this Problem1. I expected src/bin/pg_upgrade/t/002_pg_upgrade.pl to fail in this case since it will execute similar steps as you did. And it actually does, except that it uses binary-upgrade mode. In that mode, INHERITed tables are dumped in a different manner -- For binary upgrade, set up inheritance this way. ALTER TABLE ONLY "public"."stchild4" INHERIT "public"."stparent1"; ALTER TABLE ONLY "public"."stchild4" INHERIT "public"."stparent2"; ... snip ... ALTER TABLE ONLY "public"."stchild4" ALTER COLUMN "a" SET STORAGE MAIN; that does not lead to the conflict and pg_upgrade does not fail. > > > What I'd intended to compare was the results of the query added to the > regression tests: > > regression=# SELECT attrelid::regclass, attname, attstorage FROM pg_attribute > WHERE (attrelid::regclass::name like 'stparent%' > OR attrelid::regclass::name like 'stchild%') > and attname = 'a' > ORDER BY 1, 2; > attrelid | attname | attstorage > -----------+---------+------------ > stparent1 | a | p > stparent2 | a | x > stchild1 | a | p > stchild3 | a | m > stchild4 | a | m > stchild5 | a | x > stchild6 | a | m > (7 rows) > > r2=# SELECT attrelid::regclass, attname, attstorage FROM pg_attribute > WHERE (attrelid::regclass::name like 'stparent%' > OR attrelid::regclass::name like 'stchild%') > and attname = 'a' > ORDER BY 1, 2; > attrelid | attname | attstorage > -----------+---------+------------ > stparent1 | a | p > stchild1 | a | p > stchild3 | a | m > stparent2 | a | x > stchild5 | a | p > stchild6 | a | m > (6 rows) > > So not only does stchild4 fail to restore altogether, but stchild5 > ends with the wrong attstorage. With binary-upgrade dump and restore stchild5 gets the correct storage value. Looks like we need a test which pg_dump s regression database and restores it without going through pg_upgrade. I think the fix is easy one. Dump the STORAGE and COMPRESSION clauses with CREATE TABLE for local attributes. Those for inherited attributes will be dumped separately. But that will not fix an existing problem described below. Let's call it Problem2. With HEAD at commit 57f59396bb51953bb7b957780c7f1b7f67602125 (almost a month back) $ createdb regression $ psql -d regression #create table par1 (a text storage plain); #create table par2 (a text storage plain); #create table chld (a text) inherits (par1, par2); NOTICE: merging multiple inherited definitions of column "a" NOTICE: merging column "a" with inherited definition -- parent storages conflict after child creation #alter table par1 alter column a set storage extended; #SELECT attrelid::regclass, attname, attstorage FROM pg_attribute WHERE (attrelid::regclass::name like 'par%' OR attrelid::regclass::name like 'chld%') and attname = 'a' ORDER BY 1, 2; attrelid | attname | attstorage ----------+---------+------------ par1 | a | x par2 | a | p chld | a | x (3 rows) $ createdb r2 $ pg_dump -Fc regression > /tmp/r.dump $ pg_restore -d r2 /tmp/r.dump pg_restore: error: could not execute query: ERROR: inherited column "a" has a storage parameter conflict DETAIL: EXTENDED versus PLAIN Command was: CREATE TABLE public.chld ( a text ) INHERITS (public.par1, public.par2); pg_restore: error: could not execute query: ERROR: relation "public.chld" does not exist Command was: ALTER TABLE public.chld OWNER TO ashutosh; pg_restore: error: could not execute query: ERROR: relation "public.chld" does not exist Command was: COPY public.chld (a) FROM stdin; pg_restore: warning: errors ignored on restore: 3 Fixing this requires that we dump ALTER TABLE ... ALTER COLUMN SET STORAGE and COMPRESSION commands after all the tables (at least children) have been created. That seems to break the way we dump the whole table together right now. OR dump inherited tables like binary upgrade mode. -- Best Wishes, Ashutosh Bapat
I have reverted the patch for now (and re-opened the commitfest entry). We should continue to work on this and see if we can at least try to get the pg_dump test coverage suitable. On 19.02.24 12:34, Ashutosh Bapat wrote: > On Fri, Feb 16, 2024 at 11:54 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> >> I wrote: >>> I find it surprising that the committed patch does not touch >>> pg_dump. Is it really true that pg_dump dumps situations with >>> differing compression/storage settings accurately already? >> >> It's worse than I thought. Run "make installcheck" with >> today's HEAD, then: >> >> $ pg_dump -Fc regression >r.dump >> $ createdb r2 >> $ pg_restore -d r2 r.dump >> pg_restore: error: could not execute query: ERROR: column "a" inherits conflicting storage methods >> HINT: To resolve the conflict, specify a storage method explicitly. >> Command was: CREATE TABLE public.stchild4 ( >> a text >> ) >> INHERITS (public.stparent1, public.stparent2); >> ALTER TABLE ONLY public.stchild4 ALTER COLUMN a SET STORAGE MAIN; >> >> >> pg_restore: error: could not execute query: ERROR: relation "public.stchild4" does not exist >> Command was: ALTER TABLE public.stchild4 OWNER TO postgres; >> >> pg_restore: error: could not execute query: ERROR: relation "public.stchild4" does not exist >> Command was: COPY public.stchild4 (a) FROM stdin; >> pg_restore: warning: errors ignored on restore: 3 > > Thanks for the test. Let's call this Problem1. I expected > src/bin/pg_upgrade/t/002_pg_upgrade.pl to fail in this case since it > will execute similar steps as you did. And it actually does, except > that it uses binary-upgrade mode. In that mode, INHERITed tables are > dumped in a different manner > -- For binary upgrade, set up inheritance this way. > ALTER TABLE ONLY "public"."stchild4" INHERIT "public"."stparent1"; > ALTER TABLE ONLY "public"."stchild4" INHERIT "public"."stparent2"; > ... snip ... > ALTER TABLE ONLY "public"."stchild4" ALTER COLUMN "a" SET STORAGE MAIN; > > that does not lead to the conflict and pg_upgrade does not fail. > >> >> >> What I'd intended to compare was the results of the query added to the >> regression tests: >> >> regression=# SELECT attrelid::regclass, attname, attstorage FROM pg_attribute >> WHERE (attrelid::regclass::name like 'stparent%' >> OR attrelid::regclass::name like 'stchild%') >> and attname = 'a' >> ORDER BY 1, 2; >> attrelid | attname | attstorage >> -----------+---------+------------ >> stparent1 | a | p >> stparent2 | a | x >> stchild1 | a | p >> stchild3 | a | m >> stchild4 | a | m >> stchild5 | a | x >> stchild6 | a | m >> (7 rows) >> >> r2=# SELECT attrelid::regclass, attname, attstorage FROM pg_attribute >> WHERE (attrelid::regclass::name like 'stparent%' >> OR attrelid::regclass::name like 'stchild%') >> and attname = 'a' >> ORDER BY 1, 2; >> attrelid | attname | attstorage >> -----------+---------+------------ >> stparent1 | a | p >> stchild1 | a | p >> stchild3 | a | m >> stparent2 | a | x >> stchild5 | a | p >> stchild6 | a | m >> (6 rows) >> >> So not only does stchild4 fail to restore altogether, but stchild5 >> ends with the wrong attstorage. > > With binary-upgrade dump and restore stchild5 gets the correct storage value. > > Looks like we need a test which pg_dump s regression database and > restores it without going through pg_upgrade. > > I think the fix is easy one. Dump the STORAGE and COMPRESSION clauses > with CREATE TABLE for local attributes. Those for inherited attributes > will be dumped separately. > > But that will not fix an existing problem described below. Let's call > it Problem2. With HEAD at commit > 57f59396bb51953bb7b957780c7f1b7f67602125 (almost a month back) > $ createdb regression > $ psql -d regression > #create table par1 (a text storage plain); > #create table par2 (a text storage plain); > #create table chld (a text) inherits (par1, par2); > NOTICE: merging multiple inherited definitions of column "a" > NOTICE: merging column "a" with inherited definition > -- parent storages conflict after child creation > #alter table par1 alter column a set storage extended; > #SELECT attrelid::regclass, attname, attstorage FROM pg_attribute > WHERE (attrelid::regclass::name like 'par%' > OR attrelid::regclass::name like 'chld%') > and attname = 'a' > ORDER BY 1, 2; > attrelid | attname | attstorage > ----------+---------+------------ > par1 | a | x > par2 | a | p > chld | a | x > (3 rows) > > $ createdb r2 > $ pg_dump -Fc regression > /tmp/r.dump > $ pg_restore -d r2 /tmp/r.dump > pg_restore: error: could not execute query: ERROR: inherited column > "a" has a storage parameter conflict > DETAIL: EXTENDED versus PLAIN > Command was: CREATE TABLE public.chld ( > a text > ) > INHERITS (public.par1, public.par2); > > pg_restore: error: could not execute query: ERROR: relation > "public.chld" does not exist > Command was: ALTER TABLE public.chld OWNER TO ashutosh; > > pg_restore: error: could not execute query: ERROR: relation > "public.chld" does not exist > Command was: COPY public.chld (a) FROM stdin; > pg_restore: warning: errors ignored on restore: 3 > > Fixing this requires that we dump ALTER TABLE ... ALTER COLUMN SET > STORAGE and COMPRESSION commands after all the tables (at least > children) have been created. That seems to break the way we dump the > whole table together right now. OR dump inherited tables like binary > upgrade mode. >
On Tue, Feb 20, 2024 at 3:51 PM Peter Eisentraut <peter@eisentraut.org> wrote: > > I have reverted the patch for now (and re-opened the commitfest entry). > We should continue to work on this and see if we can at least try to get > the pg_dump test coverage suitable. > I have started a separate thread for dump/restore test. [1]. Using that test, I found an existing bug: Consider CREATE TABLE cminh6 (f1 TEXT); ALTER TABLE cminh6 INHERIT cmparent1; f1 remains without compression even after inherit per the current code. But pg_dump dumps it out as CREATE TABLE cminh6 (f1 TEXT) INHERIT(cmparent1) Because of this after restoring cminh6::f1 inherits compression of cmparent1. So before dump cminh6::f1 has no compression and after restore it has compression. I am not sure how to fix this. We want inheritance children to have their on compression. So ALTER TABLE ... INHERIT ... no passing a compression onto child is fine. CREATE TABLE .... INHERIT ... passing compression onto the child being created also looks fine since that's what we do with other attributes. Only solution I see is to introduce "none" as a special compression method to indicate "no compression" and store that instead of NULL in attcompression column. That looks ugly. Similar is the case with storage. [1] https://www.postgresql.org/message-id/CAExHW5uF5V=Cjecx3_Z=7xfh4rg2Wf61PT+hfquzjBqouRzQJQ@mail.gmail.com -- Best Wishes, Ashutosh Bapat
On Fri, Feb 23, 2024 at 6:05 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Tue, Feb 20, 2024 at 3:51 PM Peter Eisentraut <peter@eisentraut.org> wrote: > > > > I have reverted the patch for now (and re-opened the commitfest entry). > > We should continue to work on this and see if we can at least try to get > > the pg_dump test coverage suitable. > > > > I have started a separate thread for dump/restore test. [1]. > > Using that test, I found an existing bug: > Consider > CREATE TABLE cminh6 (f1 TEXT); > ALTER TABLE cminh6 INHERIT cmparent1; > f1 remains without compression even after inherit per the current code. > But pg_dump dumps it out as > CREATE TABLE cminh6 (f1 TEXT) INHERIT(cmparent1) > Because of this after restoring cminh6::f1 inherits compression of > cmparent1. So before dump cminh6::f1 has no compression and after > restore it has compression. > > I am not sure how to fix this. We want inheritance children to have > their on compression. So ALTER TABLE ... INHERIT ... no passing a > compression onto child is fine. CREATE TABLE .... INHERIT ... passing > compression onto the child being created also looks fine since that's > what we do with other attributes. Only solution I see is to introduce > "none" as a special compression method to indicate "no compression" > and store that instead of NULL in attcompression column. That looks > ugly. Specifying DEFAULT as COMPRESSION method instead of inventing "none" works. We should do it only for INHERITed tables. > > Similar is the case with storage. > Similar to compression, for inherited tables we have to output STORAGE clause even if it's default. -- Best Wishes, Ashutosh Bapat
Hi Peter and Tom,
> On Tue, Feb 20, 2024 at 3:51 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> >
> > I have reverted the patch for now (and re-opened the commitfest entry).
> > We should continue to work on this and see if we can at least try to get
> > the pg_dump test coverage suitable.
> >
>
The pg_dump problems arise because we throw an error when parents have conflicting compression and storage properties. The patch that got reverted, changed this slightly by allowing a child to override parent's properties even when they conflict. It still threw an error when child didn't override and parents conflicted. I guess, MergeAttributes() raises error when it encounters parents with conflicting properties because it can not decide which of the conflicting properties the child should inherit. Instead it could just set the DEFAULT properties when parent properties conflict but child doesn't override. Thus when compression conflicts, child's compression would be set to default and when storage conflicts it will be set to the type's default storage. Child's properties when specified explicitly would override always. This will solve all the pg_dump bugs we saw with the reverted patch and also existing bug I reported earlier.
This change would break backward compatibility but I don't think anybody would rely on error being thrown when parent properties conflict.
What do you think?
--
Best Wishes,
Ashutosh Bapat
On 07.03.24 17:54, Ashutosh Bapat wrote: > The pg_dump problems arise because we throw an error when parents have > conflicting compression and storage properties. The patch that got > reverted, changed this slightly by allowing a child to override parent's > properties even when they conflict. It still threw an error when child > didn't override and parents conflicted. I guess, MergeAttributes() > raises error when it encounters parents with conflicting properties > because it can not decide which of the conflicting properties the child > should inherit. Instead it could just set the DEFAULT properties when > parent properties conflict but child doesn't override. Thus when > compression conflicts, child's compression would be set to default and > when storage conflicts it will be set to the type's default storage. > Child's properties when specified explicitly would override always. This > will solve all the pg_dump bugs we saw with the reverted patch and also > existing bug I reported earlier. > > This change would break backward compatibility but I don't think anybody > would rely on error being thrown when parent properties conflict. > > What do you think? At this point in the development cycle, I would rather not undertake such changes. We have already discovered with the previous attempt that there are unexpected pitfalls and lacking test coverage. Also, there isn't even a patch yet. I suggest we drop this for now, or reconsider it for PG18, as you wish.
On Thu, Mar 21, 2024 at 3:19 PM Peter Eisentraut <peter@eisentraut.org> wrote:
On 07.03.24 17:54, Ashutosh Bapat wrote:
> The pg_dump problems arise because we throw an error when parents have
> conflicting compression and storage properties. The patch that got
> reverted, changed this slightly by allowing a child to override parent's
> properties even when they conflict. It still threw an error when child
> didn't override and parents conflicted. I guess, MergeAttributes()
> raises error when it encounters parents with conflicting properties
> because it can not decide which of the conflicting properties the child
> should inherit. Instead it could just set the DEFAULT properties when
> parent properties conflict but child doesn't override. Thus when
> compression conflicts, child's compression would be set to default and
> when storage conflicts it will be set to the type's default storage.
> Child's properties when specified explicitly would override always. This
> will solve all the pg_dump bugs we saw with the reverted patch and also
> existing bug I reported earlier.
>
> This change would break backward compatibility but I don't think anybody
> would rely on error being thrown when parent properties conflict.
>
> What do you think?
At this point in the development cycle, I would rather not undertake
such changes. We have already discovered with the previous attempt that
there are unexpected pitfalls and lacking test coverage. Also, there
isn't even a patch yet. I suggest we drop this for now, or reconsider
it for PG18, as you wish.
I am fine with this. Should I mark the CF entry as RWF?
--
Best Wishes,
Ashutosh Bapat