Thread: pg_dump --if-exists --clean when drop index that is partition of a partitioned index
pg_dump --if-exists --clean when drop index that is partition of a partitioned index
From
jian he
Date:
hi. CREATE TABLE tp(c int, a int, b int) PARTITION BY RANGE (b); CREATE TABLE tp_1(c int, a int, b int); ALTER TABLE tp ATTACH PARTITION tp_1 FOR VALUES FROM (0) TO (1); CREATE INDEX t_a_idx ON tp_1(a); CREATE INDEX tp_a_idx ON tp(a); pg_dump --schema=public --if-exists --clean --no-statistics --no-owner --no-data --table-and-children=tp > 1.sql pg_dump output file 1.sql excerpts: ---- DROP INDEX IF EXISTS public.t_a_idx; DROP INDEX IF EXISTS public.tp_a_idx; DROP TABLE IF EXISTS public.tp_1; DROP TABLE IF EXISTS public.tp; ---- if you execute the DROP INDEX IF EXISTS public.t_a_idx; ERROR: cannot drop index t_a_idx because index tp_a_idx requires it HINT: You can drop index tp_a_idx instead. Is this pg_dump output what we expected? SELECT * FROM pg_partition_tree('tp_a_idx'); relid | parentrelid | isleaf | level ----------+-------------+--------+------- tp_a_idx | | f | 0 t_a_idx | tp_a_idx | t | 1 (2 rows)
Re: pg_dump --if-exists --clean when drop index that is partition of a partitioned index
From
Pavel Stehule
Date:
Hi
po 14. 4. 2025 v 7:54 odesílatel jian he <jian.universality@gmail.com> napsal:
hi.
CREATE TABLE tp(c int, a int, b int) PARTITION BY RANGE (b);
CREATE TABLE tp_1(c int, a int, b int);
ALTER TABLE tp ATTACH PARTITION tp_1 FOR VALUES FROM (0) TO (1);
CREATE INDEX t_a_idx ON tp_1(a);
CREATE INDEX tp_a_idx ON tp(a);
pg_dump --schema=public --if-exists --clean --no-statistics
--no-owner --no-data --table-and-children=tp > 1.sql
pg_dump output file 1.sql excerpts:
----
DROP INDEX IF EXISTS public.t_a_idx;
DROP INDEX IF EXISTS public.tp_a_idx;
DROP TABLE IF EXISTS public.tp_1;
DROP TABLE IF EXISTS public.tp;
----
if you execute the
DROP INDEX IF EXISTS public.t_a_idx;
ERROR: cannot drop index t_a_idx because index tp_a_idx requires it
HINT: You can drop index tp_a_idx instead.
Is this pg_dump output what we expected?
It is a bug, I think, the implementation of these parts of code is older than partitioning support, and doesn't do necessary detach.
regards
Pavel
SELECT * FROM pg_partition_tree('tp_a_idx');
relid | parentrelid | isleaf | level
----------+-------------+--------+-------
tp_a_idx | | f | 0
t_a_idx | tp_a_idx | t | 1
(2 rows)
Re: pg_dump --if-exists --clean when drop index that is partition of a partitioned index
From
jian he
Date:
On Mon, Apr 14, 2025 at 2:09 PM Pavel Stehule <pavel.stehule@gmail.com> wrote: > > Hi > > po 14. 4. 2025 v 7:54 odesílatel jian he <jian.universality@gmail.com> napsal: >> >> hi. >> >> CREATE TABLE tp(c int, a int, b int) PARTITION BY RANGE (b); >> CREATE TABLE tp_1(c int, a int, b int); >> ALTER TABLE tp ATTACH PARTITION tp_1 FOR VALUES FROM (0) TO (1); >> CREATE INDEX t_a_idx ON tp_1(a); >> CREATE INDEX tp_a_idx ON tp(a); >> >> pg_dump --schema=public --if-exists --clean --no-statistics >> --no-owner --no-data --table-and-children=tp > 1.sql >> pg_dump output file 1.sql excerpts: >> ---- >> DROP INDEX IF EXISTS public.t_a_idx; >> DROP INDEX IF EXISTS public.tp_a_idx; >> DROP TABLE IF EXISTS public.tp_1; >> DROP TABLE IF EXISTS public.tp; >> ---- >> if you execute the >> DROP INDEX IF EXISTS public.t_a_idx; >> >> ERROR: cannot drop index t_a_idx because index tp_a_idx requires it >> HINT: You can drop index tp_a_idx instead. >> >> Is this pg_dump output what we expected? >> > > It is a bug, I think, the implementation of these parts of code is older than partitioning support, and doesn't do necessarydetach. > seems pretty easy to fix. we only need dropStmt when IndxInfo->parentidx oid is invalid. + if (!OidIsValid(indxinfo->parentidx)) + appendPQExpBuffer(delq, "DROP INDEX %s;\n", qqindxname); I have tested the above changes on PG11, master.
Attachment
Re: pg_dump --if-exists --clean when drop index that is partition of a partitioned index
From
Pavel Stehule
Date:
út 15. 4. 2025 v 7:33 odesílatel jian he <jian.universality@gmail.com> napsal:
On Mon, Apr 14, 2025 at 2:09 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> Hi
>
> po 14. 4. 2025 v 7:54 odesílatel jian he <jian.universality@gmail.com> napsal:
>>
>> hi.
>>
>> CREATE TABLE tp(c int, a int, b int) PARTITION BY RANGE (b);
>> CREATE TABLE tp_1(c int, a int, b int);
>> ALTER TABLE tp ATTACH PARTITION tp_1 FOR VALUES FROM (0) TO (1);
>> CREATE INDEX t_a_idx ON tp_1(a);
>> CREATE INDEX tp_a_idx ON tp(a);
>>
>> pg_dump --schema=public --if-exists --clean --no-statistics
>> --no-owner --no-data --table-and-children=tp > 1.sql
>> pg_dump output file 1.sql excerpts:
>> ----
>> DROP INDEX IF EXISTS public.t_a_idx;
>> DROP INDEX IF EXISTS public.tp_a_idx;
>> DROP TABLE IF EXISTS public.tp_1;
>> DROP TABLE IF EXISTS public.tp;
>> ----
>> if you execute the
>> DROP INDEX IF EXISTS public.t_a_idx;
>>
>> ERROR: cannot drop index t_a_idx because index tp_a_idx requires it
>> HINT: You can drop index tp_a_idx instead.
>>
>> Is this pg_dump output what we expected?
>>
>
> It is a bug, I think, the implementation of these parts of code is older than partitioning support, and doesn't do necessary detach.
>
seems pretty easy to fix.
we only need dropStmt when IndxInfo->parentidx oid is invalid.
+ if (!OidIsValid(indxinfo->parentidx))
+ appendPQExpBuffer(delq, "DROP INDEX %s;\n", qqindxname);
I don't think it is the correct fix.
because then fails CREATE INDEX qqindxname
The motivation for usage --clean and --if-exists option is possibility to call restore idempotently. It should not fail when I do restore in an empty database, and it shouldn't fail if I do restore in an existing database.
Regards
Pavel
I have tested the above changes on PG11, master.
Re: pg_dump --if-exists --clean when drop index that is partition of a partitioned index
From
jian he
Date:
On Tue, Apr 15, 2025 at 1:45 PM Pavel Stehule <pavel.stehule@gmail.com> wrote: >> >> seems pretty easy to fix. >> we only need dropStmt when IndxInfo->parentidx oid is invalid. >> >> + if (!OidIsValid(indxinfo->parentidx)) >> + appendPQExpBuffer(delq, "DROP INDEX %s;\n", qqindxname); > > > I don't think it is the correct fix. > > because then fails CREATE INDEX qqindxname > > The motivation for usage --clean and --if-exists option is possibility to call restore idempotently. It should not failwhen I do restore in an empty database, and it shouldn't fail if I do restore in an existing database. > I am not sure what you mean fails ``CREATE INDEX qqindxname`` ? for example: >> >> CREATE TABLE tp(c int, a int, b int) PARTITION BY RANGE (b); >> >> CREATE TABLE tp_1(c int, a int, b int); >> >> ALTER TABLE tp ATTACH PARTITION tp_1 FOR VALUES FROM (0) TO (1); >> >> CREATE INDEX t_a_idx ON tp_1(a); >> >> CREATE INDEX tp_a_idx ON tp(a); with the patch, pg_dump output will be ``` DROP INDEX IF EXISTS public.tp_a_idx; DROP TABLE IF EXISTS public.tp_11; DROP TABLE IF EXISTS public.tp; .... CREATE INDEX tp_a_idx ON ONLY public.tp USING btree (a); CREATE INDEX tp_11_a_idx ON public.tp_11 USING btree (a); ALTER INDEX public.tp_a_idx ATTACH PARTITION public.tp_11_a_idx; ``` What's your expectation?
Re: pg_dump --if-exists --clean when drop index that is partition of a partitioned index
From
Pavel Stehule
Date:
út 15. 4. 2025 v 8:52 odesílatel jian he <jian.universality@gmail.com> napsal:
On Tue, Apr 15, 2025 at 1:45 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>
>> seems pretty easy to fix.
>> we only need dropStmt when IndxInfo->parentidx oid is invalid.
>>
>> + if (!OidIsValid(indxinfo->parentidx))
>> + appendPQExpBuffer(delq, "DROP INDEX %s;\n", qqindxname);
>
>
> I don't think it is the correct fix.
>
> because then fails CREATE INDEX qqindxname
>
> The motivation for usage --clean and --if-exists option is possibility to call restore idempotently. It should not fail when I do restore in an empty database, and it shouldn't fail if I do restore in an existing database.
>
I am not sure what you mean fails
``CREATE INDEX qqindxname``
?
for example:
>> >> CREATE TABLE tp(c int, a int, b int) PARTITION BY RANGE (b);
>> >> CREATE TABLE tp_1(c int, a int, b int);
>> >> ALTER TABLE tp ATTACH PARTITION tp_1 FOR VALUES FROM (0) TO (1);
>> >> CREATE INDEX t_a_idx ON tp_1(a);
>> >> CREATE INDEX tp_a_idx ON tp(a);
with the patch, pg_dump output will be
```
DROP INDEX IF EXISTS public.tp_a_idx;
DROP TABLE IF EXISTS public.tp_11;
DROP TABLE IF EXISTS public.tp;
....
CREATE INDEX tp_a_idx ON ONLY public.tp USING btree (a);
CREATE INDEX tp_11_a_idx ON public.tp_11 USING btree (a);
ALTER INDEX public.tp_a_idx ATTACH PARTITION public.tp_11_a_idx;
```
What's your expectation?
I am not sure at this moment.
There is a question why dump exports DROP INDEX, although the index will be dropped implicitly when table is dropped.
So your fix can work, because all indexes will be dropped by DROP TABLE, but on second hand it is not consistent with the current behavior of pg_dump, where indexes are dropped explicitly.
I have not knowledge, why pg_dump exports DROP INDEX explicitly, although it is redundant.
Regards
Pavel
Re: pg_dump --if-exists --clean when drop index that is partition of a partitioned index
From
Tom Lane
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes: > I have not knowledge, why pg_dump exports DROP INDEX explicitly, although > it is redundant. I concur that the proposed patch is probably not the right fix. It just looks wrong and arbitrary. My first thought about the right way to fix it was that the INDEX ATTACH TOC entry should have a dropStmt that is "ALTER ... DETACH PARTITION", since the way --clean works is to issue the dropStmts in reverse order. However, there's no ALTER INDEX DETACH PARTITION command. Also, looking at dumpIndexAttach, there's a comment saying very explicitly not to: /* * There is no point in creating a drop query as the drop is done by * index drop. (If you think to change this, see also * _printTocEntry().) Although this object doesn't really have * ownership as such, set the owner field anyway to ensure that the * command is run by the correct role at restore time. */ My guess is that that was correct when written and has been falsified by later changes. So I think the next thing to do is run down exactly what broke it and then decide if that change was buggy or if the comment is just obsolete. If it's obsolete then we have to figure out what to do next. regards, tom lane
Re: pg_dump --if-exists --clean when drop index that is partition of a partitioned index
From
Tom Lane
Date:
After thinking about this a bit more: the problem is basically one of things being done in the wrong order. And the way to ensure that things get done in the right order is to set up the object dependencies correctly. So what we need to do is to switch the dependency direction between the child and parent index objects, as attached. The effect of this is to switch both the creation order and the drop order of the child and parent indexes. Normally, flipping the creation order would be problematic. But for partitioned indexes it doesn't matter because there's not actually any connection between the child and parent indexes until we issue INDEX ATTACH. The INDEX ATTACH object has dependencies on both, so it doesn't come out till after both, so all that still works. However ... this is still only a partial fix. It fixes the problem as-presented, where you used both --clean and --if-exists. But if you just say --clean, then the DROP command for the child index fails because it's already gone thanks to dropping the parent index first. Possibly the least messy way to deal with that is to always put IF EXISTS into the DROP command for a child index. This isn't totally problem-free, because then we would have to teach RestoreArchive to not insert IF EXISTS if one is already there, and that probably makes the fix not back-patchable for fear of compatibility problems with unpatched copies of pg_restore. Or we could do what Jian suggested and just not emit any dropStmt for child indexes. I continue to fear that that will have undesirable side-effects, but I have to admit that I'm not sure what. The fact that the backend will automatically drop the child index if we drop either its parent index or its table seems to cover a lot of the edge cases ... but does it cover all of them? regards, tom lane diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index c6e6d3b2b86..961b8f05eef 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -19630,13 +19630,25 @@ getDependencies(Archive *fout) * Ordinarily, table rowtypes have implicit dependencies on their * tables. However, for a composite type the implicit dependency goes * the other way in pg_depend; which is the right thing for DROP but - * it doesn't produce the dependency ordering we need. So in that one + * it doesn't produce the dependency ordering we need. So in that * case, we reverse the direction of the dependency. + * + * Similarly, the server's idea of the dependency direction for + * partitioned indexes is backwards for our purposes, so reverse it. + * This swap has no real effect during the object creation sequence, + * because the leaf and partitioned indexes are not connected until we + * issue INDEX ATTACH. What it does for us is to ensure that when + * --clean mode is used, the DROP INDEX commands come out in an order + * that the server will accept. */ if (deptype == 'i' && dobj->objType == DO_TABLE && refdobj->objType == DO_TYPE) addObjectDependency(refdobj, dobj->dumpId); + else if (deptype == 'P' && + dobj->objType == DO_INDEX && + refdobj->objType == DO_INDEX) + addObjectDependency(refdobj, dobj->dumpId); else /* normal case */ addObjectDependency(dobj, refdobj->dumpId);
Re: pg_dump --if-exists --clean when drop index that is partition of a partitioned index
From
jian he
Date:
On Wed, Apr 16, 2025 at 6:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Or we could do what Jian suggested and just not emit any dropStmt > for child indexes. I continue to fear that that will have > undesirable side-effects, but I have to admit that I'm not sure > what. The fact that the backend will automatically drop the > child index if we drop either its parent index or its table > seems to cover a lot of the edge cases ... but does it cover > all of them? > If pg_dump not produce "DROP INDEX IF EXISTS child_index;" then we are actually tests drop parent index will cascade to child index. If it fails, then execute pg_dump output, CREATE INDEX on child table will fail, error message be like: ERROR: relation "t_a_idx" already exists CREATE TABLE tp(c int, a int, b int) PARTITION BY RANGE (b); CREATE TABLE tp_1(c int, a int, b int); ALTER TABLE tp ATTACH PARTITION tp_1 FOR VALUES FROM (0) TO (1); CREATE INDEX t_a_idx ON tp_1(a); CREATE INDEX tp_a_idx ON tp(a); in this case, the partition index and partitioned index dependency relationship is DEPENDENCY_PARTITION_PRI. per https://www.postgresql.org/docs/current/catalog-pg-depend.html " Example: a child partitioned index is made partition-dependent on both the partition table it is on and the parent partitioned index, so that it goes away if either of those is dropped, but not otherwise. The dependency on the parent index is primary, so that if the user tries to drop the child partitioned index, the error message will suggest dropping the parent index instead (not the table). " from the doc, drop the partitioned index will cascade and also drop the partition index. On Tue, Apr 15, 2025 at 3:08 PM Pavel Stehule <pavel.stehule@gmail.com> wrote: > > > There is a question why dump exports DROP INDEX, although the index will be dropped implicitly when table is dropped. > > So your fix can work, because all indexes will be dropped by DROP TABLE, but on second hand it is not consistent with thecurrent behavior of pg_dump, where indexes are dropped explicitly. > > I have not knowledge, why pg_dump exports DROP INDEX explicitly, although it is redundant. > > Regards > > Pavel > if you specify ``--section=post-data``, then only the index command will be generated. so we actually do need separate dump exports DROP INDEX.