Thread: Change a constraint's index - ALTER TABLE ... ALTER CONSTRAINT ... USING INDEX ...

Hello, hackers!

I'd like to propose a feature for changing a constraint's index. The
provided patch allows to do it for EXCLUDE, UNIQUE, PRIMARY KEY and
FOREIGN KEY constraints.

Feature description:
ALTER TABLE ... ALTER CONSTRAINT ... USING INDEX ...
Replace a constraint's index with another sufficiently similar index.

Use cases:
    - Removing index bloat [1] (now also achieved by REINDEX 
CONCURRENTLY)
    - Swapping a normal index for an index with INCLUDED columns, or vice 
versa

Example of use:
CREATE TABLE target_tbl (
id integer PRIMARY KEY,
info text
);
CREATE TABLE referencing_tbl (
id_ref integer REFERENCES target_tbl (id)
);
-- Swapping primary key's index for an equivalent index,
-- but with INCLUDE-d attributes.
CREATE UNIQUE INDEX new_idx ON target_tbl (id) INCLUDE (info);
ALTER TABLE target_tbl ALTER CONSTRAINT target_tbl_pkey USING INDEX
new_idx;
ALTER TABLE referencing_tbl ALTER CONSTRAINT referencing_tbl_id_ref_fkey
USING INDEX new_idx;
DROP INDEX target_tbl_pkey;

I'd like to hear your feedback on this feature.
Also, some questions:
1) If the index supporting a UNIQUE or PRIMARY KEY constraint is
changed, should foreign keys also automatically switch to the new index?
Or should the user switch it manually, by using ALTER CONSTRAINT USING
INDEX on the foreign key?
2) Whose name should change to fit the other - constraint's or index's?

[1] 
https://www.postgresql.org/message-id/flat/CABwTF4UxTg%2BkERo1Nd4dt%2BH2miJoLPcASMFecS1-XHijABOpPg%40mail.gmail.com

P.S. I apologize for resending the email, the previous one was sent as a 
response to another thread by mistake.

-- 
Anna Akenteva
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment
On 2020-Jul-05, Anna Akenteva wrote:

> -- Swapping primary key's index for an equivalent index,
> -- but with INCLUDE-d attributes.
> CREATE UNIQUE INDEX new_idx ON target_tbl (id) INCLUDE (info);
> ALTER TABLE target_tbl ALTER CONSTRAINT target_tbl_pkey USING INDEX
> new_idx;
> ALTER TABLE referencing_tbl ALTER CONSTRAINT referencing_tbl_id_ref_fkey
> USING INDEX new_idx;

How is this state represented by pg_dump?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2020-Jul-05, Anna Akenteva wrote:
>> -- Swapping primary key's index for an equivalent index,
>> -- but with INCLUDE-d attributes.
>> CREATE UNIQUE INDEX new_idx ON target_tbl (id) INCLUDE (info);
>> ALTER TABLE target_tbl ALTER CONSTRAINT target_tbl_pkey USING INDEX
>> new_idx;
>> ALTER TABLE referencing_tbl ALTER CONSTRAINT referencing_tbl_id_ref_fkey
>> USING INDEX new_idx;

> How is this state represented by pg_dump?

Even if it's possible to represent, I think we should flat out reject
this "feature".  Primary keys that aren't primary keys don't seem like
a good idea.  For one thing, it won't be possible to describe the
constraint accurately in the information_schema.

            regards, tom lane



On 2020-07-07 01:08, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> On 2020-Jul-05, Anna Akenteva wrote:
>>> -- Swapping primary key's index for an equivalent index,
>>> -- but with INCLUDE-d attributes.
>>> CREATE UNIQUE INDEX new_idx ON target_tbl (id) INCLUDE (info);
>>> ALTER TABLE target_tbl ALTER CONSTRAINT target_tbl_pkey USING INDEX
>>> new_idx;
>>> ALTER TABLE referencing_tbl ALTER CONSTRAINT 
>>> referencing_tbl_id_ref_fkey
>>> USING INDEX new_idx;
> 
>> How is this state represented by pg_dump?
> 
> Even if it's possible to represent, I think we should flat out reject
> this "feature".  Primary keys that aren't primary keys don't seem like
> a good idea.  For one thing, it won't be possible to describe the
> constraint accurately in the information_schema.

Do you think it could still be a good idea if we only swap the 
relfilenodes of indexes, as it was suggested in [1]? The original use 
case was getting rid of index bloat, which is now solved by REINDEX 
CONCURRENTLY, but this feature still has its own use case of adding 
INCLUDE-d columns to constraint indexes.

[1]
https://www.postgresql.org/message-id/flat/CABwTF4UxTg%2BkERo1Nd4dt%2BH2miJoLPcASMFecS1-XHijABOpPg%40mail.gmail.com

-- 
Anna Akenteva
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



On Mon, 2020-08-10 at 09:29 +0300, Anna Akenteva wrote:
> On 2020-07-07 01:08, Tom Lane wrote:
> 
> > Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > > On 2020-Jul-05, Anna Akenteva wrote:
> > > > -- Swapping primary key's index for an equivalent index,
> > > > -- but with INCLUDE-d attributes.
> > > > CREATE UNIQUE INDEX new_idx ON target_tbl (id) INCLUDE (info);
> > > > ALTER TABLE target_tbl ALTER CONSTRAINT target_tbl_pkey USING INDEX
> > > > new_idx;
> > > > ALTER TABLE referencing_tbl ALTER CONSTRAINT 
> > > > referencing_tbl_id_ref_fkey
> > > > USING INDEX new_idx;
> > > How is this state represented by pg_dump?
> > Even if it's possible to represent, I think we should flat out reject
> > this "feature".  Primary keys that aren't primary keys don't seem like
> > a good idea.  For one thing, it won't be possible to describe the
> > constraint accurately in the information_schema.
> 
> 
> Do you think it could still be a good idea if we only swap the 
> relfilenodes of indexes, as it was suggested in [1]? The original use 
> case was getting rid of index bloat, which is now solved by REINDEX 
> CONCURRENTLY, but this feature still has its own use case of adding 
> INCLUDE-d columns to constraint indexes.

How can you just swap the filenodes if "indnatts" and "indkey" is
different, since one index has an INCLUDE clause?

I think that the original proposal is better, except that foreign key
dependencies should be changed along with the primary or unique index,
so that everything is consistent once the command is done.

Then the ALTER CONSTRAINT from that replaces the index referenced
by a foreign key becomes unnecessary and should be removed.

The value I see in this is:
- replacing a primary key index
- replacing the index behind a constraint targeted by a foreign key

Some code comments:

+   <varlistentry>
+    <term><literal>ALTER CONSTRAINT</literal> <replaceable class="parameter">constraint_name</replaceable> [USING
INDEX<replaceable class="para>
 
+    <listitem>
+     <para>
+      For uniqueness, primary key, and exclusion constraints, this form
+      replaces the original index and renames the constraint accordingly.

You forgot to mention foreign keys.

+   /* This function might need modificatoins if pg_index gets new fields */
+   Assert(Natts_pg_index == 20);

Typo.

+   if (!equal(RelationGetIndexExpressions(oldIndex),
+              RelationGetIndexExpressions(newIndex)))
+       return "Indexes must have the same non-column attributes";

Correct me if I am wrong, but constraint indexes can never use
expressions.  So this should be covered by comparing the key
attributes above (they would be 0 for an expression).

+   if (!equal(oldPredicate, newPredicate))
+   {
+       if (oldPredicate && newPredicate)
+           return "Indexes must have the same partial index predicates";
+       else
+           return "Either none or both indexes must have partial index predicates";
+   }

A constraint index can never have predicates.  Only the new index would
have to be checked.

+/*
+ * ALTER TABLE ALTER CONSTRAINT USING INDEX
+ *
+ * Replace an index of a constraint.
+ *
+ * Currently only works for UNIQUE, EXCLUSION and PRIMARY constraints.

You forgot foreign key constraints (although I think they should not be allowed).


I'll set the commitfest entry to "waiting for author".

Yours,
Laurenz Albe




On 2020-Sep-04, Laurenz Albe wrote:

> The value I see in this is:
> - replacing a primary key index
> - replacing the index behind a constraint targeted by a foreign key

But why is this better than using REINDEX CONCURRENTLY?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Fri, 2020-09-04 at 10:41 -0400, Alvaro Herrera wrote:
> > The value I see in this is:
> > - replacing a primary key index
> > - replacing the index behind a constraint targeted by a foreign key
> 
> But why is this better than using REINDEX CONCURRENTLY?

It is not better, but it can be used to replace a constraint index
with an index with a different INCLUDE clause, which is something
that cannot easily be done otherwise.

For exclusion constraints it is pretty useless, and for unique
constraints it can be worked around with CREATE UNIQUE INDEX CONCURRENTLY.

Admitted, the use case is pretty narrow, and I am not sure if it is
worth adding code and SQL syntax for that.

Yours,
Laurenz Albe




On 2020-Sep-04, Laurenz Albe wrote:

> On Fri, 2020-09-04 at 10:41 -0400, Alvaro Herrera wrote:
> > > The value I see in this is:
> > > - replacing a primary key index
> > > - replacing the index behind a constraint targeted by a foreign key
> > 
> > But why is this better than using REINDEX CONCURRENTLY?
> 
> It is not better, but it can be used to replace a constraint index
> with an index with a different INCLUDE clause, which is something
> that cannot easily be done otherwise.

I can see that there is value in having an index that serves both a
uniqueness constraint and coverage purposes.  But this seems a pretty
roundabout way to get that -- I think you should have to do "CREATE
UNIQUE INDEX ... INCLUDING ..." instead.  That way, the fact that this
is a Postgres extension remains clear.

55432 14devel 24138=# create table foo (a int not null, b int not null, c int);
CREATE TABLE
Duración: 1,775 ms
55432 14devel 24138=# create unique index on foo (a, b) include (c);
CREATE INDEX
Duración: 1,481 ms
55432 14devel 24138=# create table bar (a int not null, b int not null, foreign key (a, b) references foo (a, b));
                                                                    
 
CREATE TABLE
Duración: 2,559 ms

Now you have a normal index that you can reindex in the normal way, if you need
it.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Fri, 2020-09-04 at 13:31 -0400, Alvaro Herrera wrote:
> On 2020-Sep-04, Laurenz Albe wrote:
> > On Fri, 2020-09-04 at 10:41 -0400, Alvaro Herrera wrote:
> > > > The value I see in this is:
> > > > - replacing a primary key index
> > > > - replacing the index behind a constraint targeted by a foreign key
> > > But why is this better than using REINDEX CONCURRENTLY?
> > It is not better, but it can be used to replace a constraint index
> > with an index with a different INCLUDE clause, which is something
> > that cannot easily be done otherwise.
> 
> 
> I can see that there is value in having an index that serves both a
> uniqueness constraint and coverage purposes.  But this seems a pretty
> roundabout way to get that -- I think you should have to do "CREATE
> UNIQUE INDEX ... INCLUDING ..." instead.  That way, the fact that this
> is a Postgres extension remains clear.
> 
> 55432 14devel 24138=# create table foo (a int not null, b int not null, c int);
> CREATE TABLE
> Duración: 1,775 ms
> 55432 14devel 24138=# create unique index on foo (a, b) include (c);
> CREATE INDEX
> Duración: 1,481 ms
> 55432 14devel 24138=# create table bar (a int not null, b int not null, foreign key (a, b) references foo (a, b));
                                                                      
 
> CREATE TABLE
> Duración: 2,559 ms
> 
> Now you have a normal index that you can reindex in the normal way, if you need
> it.

Yes, that is true.

But what if you have done

  CREATE TABLE a (id bigint CONSTRAINT a_pkey PRIMARY KEY, val integer);
  CREATE TABLE b (id bigint CONSTRAINT b_fkey REFERENCES a);

and later you figure out later that it would actually be better to have
an index ON mytab (id) INCLUDE (val), and you don't want to maintain
two indexes.

Yes, you could do

  CREATE UNIQUE INDEX CONCURRENTLY ind ON a (id) INCLUDE (val);
  ALTER TABLE a ADD UNIQUE USING INDEX ind;
  ALTER TABLE a DROP CONSTRAINT a_pkey CASCADE;
  ALTER TABLE b ADD FOREIGN KEY (id) REFERENCES a(id);

but then you don't have a primary key, and you have to live without
the foreign key for a while.

Adding a primary key to a large table is very painful, because it
locks the table exclusively for a long time.


This patch would provide a more convenient way to do that.

Again, I am not sure if that justifies the effort.

Yours,
Laurenz Albe




On 2020-Sep-07, Laurenz Albe wrote:

> This patch would provide a more convenient way to do that.
> 
> Again, I am not sure if that justifies the effort.

I have to admit I've seen cases where it'd be useful to have included
columns in primary keys.

TBH I think if we really wanted the feature of primary keys with
included columns, we'd have to add it to the PRIMARY KEY syntax rather
than having an ad-hoc ALTER TABLE ALTER CONSTRAINT USING INDEX command
to replace the index underneath.  Then things like pg_dump would work
normally.

(I have an answer for the information_schema question Tom posed; I'd
like to know what's yours.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Mon, 2020-09-07 at 11:42 -0300, Alvaro Herrera wrote:
> > This patch would provide a more convenient way to do that.
> > Again, I am not sure if that justifies the effort.
> 
> I have to admit I've seen cases where it'd be useful to have included
> columns in primary keys.
> 
> TBH I think if we really wanted the feature of primary keys with
> included columns, we'd have to add it to the PRIMARY KEY syntax rather
> than having an ad-hoc ALTER TABLE ALTER CONSTRAINT USING INDEX command
> to replace the index underneath.  Then things like pg_dump would work
> normally.
> 
> (I have an answer for the information_schema question Tom posed; I'd
> like to know what's yours.)

Gah, now I see my mistake.  I was under the impression that a
primary key can have an INCLUDE clause today, which is not true.

So this would introduce that feature in a weird way.
I agree that that is undesirable.

We should at least have

  ALTER TABLE ... ADD PRIMARY KEY (id) INCLUDE (val);

or something before we consider this patch.

As to the information_schema, that could pretend that the INCLUDE
columns just don't exist.

Yours,
Laurenz Albe




On 2020-Sep-08, Laurenz Albe wrote:

> We should at least have
> 
>   ALTER TABLE ... ADD PRIMARY KEY (id) INCLUDE (val);
> 
> or something before we consider this patch.

Agreed.

Now the trick in this new command is to let the user change the included
columns afterwards, which remains useful (since it's clearly reasonable
to change minds after applications using the constraint start to take
shape).

> As to the information_schema, that could pretend that the INCLUDE
> columns just don't exist.

Yeah, that's what I was thinking too, since for all intents and
purposes, from the standard's POV the constraint works the same
regardless of included columns.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Wed, 2020-09-30 at 16:27 +0900, Michael Paquier wrote:
> On Fri, Sep 04, 2020 at 01:59:47PM +0200, Laurenz Albe wrote:
> > I'll set the commitfest entry to "waiting for author".
> 
> This review, as well as any of the follow-up emails, have not been
> answered by the author, so I have marked the patch as returned with
> feedback.

I had the impression that "rejected" would be more appropriate.

It doesn't make sense to introduce a featue whose only remaining
use case is modifying a constraint index in a way that is not
supported currently.

Yours,
Laurenz Albe