Thread: Allow replacement of bloated primary key indexes without foreign key rebuilds

<div dir="ltr">Today I learnt [1,2,3] that the feature ALTER TABLE .. ADD CONSTRAINT ... USING INDEX we added back in
theday is not so useful in the field. Having to drop foreign key constraints before this command, and recreate them
afterwardsmakes this command useless to most database setups. I feel sorry that no one brought this up when we were
implementingthe feature; maybe we could've done something about it right then.<br /><br />I wish to correct it now, and
didsome research. Long story short, I realized that the foreign key constraint depends on the index relation of the
primarykey, and does not depend on the constraint object as I had expected (Please see rows 5 and 11 of the result set
shownbelow). This behaviour is also seen when the FKey references a unique constraint. As much as that perplexes me, I
thinkit makes our job a bit easier.<br /><br />All we need to do is allow swapping of pg_class.relfilenode of two
indexes.This will let the dependency entries stand as they are and allow us to drop the bloated primary key index
structurewithout having to rebuild the foreign key constraints.<br /><br />As for the syntactical sugar, this can be
addedto either ALTER TABLE or to ALTER INDEX. Although under no normal circumstances one would need to use ALTER INDEX
toswap two indexes' relfilenode (because one can easily create a duplicate index and drop/rename-in-place the old one),
Ithink it would make more sense here since it is just an operation on two indexes and has nothing to do with the
constraints,apart from the fact that we want to use this feature to meddle with the constraints.<br /><br />Syntax
options:<br/><br />ALTER TABLE tbl REPLACE [CONSTRAINT constr] {PRIMARY KEY | UNIQUE} USING INDEX new_index;<br /><br
/>ALTERINDEX ind REPLACE WITH new_index;<br /><br />Note that in both the syntaxes, it is assumed that all remnants of
new_indexwill be gone after the command completes successfully; that is, the commands will behave as if they deleted
theindex structure of the index being replaced and placed the new structure in its place, while dropping the index that
wasused for the replacement.<br /><br />I don't think we need to ensure that the new_index is completely flushed to
diskbefore the operation, but we do need to issue relevant cache invalidation messages after the operation is done.<br
/><br/>For replacement to be successful, new_index should not be associated with any constraints, and, new_index should
beidentical to the index being replaced, except for the index names.<br /><br />The ALTER TABLE syntax closely emulates
theexisting syntax of replacing a constraint using an existing index, but looking at the grammar construction I feel
thatit may be more complex to implement than the ALTER INDEX syntax.<br /><br /> ALTER INDEX feels easier to do, since
wewon't have to jump through hoops like in ALTER TABLE's multi-command support (ATExec*() functions), and dropping the
new_indexmight be easier to do.<br /><br />Thoughts?<br /><br /><span style="font-family:courier
new,monospace">postgres=#\d+ test</span><br style="font-family:courier new,monospace" /><span
style="font-family:couriernew,monospace">                         Table "public.test"</span><br
style="font-family:couriernew,monospace" /><span style="font-family:courier new,monospace"> Column |  Type   |
Modifiers| Storage | Stats target | Description </span><br style="font-family:courier new,monospace" /><span
style="font-family:courier
new,monospace">--------+---------+-----------+---------+--------------+-------------</span><br
style="font-family:couriernew,monospace" /><span style="font-family:courier new,monospace"> a      | integer | not
null | plain   |              | </span><br style="font-family:courier new,monospace" /><span style="font-family:courier
new,monospace"> b     | integer |           | plain   |              | </span><br style="font-family:courier
new,monospace"/><span style="font-family:courier new,monospace">Indexes:</span><br style="font-family:courier
new,monospace"/><span style="font-family:courier new,monospace">    "test_pkey" PRIMARY KEY, btree (a)</span><br
style="font-family:couriernew,monospace" /><span style="font-family:courier new,monospace">    "temp_idx" UNIQUE, btree
(a)</span><brstyle="font-family:courier new,monospace" /><span style="font-family:courier new,monospace"></span><span
style="font-family:couriernew,monospace">Referenced by:</span><br style="font-family:courier new,monospace" /><span
style="font-family:couriernew,monospace">    TABLE "test2" CONSTRAINT "test2_b_fkey" FOREIGN KEY (b) REFERENCES
test(a)</span><brstyle="font-family:courier new,monospace" /><span style="font-family:courier new,monospace">Has OIDs:
no</span><brstyle="font-family:courier new,monospace" /><br style="font-family:courier new,monospace" /><span
style="font-family:couriernew,monospace">postgres=# \d+ test2</span><br style="font-family:courier new,monospace"
/><spanstyle="font-family:courier new,monospace">                        Table "public.test2"</span><br
style="font-family:couriernew,monospace" /><span style="font-family:courier new,monospace"> Column |  Type   |
Modifiers| Storage | Stats target | Description </span><br style="font-family:courier new,monospace" /><span
style="font-family:courier
new,monospace">--------+---------+-----------+---------+--------------+-------------</span><br
style="font-family:couriernew,monospace" /><span style="font-family:courier new,monospace"> a      | integer
|          | plain   |              | </span><br style="font-family:courier new,monospace" /><span
style="font-family:couriernew,monospace"> b      | integer |           | plain   |              | </span><br
style="font-family:couriernew,monospace" /><span style="font-family:courier new,monospace">Foreign-key
constraints:</span><brstyle="font-family:courier new,monospace" /><span style="font-family:courier new,monospace">   
"test2_b_fkey"FOREIGN KEY (b) REFERENCES test(a)</span><br style="font-family:courier new,monospace" /><span
style="font-family:couriernew,monospace">Has OIDs: no</span><br style="font-family:courier new,monospace" /><br
/>Relevantoutput of query [4] on pg_depend:<br /><br style="font-family:courier new,monospace" /><span
style="font-family:couriernew,monospace">    classid    | objid |            objid             |  refclassid   |
refobjid|   refobjid   | deptype </span><br style="font-family:courier new,monospace" /><span
style="font-family:courier
new,monospace">---------------+-------+------------------------------+---------------+----------+--------------+---------<br
/> pg_class     | 16413 | test                         | pg_namespace  |     2200 | public       | n</span><br
style="font-family:couriernew,monospace" /><span style="font-family:courier new,monospace"> pg_type       | 16415 |
test                        | pg_class      |    16413 | test         | i</span><br style="font-family:courier
new,monospace"/><span style="font-family:courier new,monospace"> pg_type       | 16414 | test[]                       |
pg_type      |    16415 | test         | i</span><br style="font-family:courier new,monospace" /><span
style="font-family:couriernew,monospace"> pg_constraint | 16417 | test_pkey                    | pg_class      |   
16413| test         | a</span><br style="font-family:courier new,monospace" /><span style="font-family:courier
new,monospace"> pg_class     | 16416 | test_pkey                    | pg_constraint |    16417 | test_pkey    |
i</span><brstyle="font-family:courier new,monospace" /><span style="font-family:courier new,monospace"> pg_class      |
16418| test2                        | pg_namespace  |     2200 | public       | n</span><br style="font-family:courier
new,monospace"/><span style="font-family:courier new,monospace"> pg_type       | 16420 | test2                        |
pg_class     |    16418 | test2        | i</span><br style="font-family:courier new,monospace" /><span
style="font-family:couriernew,monospace"> pg_type       | 16419 | test2[]                      | pg_type       |   
16420| test2        | i</span><br style="font-family:courier new,monospace" /><span style="font-family:courier
new,monospace"> pg_constraint| 16421 | test2_b_fkey                 | pg_class      |    16413 | test         |
n</span><brstyle="font-family:courier new,monospace" /><span style="font-family:courier new,monospace"> pg_constraint |
16421| test2_b_fkey                 | pg_class      |    16418 | test2        | a</span><br style="font-family:courier
new,monospace"/><span style="font-family:courier new,monospace"> pg_constraint | 16421 | test2_b_fkey                 |
pg_class     |    16416 | test_pkey    | n</span><br style="font-family:courier new,monospace" /><span
style="font-family:couriernew,monospace"> pg_trigger    | 16422 | RI_ConstraintTrigger_a_16422 | pg_constraint |   
16421| test2_b_fkey | i</span><br style="font-family:courier new,monospace" /><span style="font-family:courier
new,monospace"> pg_trigger   | 16423 | RI_ConstraintTrigger_a_16423 | pg_constraint |    16421 | test2_b_fkey |
i</span><brstyle="font-family:courier new,monospace" /><span style="font-family:courier new,monospace"> pg_trigger    |
16424| RI_ConstraintTrigger_c_16424 | pg_constraint |    16421 | test2_b_fkey | i</span><br style="font-family:courier
new,monospace"/><span style="font-family:courier new,monospace"> pg_trigger    | 16425 | RI_ConstraintTrigger_c_16425 |
pg_constraint|    16421 | test2_b_fkey | i</span><br style="font-family:courier new,monospace" /><span
style="font-family:couriernew,monospace"></span><br />[1] <a
href="http://archives.postgresql.org/pgsql-general/2012-07/msg00104.php">http://archives.postgresql.org/pgsql-general/2012-07/msg00104.php</a><br
/>[2]<a
href="http://archives.postgresql.org/pgsql-general/2012-07/msg00105.php">http://archives.postgresql.org/pgsql-general/2012-07/msg00105.php</a><br
/>[3] <a
href="http://archives.postgresql.org/pgsql-general/2012-07/msg00110.php">http://archives.postgresql.org/pgsql-general/2012-07/msg00110.php</a><br
clear="all"/>[4] select classid::regclass, objid, case classid::regclass::text when 'pg_class' then
objid::regclass::textwhen 'pg_type' then objid::regtype::text when 'pg_constraint' then (select conname from
pg_constraintwhere oid = objid) when 'pg_namespace' then (select nspname from pg_namespace where oid = objid) when
'pg_trigger'then (select tgname from pg_trigger where oid = objid) else objid::text end, refclassid::regclass,
refobjid,case refclassid::regclass::text when 'pg_class' then refobjid::regclass::text when 'pg_type' then
refobjid::regtype::textwhen 'pg_constraint' then (select conname from pg_constraint where oid = refobjid) when
'pg_namespace'then (select nspname from pg_namespace where oid = refobjid) when 'pg_trigger' then (select tgname from
pg_triggerwhere oid = refobjid) else refobjid::text end, deptype from pg_depend;<br /><br />-- <br /><div
dir="ltr">GurjeetSingh<br />EnterpriseDB Corporation<br />The Enterprise PostgreSQL Company<br /></div><br /></div> 
On Sat, Jul 7, 2012 at 4:53 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
> All we need to do is allow swapping of pg_class.relfilenode of two indexes.
> This will let the dependency entries stand as they are and allow us to drop
> the bloated primary key index structure without having to rebuild the
> foreign key constraints.

Fwiw I don't like swapping relfilenodes on indexes the user created.
REINDEX currently does this but it's a bit of a hack and only works
because reindex carefully builds the new index with exactly the same
definition as the old one.

The problem you describe is one of constraints and dependencies and
not one of indexes. It seems what you really want is a way to alter
foreign key dependencies to depend on a new index. Either an explicit
command that lets you set the new dependency or what seems even better
would be to have DROP INDEX check any dependent objects to see if
there's another index that can satisfy them and change their
dependency.

These might suffer from deadlock problems but hopefully they could be
manageable since it's not a frequent operation and there aren't any
other operations that rejigger dependencies.

-- 
greg


Greg Stark <stark@mit.edu> writes:
> On Sat, Jul 7, 2012 at 4:53 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
>> All we need to do is allow swapping of pg_class.relfilenode of two indexes.

> Fwiw I don't like swapping relfilenodes on indexes the user created.
> REINDEX currently does this but it's a bit of a hack and only works
> because reindex carefully builds the new index with exactly the same
> definition as the old one.

Yes.  The swap-relfilenodes operation would have to carefully check that
the index definitions were exactly equivalent, and there would be a
constant risk for bugs of omission if that code weren't taught about
any new index properties we invent.

> The problem you describe is one of constraints and dependencies and
> not one of indexes. It seems what you really want is a way to alter
> foreign key dependencies to depend on a new index. Either an explicit
> command that lets you set the new dependency or what seems even better
> would be to have DROP INDEX check any dependent objects to see if
> there's another index that can satisfy them and change their
> dependency.

Either of these have exactly the same issue, namely their correctness
depends on determining if two indexes have identical properties.

All of these things seem like ugly, hard-to-use kluges anyway (the
make-sure-the-indexes-match business is just as much of a PITA for the
DBA as it is for the system).  What we really want is REINDEX
CONCURRENTLY.
        regards, tom lane


On Tue, Jul 10, 2012 at 3:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The problem you describe is one of constraints and dependencies and
>> not one of indexes. It seems what you really want is a way to alter
>> foreign key dependencies to depend on a new index. Either an explicit
>> command that lets you set the new dependency or what seems even better
>> would be to have DROP INDEX check any dependent objects to see if
>> there's another index that can satisfy them and change their
>> dependency.
>
> Either of these have exactly the same issue, namely their correctness
> depends on determining if two indexes have identical properties.

This doesn't sound right to me. In these cases all it would have to
know about is the same set of properties that CREATE CONSTRAINT looks
for to find a satisfactory index to depend on.

-- 
greg


Excerpts from Tom Lane's message of mar jul 10 10:44:03 -0400 2012:

> All of these things seem like ugly, hard-to-use kluges anyway (the
> make-sure-the-indexes-match business is just as much of a PITA for the
> DBA as it is for the system).  What we really want is REINDEX
> CONCURRENTLY.

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.9961 ?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Tom Lane's message of mar jul 10 10:44:03 -0400 2012:
>> What we really want is REINDEX CONCURRENTLY.

> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.9961 ?

Hm ... that paper looks like something we might want to incorporate into
btree's VACUUM processing, but it's not very on-point if someone really
wants to rebuild the index totally.
        regards, tom lane


On Tue, Jul 10, 2012 at 10:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Greg Stark <stark@mit.edu> writes:
> On Sat, Jul 7, 2012 at 4:53 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
>> All we need to do is allow swapping of pg_class.relfilenode of two indexes.

> Fwiw I don't like swapping relfilenodes on indexes the user created.
> REINDEX currently does this but it's a bit of a hack and only works
> because reindex carefully builds the new index with exactly the same
> definition as the old one.

Yes.  The swap-relfilenodes operation would have to carefully check that
the index definitions were exactly equivalent, and there would be a
constant risk for bugs of omission if that code weren't taught about
any new index properties we invent.

IMHO there must be many other places in this code-base where we run that risk.

The way I am planning to do it was to compare all relevant fields of the FormData_pg_index. And I am assuming anybody changing the struct members will take care of relevant changes needed for this code too.

We can add a runtime/compile-time assert to make sure that Natts_pg_index==17. That way, if a new column gets added, we will get alerted promptly.
 
All of these things seem like ugly, hard-to-use kluges anyway (the
make-sure-the-indexes-match business is just as much of a PITA for the
DBA as it is for the system).  What we really want is REINDEX
CONCURRENTLY.

+1, but I can't take on that task.

--
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

<div dir="ltr"><div class="gmail_quote">On Tue, Jul 10, 2012 at 11:11 AM, Greg Stark <span dir="ltr"><<a
href="mailto:stark@mit.edu"target="_blank">stark@mit.edu</a>></span> wrote:<br /><blockquote class="gmail_quote"
style="margin:00 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On Tue, Jul 10, 2012 at 3:44 PM,
TomLane <<a href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>> wrote:<br /> >> The problem you describe
isone of constraints and dependencies and<br /> >> not one of indexes. It seems what you really want is a way to
alter<br/> >> foreign key dependencies to depend on a new index. Either an explicit<br /> >> command that
letsyou set the new dependency or what seems even better<br /> >> would be to have DROP INDEX check any dependent
objectsto see if<br /> >> there's another index that can satisfy them and change their<br /> >>
dependency.<br/> ><br /> > Either of these have exactly the same issue, namely their correctness<br /> >
dependson determining if two indexes have identical properties.<br /><br /></div>This doesn't sound right to me. In
thesecases all it would have to<br /> know about is the same set of properties that CREATE CONSTRAINT looks<br /> for
tofind a satisfactory index to depend on.<br /><span class="HOEnZb"><font
color="#888888"></font></span></blockquote></div><brclear="all" />I like the DROP index idea, but the silent
side-effectmay not make people happy. Can you give me a pointer to relevant code.<br /><br />-- <br /><div
dir="ltr">GurjeetSingh<br />EnterpriseDB Corporation<br />The Enterprise PostgreSQL Company<br /></div><br /></div>