Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace onthe fly - Mailing list pgsql-hackers

From Alexey Kondratov
Subject Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace onthe fly
Date
Msg-id f3255cfc-a395-133a-17d1-5fca873e7b05@postgrespro.ru
Whole thread Raw
In response to Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace onthe fly  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace onthe fly  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
Hi,

Thank you all for replies.

>> ALTER TABLE already has a lot of logic that is oriented towards being
>> able to do multiple things at the same time.  If we added CLUSTER,
>> VACUUM FULL, and REINDEX to that set, then you could, say, change a
>> data type, cluster, and change tablespaces all in a single SQL
>> command.
> That's a great observation.

Indeed, I thought that ALTER TABLE executes all actions sequentially one 
by one, e.g. in the case of

ALTER TABLE test_int CLUSTER ON test_int_idx, SET TABLESPACE test_tblspc;

it executes CLUSTER and THEN executes SET TABLESPACE. However, if I get 
it right, ALTER TABLE is rather smart, so in such a case it follows the 
steps:

1) Only saves new tablespace Oid during prepare phase 1 without actual work;

2) Only executes mark_index_clustered during phase 2, again without 
actual work done;

3) And finally rewrites relation during phase 3, where CLUSTER and SET 
TABLESPACE are effectively performed.

>> That would be cool, but probably a lot of work.  :-(
> But is it?  ALTER TABLE is already doing one kind of table rewrite
> during phase 3, and CLUSTER is just a different kind of table rewrite
> (which happens to REINDEX), and VACUUM FULL is just a special case of
> CLUSTER.  Maybe what we need is an ALTER TABLE variant that executes
> CLUSTER's table rewrite during phase 3 instead of its ad-hoc table
> rewrite.

According to the ALTER TABLE example above, it is already exist for CLUSTER.

> As for REINDEX, I think it's valuable to move tablespace together with
> the reindexing.  You can already do it with the CREATE INDEX
> CONCURRENTLY recipe we recommend, of course; but REINDEX CONCURRENTLY is
> not going to provide that, and it seems worth doing.

Maybe I am missing something, but according to the docs REINDEX 
CONCURRENTLY does not exist yet, DROP then CREATE CONCURRENTLY is 
suggested instead. Thus, we have to add REINDEX CONCURRENTLY first, but 
it is a matter of different patch, I guess.

>> Even for plain REINDEX that seems useful.
>> --
>> Michael

To summarize:

1) Alvaro and Michael agreed, that REINDEX with tablespace move may be 
useful. This is done in the patch attached to my initial email. Adding 
REINDEX to ALTER TABLE as new action seems quite questionable for me and 
not completely semantically correct. ALTER already looks bulky.

2) If I am correct, 'ALTER TABLE ... CLUSTER ON ..., SET TABLESPACE ...' 
does exactly what I wanted to add to CLUSTER in my patch. So probably no 
work is necessary here.

3) VACUUM FULL. It seems, that we can add special case 'ALTER TABLE ... 
VACUUM FULL, SET TABLESPACE ...', which will follow relatively the same 
path as with CLUSTER ON, but without any specific index. Relation should 
be rewritten in the new tablespace during phase 3.

What do you think?


Regards

-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company



pgsql-hackers by date:

Previous
From: Marcus Mao
Date:
Subject: Re: PostgreSQL partition tables use more private memory
Next
From: Magnus Hagander
Date:
Subject: Re: global / super barriers (for checksums)