Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace onthe fly - Mailing list pgsql-hackers

From Jose Luis Tallon
Subject Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace onthe fly
Date
Msg-id f3031524-9bcd-4078-b6b4-caffd076de97@adv-solutions.net
Whole thread Raw
In response to Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace onthe fly  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On 20/9/19 4:06, Michael Paquier wrote:
> On Thu, Sep 19, 2019 at 05:40:41PM +0300, Alexey Kondratov wrote:
>> On 19.09.2019 16:21, Robert Haas wrote:
>>> So, earlier in this thread, I suggested making this part of ALTER
>>> TABLE, and several people seemed to like that idea. Did we have a
>>> reason for dropping that approach?
> Personally, I don't find this idea very attractive as ALTER TABLE is
> already complicated enough with all the subqueries we already support
> in the command, all the logic we need to maintain to make combinations
> of those subqueries in a minimum number of steps, and also the number
> of bugs we have seen because of the amount of complication present.

Yes, but please keep the other options: At it is, cluster, vacuum full 
and reindex already rewrite the table in full; Being able to write the 
result to a different tablespace than the original object was stored in 
enables a whole world of very interesting possibilities.... including a 
quick way out of a "so little disk space available that vacuum won't 
work properly" situation --- which I'm sure MANY users will appreciate, 
including me

> If we add this option to REINDEX, then for 'ALTER TABLE tb_name action1,
> REINDEX SET TABLESPACE tbsp_name, action3' action2 will be just a direct
> alias to 'REINDEX TABLE tb_name SET TABLESPACE tbsp_name'. So it seems
> practical to do this for REINDEX first.
>
> The only one concern I have against adding REINDEX to ALTER TABLE in this
> context is that it will allow user to write such a chimera:
>
> ALTER TABLE tb_name REINDEX SET TABLESPACE tbsp_name, SET TABLESPACE
> tbsp_name;
>
> when they want to move both table and all the indexes. Because simple
> ALTER TABLE tb_name REINDEX, SET TABLESPACE tbsp_name;
> looks ambiguous. Should it change tablespace of table, indexes or both?

Indeed.

IMHO, that form of the command should not allow that much flexibility... 
even on the "principle of least surprise" grounds :S

That is, I'd restrict the ability to change (output) tablespace to the 
"direct" form --- REINDEX name, VACUUM (FULL) name, CLUSTER name --- 
whereas the ALTER table|index SET TABLESPACE would continue to work.

Now that I come to think of it, maybe saying "output" or "move to" 
rather than "set tablespace" would make more sense for this variation of 
the commands? (clearer, less prone to confusion)?

> Tricky question, but we don't change the tablespace of indexes when
> using an ALTER TABLE, so I would say no on compatibility grounds.
> ALTER TABLE has never touched the tablespace of indexes, and I don't
> think that we should begin to do so.

Indeed.


I might be missing something, but is there any reason to not *require* a 
explicit transaction for the above multi-action commands? I mean, have 
it be:

BEGIN;

ALTER TABLE tb_name SET TABLESPACE tbsp_name;    -- moves the table .... 
but possibly NOT the indexes?

ALTER TABLE tb_name REINDEX [OUTPUT TABLESPACE tbsp_name];    -- 
REINDEX, placing the resulting index on tbsp_name instead of the 
original one

COMMIT;

... and have the parser/planner combine the steps if it'd make sense (it 
probably wouldn't in this example)?


Just my .02€


Thanks,

     / J.L.





pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: psql - add SHOW_ALL_RESULTS option
Next
From: Richard Guo
Date:
Subject: Re: A problem about partitionwise join