Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace onthe fly - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace onthe fly
Date
Msg-id 20200406184406.GF2228@telsasoft.com
Whole thread Raw
In response to Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace onthe fly  (Alexey Kondratov <a.kondratov@postgrespro.ru>)
Responses Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace onthe fly  (Alexey Kondratov <a.kondratov@postgrespro.ru>)
List pgsql-hackers
On Mon, Apr 06, 2020 at 08:43:46PM +0300, Alexey Kondratov wrote:
> Thanks for the input, but I am afraid that the patch set became a bit messy
> now. I have eyeballed it and found some inconsistencies.
> 
>      const char *name;            /* name of database to reindex */
> -    int            options;        /* Reindex options flags */
> +    List        *rawoptions;        /* Raw options */
> +    int        options;            /* Parsed options */
>      bool        concurrent;        /* reindex concurrently? */
> 
> You introduced rawoptions in the 0002, but then removed it in 0003. So is it
> required or not? Probably this is a rebase artefact.

You're right; I first implemented REINDEX() and when I later did CLUSTER(), I
did it better, so I went back and did REINDEX() that way, but it looks like I
maybe fixup!ed the wrong commit.  Fixed now.

> +/* XXX: reusing reindex_option_list */
> +            | CLUSTER opt_verbose '(' reindex_option_list ')' qualified_name
> cluster_index_specification
> 
> Could we actually simply reuse vac_analyze_option_list? From the first sight
> it does just the right thing, excepting the special handling of spelling
> ANALYZE/ANALYSE, but it does not seem to be a problem.

Hm, do you mean to let cluster.c reject the other options like "analyze" ?
I'm not sure why that would be better than reusing reindex?
I think the suggestion will probably be to just copy+paste the reindex option
list and rename it to cluster (possibly with the explanation that they're
separate and independant and so their behavior shouldn't be tied together).

> > 0004 reduces duplicative error handling, as a separate commit so
> > Alexey can review it and/or integrate it.
> 
> ReindexRelationConcurrently is used for all cases, but it hits different
> code paths in the case of database, table and index. I have not checked yet,
> but are you sure it is safe removing these validations in the case of
> REINDEX CONCURRENTLY?

You're right about the pg_global case, fixed.  System catalogs can't be
reindexed CONCURRENTLY, so they're already caught by that check.

> > XXX: for cluster/vacuum, it might be more friendly to check before
> > clustering
> > the table, rather than after clustering and re-indexing.
> 
> Yes, I think it would be much more user-friendly.

I realized it's not needed or useful to check indexes in advance of clustering,
since 1) a mapped index will be on a mapped relation, which is already checked;
2) a system index will be on a system relation.  Right ?

-- we already knew that
ts=# SELECT COUNT(1) FROM pg_index i JOIN pg_class a ON i.indrelid=a.oid JOIN pg_class b ON i.indexrelid=b.oid WHERE
a.relnamespace!=b.relnamespace;
count | 0

-- not true in general, but true here and true for system relations
ts=# SELECT COUNT(1) FROM pg_index i JOIN pg_class a ON i.indrelid=a.oid JOIN pg_class b ON i.indexrelid=b.oid WHERE
a.reltablespace!= b.reltablespace;
 
count | 0

-- 
Justin

Attachment

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: where should I stick that backup?
Next
From: Tom Lane
Date:
Subject: Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding