Thread: Add support for ALTER INDEX .. ALTER [COLUMN] col_num {SET,RESET}
Hi all, Since 911e702 (13~), it is possible to define opclass parameters for index attributes as of CREATE INDEX, but we lack an equivalent grammar for ALTER INDEX. I was looking at that, and it seems natural to me to do the same thing as what we do for SET STATISTICS, where we use a column number to decide which column should be used to set or reset a parameter. It happens that most of the infrastructure is already in place to allow that to work, I just had to push a bit the parser and tablecmds.c to handle that, thanks to the fact that opclass parameters are stored in pg_attribute in the same fashion as table parameters, where we use a simple text array for each param/value pair. The only tweak is to go through the correct validation option, aka index_opclass_options() (this was discussed on the thread that led to fdd8857). So this adds much more flexibility to the opclass handling for indexes. The attached does the work, with tests and documentation added to all the places I could think about while reviewing the existing opclass code for indexes. There is no need to worry about pg_dump, as opclass parameters are loaded with CREATE INDEX. I am adding that to the upcoming CF. Thoughts? Thanks, -- Michael
Attachment
On Fri, Oct 29, 2021 at 02:09:34PM +0900, Michael Paquier wrote: > The attached does the work, with tests and documentation added to all > the places I could think about while reviewing the existing opclass > code for indexes. There is no need to worry about pg_dump, as opclass > parameters are loaded with CREATE INDEX. By the way, while looking at this area of the code (particularly tsvector and gist), I was under the impression that it is safe to assume that we don't particularly need to rebuild the index when changing its attoptions this way, but I have also the feeling to be rather naive here. Alexander, you have much more experience in this area than I do, what are your thoughts on the matter? -- Michael
Attachment
On 10/29/21, 2:14 AM, "Michael Paquier" <michael@paquier.xyz> wrote: > By the way, while looking at this area of the code (particularly > tsvector and gist), I was under the impression that it is safe to > assume that we don't particularly need to rebuild the index when > changing its attoptions this way, but I have also the feeling to be > rather naive here. This is my concern as well. I remember seeing some weird errors in the thread that led to fdd8857, and I was able to reproduce it with this new patch. If you apply the following patch for the hstore test, it will begin failing. -- diff --git a/contrib/hstore/expected/hstore.out b/contrib/hstore/expected/hstore.out index 64a3272b9c..a4c78a40c8 100644 --- a/contrib/hstore/expected/hstore.out +++ b/contrib/hstore/expected/hstore.out @@ -1352,6 +1352,7 @@ create index hidx on testhstore using gist(h gist_hstore_ops(siglen=2025)); ERROR: value 2025 out of bounds for option "siglen" DETAIL: Valid values are between "1" and "2024". create index hidx on testhstore using gist(h gist_hstore_ops(siglen=2024)); +alter index hidx alter column 1 set (siglen=100); set enable_seqscan=off; select count(*) from testhstore where h @> 'wait=>NULL'; count diff --git a/contrib/hstore/sql/hstore.sql b/contrib/hstore/sql/hstore.sql index a59db66b0a..5e2a1ec99f 100644 --- a/contrib/hstore/sql/hstore.sql +++ b/contrib/hstore/sql/hstore.sql @@ -308,6 +308,7 @@ drop index hidx; create index hidx on testhstore using gist(h gist_hstore_ops(siglen=0)); create index hidx on testhstore using gist(h gist_hstore_ops(siglen=2025)); create index hidx on testhstore using gist(h gist_hstore_ops(siglen=2024)); +alter index hidx alter column 1 set (siglen=100); set enable_seqscan=off; select count(*) from testhstore where h @> 'wait=>NULL'; -- However, if you add a REINDEX right after the ALTER INDEX command, it will succeed. Nathan
Hi, Michael! On Fri, Oct 29, 2021 at 12:10 PM Michael Paquier <michael@paquier.xyz> wrote: > On Fri, Oct 29, 2021 at 02:09:34PM +0900, Michael Paquier wrote: > > The attached does the work, with tests and documentation added to all > > the places I could think about while reviewing the existing opclass > > code for indexes. There is no need to worry about pg_dump, as opclass > > parameters are loaded with CREATE INDEX. > > By the way, while looking at this area of the code (particularly > tsvector and gist), I was under the impression that it is safe to > assume that we don't particularly need to rebuild the index when > changing its attoptions this way, but I have also the feeling to be > rather naive here. > > Alexander, you have much more experience in this area than I do, what > are your thoughts on the matter? Good question, thanks. I keep the following use cases in my mind for this feature: 1) One could set up paths to index within the json document (using jsonpath or another language); 2) The way to index particular paths in json document: for instance, we can index string "as is" or just hash value; 3) For signatures, we may specify the number of bits to set in addition to the signature length. All these cases require index rebuild to change the opclass option. I can imagine there could be options, which don't require index rebuild to change. For instance, a split algorithm for R-tree-like GiST index could be changed on the fly. But not much point to change it on the existing index, though. We may introduce the additional property of opclass option "mutability", which specifies whether it's possible to change for the existing index. So, I don't think this is viable for now. I suggest postponing it. Leave the options immutable, until we get some solid use-cases of mutable options. ------ Regards, Alexander Korotkov
On Fri, Oct 29, 2021 at 11:15:08PM +0300, Alexander Korotkov wrote: > All these cases require index rebuild to change the opclass option. I > can imagine there could be options, which don't require index rebuild > to change. For instance, a split algorithm for R-tree-like GiST index > could be changed on the fly. But not much point to change it on the > existing index, though. This point alone is enough to say that what the patch does is not enough, then, thanks. An index rebuild would require an exclusive lock, but AT_SetOptions uses a share update exclusive lock. We could keep the same lock level by trigerring a concurrent reindex while forbidding running this command flavor on indexes within a transaction block, though. I guess that this still makes the life or users a bit easier to avoid mistakes with complicated index expressions or predicates, but that also feels like adding code for little gain. Another thing that could be done is to mark the index as invalid once its set of opclass parameters is updated. That would be simpler, while allowing users to fire a concurrent or non-concurrent rebuild at will after an ALTER INDEX. -- Michael
Attachment
On Fri, Oct 29, 2021 at 06:23:34PM +0000, Bossart, Nathan wrote: > This is my concern as well. I remember seeing some weird errors in > the thread that led to fdd8857, and I was able to reproduce it with > this new patch. If you apply the following patch for the hstore test, > it will begin failing. Indeed, thanks. That's the kind of things I was trying with other opclasses, actually. Missed this one. -- Michael
Attachment
On Sat, Oct 30, 2021 at 09:45:50AM +0900, Michael Paquier wrote: > Another thing that could be done is to mark the index as invalid once > its set of opclass parameters is updated. That would be simpler, > while allowing users to fire a concurrent or non-concurrent rebuild at > will after an ALTER INDEX. For the note here, I have looked at this possibility. And things become very tricky when it comes to indexes marked as either indisclustered or indisreplident. REINDEX CONCURRENTLY has the particularity to switch both parameters to false when swapping to the new index because we don't need the new index anymore. Invalid indexes can create with CREATE INDEX CONCURRENTLY (constraint failure at creation for example), but they won't have any of those flags set. So we assume now that indisinvalid is linked to both of them. This makes the problem mentioned upthread trickier than it looks, as we'd need to decide what to do after an ALTER INDEX that just switches an index to be invalid if any of these are set for the existing index. -- Michael