Thread: Re: Proposal to Enable/Disable Index using ALTER INDEX (with patch)
Hello! One more thing (maybe I missed it in the patch, but anyway) - should we add some migration to ensure what old databases will get enabled=true by default after upgrade? Best regards, Mikhail.
> Should this not behave like if you drop (or create) an index > during a prepared statement? I have not yet looked closely at > this code to see what could be done. > > Regards, I looked at this a bit more and ATExecEnableDisableIndex needs some tweaks. What should be getting invalidated in the heap relation that the index is on and not the index relation as it is in the current patch. You can retrieve the heap relation oid IndexGetRelation(indexOid, false) and the CacheInvalidateRelcache should be on the heap relation. The planner needs to only care about the heap relation invalidation to re-plan across multiple executions of a prepared statement. There should be a test for this scenario as well. Regards, Sami
On Mon, Dec 30, 2024 at 3:48 PM Michail Nikolaev <michail.nikolaev@gmail.com> wrote:
Hello!
One more thing (maybe I missed it in the patch, but anyway) - should we
add some migration to ensure what old databases will get enabled=true by
default after upgrade?
Hi!
Thank you! I tested this by manually upgrading (using pg_upgrade) from master to the build from the branch, which ensures that post-upgrade the column for indisenabled is true by default. I also backed it up with bool indisenabled BKI_DEFAULT(t); in pg_index.h. Additionally, I tested upgrading from an old data directory to the new one (both on this patch) to ensure indexes with DISABLE properties are carried over as well on the new data directory/upgrade. For reference the latest patch now is in [1].
Given this is working as expected, would we still need a migration step? (Let me know if I missed something ofc).
For reference here is the setup from my local testing (for reference)
rm -Rf /tmp/pg_data && rm -Rf /tmp/pg_data_new
./configure --prefix=/tmp/pg_install_old && make clean && make -j8 && make install
# Create and init old cluster
/tmp/pg_install_old/bin/initdb -D /tmp/pg_data
/tmp/pg_install_old/bin/pg_ctl -D /tmp/pg_data start
# Create test data
/tmp/pg_install_old/bin/createdb test
/tmp/pg_install_old/bin/psql test -c "CREATE TABLE foo (id int); CREATE INDEX idx_foo ON foo(id) DISABLE;"
# Stop old cluster
/tmp/pg_install_old/bin/pg_ctl -D /tmp/pg_data stop
# Switch branch and build new version
git checkout s/enable-disable-index
./configure --prefix=/tmp/pg_install_new && make clean && make -j8 && make install
# Create new cluster directory
/tmp/pg_install_new/bin/initdb -D /tmp/pg_data_new
# Now run upgrade with different binary locations
/tmp/pg_install_new/bin/pg_upgrade \
-b /tmp/pg_install_old/bin \
-B /tmp/pg_install_new/bin \
-d /tmp/pg_data \
-D /tmp/pg_data_new \
-p 5432 \
-P 5433
rm -Rf /tmp/pg_data && rm -Rf /tmp/pg_data_new
./configure --prefix=/tmp/pg_install_old && make clean && make -j8 && make install
# Create and init old cluster
/tmp/pg_install_old/bin/initdb -D /tmp/pg_data
/tmp/pg_install_old/bin/pg_ctl -D /tmp/pg_data start
# Create test data
/tmp/pg_install_old/bin/createdb test
/tmp/pg_install_old/bin/psql test -c "CREATE TABLE foo (id int); CREATE INDEX idx_foo ON foo(id) DISABLE;"
# Stop old cluster
/tmp/pg_install_old/bin/pg_ctl -D /tmp/pg_data stop
# Switch branch and build new version
git checkout s/enable-disable-index
./configure --prefix=/tmp/pg_install_new && make clean && make -j8 && make install
# Create new cluster directory
/tmp/pg_install_new/bin/initdb -D /tmp/pg_data_new
# Now run upgrade with different binary locations
/tmp/pg_install_new/bin/pg_upgrade \
-b /tmp/pg_install_old/bin \
-B /tmp/pg_install_new/bin \
-d /tmp/pg_data \
-D /tmp/pg_data_new \
-p 5432 \
-P 5433
/tmp/pg_install_new/bin/pg_ctl -D /tmp/pg_data_new start
$ SELECT * FROM pg_index WHERE indexrelid = 'idx_foo'::regclass;
$ SELECT * FROM pg_index WHERE indexrelid = 'idx_foo'::regclass;
Thank you
Shayon
Hello!
> Given this is working as expected, would we still need a migration step?
No, it is clear now. Thanks for explaining.
Best regards,
Mikhail.
+ This is the + default state for newly created indexes. This is not needed in the ALTER INDEX docs, IMO.ss + <para> + Disable the specified index. A disabled index is not used for queries, but it + is still updated when the underlying table data changes and will still be + used to enforce constraints (such as UNIQUE, or PRIMARY KEY constraints). + This can be useful for testing query performance with and without specific + indexes. If performance degrades after disabling an index, it can be easily + re-enabled using <literal>ENABLE</literal>. Before disabling, it's recommended + to check <structname>pg_stat_user_indexes</structname>.<structfield>idx_scan</structfield> + to identify potentially unused indexes. + </para> This got me thinking if dropping the index is the only use case we really care about. For example, you may want to prevent an index that is enforcing a constraint from being used by the planner, but you probably don't want to drop it. In fact, I also think that you may want the index from being used in one part of your application but could potentially benefit other parts of your application. In that case, I can see a GUC that allows you to force the use of a an index that has been CREATED or ALTERED as DISABLED. UNlike the GUC suggested earlier in the thread, this GUC can simply be a boolean to allow the force usage of a DISABLED index. FWIW, Oracle has a similar parameter called OPTIMIZER_USE_INVISIBLE_INDEXES. + underlying table data changes. This can be useful when you want to create + an index without immediately impacting query performance, allowing you to c/performance/planning ?? I have also been thinking about DISABLE as the keyword, and I really don't like it. DISABLE indicates, at least ot me, that the index is not available for either reads or writes. Looking at other engines, Sqlserver uses DISABLE to drop the index data, but keeps the index metadata around. Oracle uses INVISIBLE and MariabDB uses IGNORABLE to provide similar functionality to that being discussed here. I find those keywords to be more appropriate for this purpose. What about if we use HIDDEN instead of DISABLE as the keyword? Regards, Sami
I did some additional testing with this command within transactions. I had the "BEGIN; ALTER INDEX; EXPLAIN; ROLLBACK;" scenario in mind, but didn't realise we acquire an AccessExclusiveLock on the index. Therefore, it's not possible to change the visibility within a single transaction.... unless you don’t mind blocking all access to the relation. I read the comments at the top of "AlterTableGetLockLevel" and in the documentation and I understand that this behavior seems unavoidable. I suppose this is what was meant by the "globally visible effects" of an ALTER INDEX in the old discussion ? [1] Being able to rollback the changes is nice, but in this case there is not much to alter back anyway. This is probably not the intended use case (hence the discussion about GUCs and hints). [1] https://www.postgresql.org/message-id/30558.1529359929%40sss.pgh.pa.us
On Sat, Feb 8, 2025 at 12:41 AM jian he <jian.universality@gmail.com> wrote:
hi.
```
drop table if exists idxpart;
create table idxpart (a int, b int, c text) partition by range (a);
create table idxpart1 (like idxpart);
alter table idxpart attach partition idxpart1 for values from (0) to (10);
create index idxpart_c on only idxpart (c) invisible;
create index idxpart1_c on idxpart1 (c);
alter index idxpart_c attach partition idxpart1_c;
```
In this case, should ALTER INDEX ATTACH PARTITION change the attached
partition(idxpart1_c)'s "visible" status?
Hi,
That is a great question and I have really gone back and forth on this one and here's my reasoning so far
1. When you don't use ONLY:
- The index of child table inherits the visibility of the parent table's index
- This applies whether the parent index is set as INVISIBLE or VISIBLE
- This automatic inheritance is expected behavior and feels natural
2. When you use ONLY:
- You as a user/developer are explicitly taking control of index management
- Creating an index for parent as INVISIBLE and another for child as VISIBLE represents conscious, deliberate choices
- When attaching these indexes, it makes sense to respect these explicit visibility settings
- Silently overriding the child index's visibility could violate the Principle of Least Surprise
- Lastly, this model also allows more granular control over index visibility for each partition
I am not strongly tied to either of these options and very much open to changing my mind. Also happy to try and document this for more clarity.
I have rebased the patch on top of master (resolving some merge conflicts), along with the meson changes (thank you for that).
Thanks,
Shayon
That is a great question and I have really gone back and forth on this one and here's my reasoning so far
1. When you don't use ONLY:
- The index of child table inherits the visibility of the parent table's index
- This applies whether the parent index is set as INVISIBLE or VISIBLE
- This automatic inheritance is expected behavior and feels natural
2. When you use ONLY:
- You as a user/developer are explicitly taking control of index management
- Creating an index for parent as INVISIBLE and another for child as VISIBLE represents conscious, deliberate choices
- When attaching these indexes, it makes sense to respect these explicit visibility settings
- Silently overriding the child index's visibility could violate the Principle of Least Surprise
- Lastly, this model also allows more granular control over index visibility for each partition
I am not strongly tied to either of these options and very much open to changing my mind. Also happy to try and document this for more clarity.
I have rebased the patch on top of master (resolving some merge conflicts), along with the meson changes (thank you for that).
Thanks,
Shayon