Thread: Proposal to Enable/Disable Index using ALTER INDEX
Hello hackers,
This is my first time posting here, and I’d like to propose a new feature related to PostgreSQL indexes. If this idea resonates, I’d be happy to follow up with a patch as well.
Problem:
Adding and removing indexes is a common operation in PostgreSQL. On larger databases, however, these operations can be resource-intensive. When evaluating the performance impact of one or more indexes, dropping them might not be ideal since as a user you may want a quicker way to test their effects without fully committing to removing & adding them back again. Which can be a time taking operation on larger tables.
Proposal:
I propose adding an ALTER INDEX command that allows for enabling or disabling an index globally. This could look something like:
ALTER INDEX index_name ENABLE;
ALTER INDEX index_name DISABLE;
A disabled index would still receive updates and enforce constraints as usual but would not be used for queries. This allows users to assess whether an index impacts query performance before deciding to drop it entirely.
Implementation:
To keep this simple, I suggest toggling the indisvalid flag in pg_index during the enable/disable operation.
Additional Considerations:
- Keeping the index up-to-date while it’s disabled seems preferable, as it avoids the need to rebuild the index if it’s re-enabled later. The alternative would be dropping and rebuilding the index upon re-enabling, which I believe would introduce additional overhead in terms of application logic & complexity.
- I am also proposing to reuse the existing indisvalid flag to avoid adding new state and the maintenance that comes with it, but I’m open to feedback if this approach has potential downsides.
- To keep the scope minimal for now, I propose that we only allow enabling and disabling indexes globally, and not locally, by supporting it exclusively in ALTER INDEX. I would love to know if this would break any SQL grammar promises that I might be unaware of.
I would love to learn if this sounds like a good idea and how it can be improved further. Accordingly, as a next step I would be very happy to propose a patch as well.
Best regards,
Shayon Mukherjee
This is my first time posting here, and I’d like to propose a new feature related to PostgreSQL indexes. If this idea resonates, I’d be happy to follow up with a patch as well.
Problem:
Adding and removing indexes is a common operation in PostgreSQL. On larger databases, however, these operations can be resource-intensive. When evaluating the performance impact of one or more indexes, dropping them might not be ideal since as a user you may want a quicker way to test their effects without fully committing to removing & adding them back again. Which can be a time taking operation on larger tables.
Proposal:
I propose adding an ALTER INDEX command that allows for enabling or disabling an index globally. This could look something like:
ALTER INDEX index_name ENABLE;
ALTER INDEX index_name DISABLE;
A disabled index would still receive updates and enforce constraints as usual but would not be used for queries. This allows users to assess whether an index impacts query performance before deciding to drop it entirely.
Implementation:
To keep this simple, I suggest toggling the indisvalid flag in pg_index during the enable/disable operation.
Additional Considerations:
- Keeping the index up-to-date while it’s disabled seems preferable, as it avoids the need to rebuild the index if it’s re-enabled later. The alternative would be dropping and rebuilding the index upon re-enabling, which I believe would introduce additional overhead in terms of application logic & complexity.
- I am also proposing to reuse the existing indisvalid flag to avoid adding new state and the maintenance that comes with it, but I’m open to feedback if this approach has potential downsides.
- To keep the scope minimal for now, I propose that we only allow enabling and disabling indexes globally, and not locally, by supporting it exclusively in ALTER INDEX. I would love to know if this would break any SQL grammar promises that I might be unaware of.
I would love to learn if this sounds like a good idea and how it can be improved further. Accordingly, as a next step I would be very happy to propose a patch as well.
Best regards,
Shayon Mukherjee
On Tue, 10 Sept 2024 at 09:39, Shayon Mukherjee <shayonj@gmail.com> wrote: > Adding and removing indexes is a common operation in PostgreSQL. On larger databases, however, these operations can beresource-intensive. When evaluating the performance impact of one or more indexes, dropping them might not be ideal sinceas a user you may want a quicker way to test their effects without fully committing to removing & adding them back again.Which can be a time taking operation on larger tables. > > Proposal: > I propose adding an ALTER INDEX command that allows for enabling or disabling an index globally. This could look somethinglike: > > ALTER INDEX index_name ENABLE; > ALTER INDEX index_name DISABLE; > > A disabled index would still receive updates and enforce constraints as usual but would not be used for queries. This allowsusers to assess whether an index impacts query performance before deciding to drop it entirely. I personally think having some way to alter an index to stop it from being used in query plans would be very useful for the reasons you mentioned. I don't have any arguments against the syntax you've proposed. We'd certainly have to clearly document that constraints are still enforced. Perhaps there is some other syntax which would self-document slightly better. I just can't think of it right now. > Implementation: > To keep this simple, I suggest toggling the indisvalid flag in pg_index during the enable/disable operation. That's not a good idea as it would allow ALTER INDEX ... ENABLE; to be used to make valid a failed concurrently created index. I think this would need a new flag and everywhere in the planner would need to be adjusted to ignore indexes when that flag is false. > Additional Considerations: > - Keeping the index up-to-date while it’s disabled seems preferable, as it avoids the need to rebuild the index if it’sre-enabled later. The alternative would be dropping and rebuilding the index upon re-enabling, which I believe wouldintroduce additional overhead in terms of application logic & complexity. I think the primary use case here is to assist in dropping useless indexes in a way that can very quickly be undone if the index is more useful than thought. If you didn't keep the index up-to-date then that would make the feature useless for that purpose. If we get the skip scan feature for PG18, then there's likely going to be lots of people with indexes that they might want to consider removing after upgrading. Maybe this is a good time to consider this feature as it possibly won't ever be more useful than it will be after we get skip scans. David
Hi Shayon
Thank you for your work on this , I think it's great to have this feature implemented ,I checked the doucment on other databases,It seems both MySQL 8.0 and oracle supports it, sql server need to rebuild indexes after disabled,It seems disable the index, it's equivalent to deleting the index, except that the index's metadata is still retained:
->A disabled index would still receive updates and enforce constraints as usual but would not be used for queries. This allows users to assess -> ->whether an index impacts query performance before deciding to drop it entirely.
MySQL 8.0 and oracle settings are not visible, index information is always updated, I would then suggest that the statement be changed to set the index invisible and visible.
Thanks
David Rowley <dgrowleyml@gmail.com> 于2024年9月10日周二 06:17写道:
On Tue, 10 Sept 2024 at 09:39, Shayon Mukherjee <shayonj@gmail.com> wrote:
> Adding and removing indexes is a common operation in PostgreSQL. On larger databases, however, these operations can be resource-intensive. When evaluating the performance impact of one or more indexes, dropping them might not be ideal since as a user you may want a quicker way to test their effects without fully committing to removing & adding them back again. Which can be a time taking operation on larger tables.
>
> Proposal:
> I propose adding an ALTER INDEX command that allows for enabling or disabling an index globally. This could look something like:
>
> ALTER INDEX index_name ENABLE;
> ALTER INDEX index_name DISABLE;
>
> A disabled index would still receive updates and enforce constraints as usual but would not be used for queries. This allows users to assess whether an index impacts query performance before deciding to drop it entirely.
I personally think having some way to alter an index to stop it from
being used in query plans would be very useful for the reasons you
mentioned. I don't have any arguments against the syntax you've
proposed. We'd certainly have to clearly document that constraints
are still enforced. Perhaps there is some other syntax which would
self-document slightly better. I just can't think of it right now.
> Implementation:
> To keep this simple, I suggest toggling the indisvalid flag in pg_index during the enable/disable operation.
That's not a good idea as it would allow ALTER INDEX ... ENABLE; to be
used to make valid a failed concurrently created index. I think this
would need a new flag and everywhere in the planner would need to be
adjusted to ignore indexes when that flag is false.
> Additional Considerations:
> - Keeping the index up-to-date while it’s disabled seems preferable, as it avoids the need to rebuild the index if it’s re-enabled later. The alternative would be dropping and rebuilding the index upon re-enabling, which I believe would introduce additional overhead in terms of application logic & complexity.
I think the primary use case here is to assist in dropping useless
indexes in a way that can very quickly be undone if the index is more
useful than thought. If you didn't keep the index up-to-date then that
would make the feature useless for that purpose.
If we get the skip scan feature for PG18, then there's likely going to
be lots of people with indexes that they might want to consider
removing after upgrading. Maybe this is a good time to consider this
feature as it possibly won't ever be more useful than it will be after
we get skip scans.
David
Hi, On Tue, Sep 10, 2024 at 10:16:34AM +1200, David Rowley wrote: > On Tue, 10 Sept 2024 at 09:39, Shayon Mukherjee <shayonj@gmail.com> wrote: > > Adding and removing indexes is a common operation in PostgreSQL. On > > larger databases, however, these operations can be > > resource-intensive. When evaluating the performance impact of one or > > more indexes, dropping them might not be ideal since as a user you > > may want a quicker way to test their effects without fully > > committing to removing & adding them back again. Which can be a time > > taking operation on larger tables. > > > > Proposal: > > I propose adding an ALTER INDEX command that allows for enabling or disabling an index globally. This could look somethinglike: > > > > ALTER INDEX index_name ENABLE; > > ALTER INDEX index_name DISABLE; > > > > A disabled index would still receive updates and enforce constraints > > as usual but would not be used for queries. This allows users to > > assess whether an index impacts query performance before deciding to > > drop it entirely. > > I personally think having some way to alter an index to stop it from > being used in query plans would be very useful for the reasons you > mentioned. I don't have any arguments against the syntax you've > proposed. We'd certainly have to clearly document that constraints > are still enforced. Perhaps there is some other syntax which would > self-document slightly better. I just can't think of it right now. > > > Implementation: > > To keep this simple, I suggest toggling the indisvalid flag in > > pg_index during the enable/disable operation. > > That's not a good idea as it would allow ALTER INDEX ... ENABLE; to be > used to make valid a failed concurrently created index. I think this > would need a new flag and everywhere in the planner would need to be > adjusted to ignore indexes when that flag is false. How about the indislive flag instead? I haven't looked at the code, but from the documentation ("If false, the index is in process of being dropped, and should be ignored for all purposes") it sounds like we made be able to piggy-back on that instead? Michael
On Tue, 10 Sept 2024 at 22:46, Michael Banck <mbanck@gmx.net> wrote: > How about the indislive flag instead? I haven't looked at the code, but > from the documentation ("If false, the index is in process of being > dropped, and > should be ignored for all purposes") it sounds like we made be able to > piggy-back on that instead? Doing that could cause an UPDATE which would ordinarily not be eligible for a HOT-update to become a HOT-update. That would cause issues if the index is enabled again as the index wouldn't have been updated during the UPDATE. I don't see the big deal with adding a new flag. There's even a free padding byte to put this flag in after indisreplident, so we don't have to worry about using more memory. David
+1 for the new flag as well, since it'd be nice to be able to enable/disable indexes without having to worry about the missed updates / having to rebuild it.
Shayon
On Tue, Sep 10, 2024 at 8:02 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 10 Sept 2024 at 22:46, Michael Banck <mbanck@gmx.net> wrote:
> How about the indislive flag instead? I haven't looked at the code, but
> from the documentation ("If false, the index is in process of being
> dropped, and
> should be ignored for all purposes") it sounds like we made be able to
> piggy-back on that instead?
Doing that could cause an UPDATE which would ordinarily not be
eligible for a HOT-update to become a HOT-update. That would cause
issues if the index is enabled again as the index wouldn't have been
updated during the UPDATE.
I don't see the big deal with adding a new flag. There's even a free
padding byte to put this flag in after indisreplident, so we don't
have to worry about using more memory.
David
Kind Regards,
Shayon Mukherjee
Hello,
Thank you for the detailed information and feedback David. Comments inline.
P.S Re-sending it to the mailing list, because I accidentally didn't select reply-all on the last reply.
On Mon, Sep 9, 2024 at 6:16 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 10 Sept 2024 at 09:39, Shayon Mukherjee <shayonj@gmail.com> wrote:
> Adding and removing indexes is a common operation in PostgreSQL. On larger databases, however, these operations can be resource-intensive. When evaluating the performance impact of one or more indexes, dropping them might not be ideal since as a user you may want a quicker way to test their effects without fully committing to removing & adding them back again. Which can be a time taking operation on larger tables.
>
> Proposal:
> I propose adding an ALTER INDEX command that allows for enabling or disabling an index globally. This could look something like:
>
> ALTER INDEX index_name ENABLE;
> ALTER INDEX index_name DISABLE;
>
> A disabled index would still receive updates and enforce constraints as usual but would not be used for queries. This allows users to assess whether an index impacts query performance before deciding to drop it entirely.
I personally think having some way to alter an index to stop it from
being used in query plans would be very useful for the reasons you
mentioned. I don't have any arguments against the syntax you've
proposed. We'd certainly have to clearly document that constraints
are still enforced. Perhaps there is some other syntax which would
self-document slightly better. I just can't think of it right now.
Thank you and likewise. I was thinking of piggy backing off of VALID / NOT VALID, but that might have similar issues (if not more confusion) to the current proposed syntax. Will be sure to update the documentation.
> Implementation:
> To keep this simple, I suggest toggling the indisvalid flag in pg_index during the enable/disable operation.
That's not a good idea as it would allow ALTER INDEX ... ENABLE; to be
used to make valid a failed concurrently created index. I think this
would need a new flag and everywhere in the planner would need to be
adjusted to ignore indexes when that flag is false.
That is a great call and I wasn't thinking of the semantics with the existing usage of concurrently created indexes.
> Additional Considerations:
> - Keeping the index up-to-date while it’s disabled seems preferable, as it avoids the need to rebuild the index if it’s re-enabled later. The alternative would be dropping and rebuilding the index upon re-enabling, which I believe would introduce additional overhead in terms of application logic & complexity.
I think the primary use case here is to assist in dropping useless
indexes in a way that can very quickly be undone if the index is more
useful than thought. If you didn't keep the index up-to-date then that
would make the feature useless for that purpose.
+1
If we get the skip scan feature for PG18, then there's likely going to
be lots of people with indexes that they might want to consider
removing after upgrading. Maybe this is a good time to consider this
feature as it possibly won't ever be more useful than it will be after
we get skip scans.
David
Thank you for the feedback again, I will look into the changes required and accordingly propose a PATCH.
Kind Regards,
Shayon Mukherjee
On Tue, Sep 10, 2024 at 10:16:34AM +1200, David Rowley wrote: > I think the primary use case here is to assist in dropping useless > indexes in a way that can very quickly be undone if the index is more > useful than thought. If you didn't keep the index up-to-date then that > would make the feature useless for that purpose. > > If we get the skip scan feature for PG18, then there's likely going to > be lots of people with indexes that they might want to consider > removing after upgrading. Maybe this is a good time to consider this > feature as it possibly won't ever be more useful than it will be after > we get skip scans. +1, this is something I've wanted for some time. There was some past discussion, too [0]. [0] https://postgr.es/m/flat/ed8c9ed7-bb5d-aaec-065b-ad4893645deb%402ndQuadrant.com -- nathan
On Wed, 11 Sept 2024 at 03:12, Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Tue, Sep 10, 2024 at 10:16:34AM +1200, David Rowley wrote: > > If we get the skip scan feature for PG18, then there's likely going to > > be lots of people with indexes that they might want to consider > > removing after upgrading. Maybe this is a good time to consider this > > feature as it possibly won't ever be more useful than it will be after > > we get skip scans. > > +1, this is something I've wanted for some time. There was some past > discussion, too [0]. > > [0] https://postgr.es/m/flat/ed8c9ed7-bb5d-aaec-065b-ad4893645deb%402ndQuadrant.com Thanks for digging that up. I'd forgotten about that. I see there was pushback from having this last time, which is now over 6 years ago. In the meantime, we still have nothing to make this easy for people. I think the most important point I read in that thread is [1]. Maybe what I mentioned in [2] is a good workaround. Additionally, I think there will need to be syntax in CREATE INDEX for this. Without that pg_get_indexdef() might return SQL that does not reflect the current state of the index. MySQL seems to use "CREATE INDEX name ON table (col) [VISIBLE|INVISIBLE]". David [1] https://www.postgresql.org/message-id/20180618215635.m5vrnxdxhxytvmcm%40alap3.anarazel.de [2] https://www.postgresql.org/message-id/CAKJS1f_L7y_BTGESp5Qd6BSRHXP0mj3x9O9C_U27GU478UwpBw%40mail.gmail.com
On Mon, 23 Sept 2024 at 05:43, Shayon Mukherjee <shayonj@gmail.com> wrote: > - Modified get_index_paths() and build_index_paths() to exclude disabled > indexes from consideration during query planning. There are quite a large number of other places you also need to modify. Here are 2 places where the index should be ignored but isn't: 1. expression indexes seem to still be used for statistical estimations: create table b as select generate_series(1,1000)b; create index on b((b%10)); analyze b; explain select distinct b%10 from b; -- HashAggregate (cost=23.00..23.12 rows=10 width=4) alter index b_expr_idx disable; explain select distinct b%10 from b; -- HashAggregate (cost=23.00..23.12 rows=10 width=4) <-- should be 1000 rows drop index b_expr_idx; explain select distinct b%10 from b; -- HashAggregate (cost=23.00..35.50 rows=1000 width=4) 2. Indexes seem to still be used for join removals. create table c (c int primary key); explain select c1.* from c c1 left join c c2 on c1.c=c2.c; -- correctly removes join. alter index c_pkey disable; explain select c1.* from c c1 left join c c2 on c1.c=c2.c; -- should not remove join. Please carefully look over all places that RelOptInfo.indexlist is looked at and consider skipping disabled indexes. Please also take time to find SQL that exercises each of those places so you can verify that the behaviour is correct after your change. This is also a good way to learn exactly all cases where indexes are used. Using this method would have led you to find places like rel_supports_distinctness(), where you should be skipping disabled indexes. The planner should not be making use of disabled indexes for any optimisations at all. > - catversion.h is updated with a new CATALOG_VERSION_NO to reflect change in pg_index > schema. Please leave that up to the committer. Patch authors doing this just results in the patch no longer applying as soon as someone commits a version bump. Also, please get rid of these notices. The command tag serves that purpose. It's not interesting that the index is already disabled. # alter index a_pkey disable; NOTICE: index "a_pkey" is now disabled ALTER INDEX # alter index a_pkey disable; NOTICE: index "a_pkey" is already disabled ALTER INDEX I've only given the code a very quick glance. I don't quite understand why you're checking the index is enabled in create_index_paths() and get_index_paths(). I think the check should be done only in create_index_paths(). Primarily, you'll find code such as "if (index->indpred != NIL && !index->predOK)" in the locations you need to consider skipping the disabled index. I think your new code should be located very close to those places or perhaps within the same if condition unless it makes it overly complex for the human reader. I think the documents should also mention that disabling an index is a useful way to verify an index is not being used before dropping it as the index can be enabled again at the first sign that performance has been effected. (It might also be good to mention that checking pg_stat_user_indexes.idx_scan should be the first port of call when checking for unused indexes) David
Hi David, Thank you so much for the review and pointers. I totally missed expression indexes. I am going to do another proper passalong with your feedback and follow up with an updated patch and any questions. Excited to be learning so much about the internals. Shayon > On Sep 22, 2024, at 6:44 PM, David Rowley <dgrowleyml@gmail.com> wrote: > > On Mon, 23 Sept 2024 at 05:43, Shayon Mukherjee <shayonj@gmail.com> wrote: >> - Modified get_index_paths() and build_index_paths() to exclude disabled >> indexes from consideration during query planning. > > There are quite a large number of other places you also need to modify. > > Here are 2 places where the index should be ignored but isn't: > > 1. expression indexes seem to still be used for statistical estimations: > > create table b as select generate_series(1,1000)b; > create index on b((b%10)); > analyze b; > explain select distinct b%10 from b; > -- HashAggregate (cost=23.00..23.12 rows=10 width=4) > > alter index b_expr_idx disable; > explain select distinct b%10 from b; > -- HashAggregate (cost=23.00..23.12 rows=10 width=4) <-- should be 1000 rows > > drop index b_expr_idx; > explain select distinct b%10 from b; > -- HashAggregate (cost=23.00..35.50 rows=1000 width=4) > > 2. Indexes seem to still be used for join removals. > > create table c (c int primary key); > explain select c1.* from c c1 left join c c2 on c1.c=c2.c; -- > correctly removes join. > alter index c_pkey disable; > explain select c1.* from c c1 left join c c2 on c1.c=c2.c; -- should > not remove join. > > Please carefully look over all places that RelOptInfo.indexlist is > looked at and consider skipping disabled indexes. Please also take > time to find SQL that exercises each of those places so you can verify > that the behaviour is correct after your change. This is also a good > way to learn exactly all cases where indexes are used. Using this > method would have led you to find places like > rel_supports_distinctness(), where you should be skipping disabled > indexes. > > The planner should not be making use of disabled indexes for any > optimisations at all. > >> - catversion.h is updated with a new CATALOG_VERSION_NO to reflect change in pg_index >> schema. > > Please leave that up to the committer. Patch authors doing this just > results in the patch no longer applying as soon as someone commits a > version bump. > > Also, please get rid of these notices. The command tag serves that > purpose. It's not interesting that the index is already disabled. > > # alter index a_pkey disable; > NOTICE: index "a_pkey" is now disabled > ALTER INDEX > # alter index a_pkey disable; > NOTICE: index "a_pkey" is already disabled > ALTER INDEX > > I've only given the code a very quick glance. I don't quite understand > why you're checking the index is enabled in create_index_paths() and > get_index_paths(). I think the check should be done only in > create_index_paths(). Primarily, you'll find code such as "if > (index->indpred != NIL && !index->predOK)" in the locations you need > to consider skipping the disabled index. I think your new code should > be located very close to those places or perhaps within the same if > condition unless it makes it overly complex for the human reader. > > I think the documents should also mention that disabling an index is a > useful way to verify an index is not being used before dropping it as > the index can be enabled again at the first sign that performance has > been effected. (It might also be good to mention that checking > pg_stat_user_indexes.idx_scan should be the first port of call when > checking for unused indexes) > > David
On 09.09.24 23:38, Shayon Mukherjee wrote: > *Problem*: > Adding and removing indexes is a common operation in PostgreSQL. On > larger databases, however, these operations can be resource-intensive. > When evaluating the performance impact of one or more indexes, dropping > them might not be ideal since as a user you may want a quicker way to > test their effects without fully committing to removing & adding them > back again. Which can be a time taking operation on larger tables. > > *Proposal*: > I propose adding an ALTER INDEX command that allows for enabling or > disabling an index globally. This could look something like: > > ALTER INDEX index_name ENABLE; > ALTER INDEX index_name DISABLE; > > A disabled index would still receive updates and enforce constraints as > usual but would not be used for queries. This allows users to assess > whether an index impacts query performance before deciding to drop it > entirely. I think a better approach would be to make the list of disabled indexes a GUC setting, which would then internally have an effect similar to enable_indexscan, meaning it would make the listed indexes unattractive to the planner. This seems better than the proposed DDL command, because you'd be able to use this per-session, instead of forcing a global state, and even unprivileged users could use it. (I think we have had proposals like this before, but I can't find the discussion I'm thinking of right now.)
That's a good point. +1 for the idea of the GUC setting, especially since, as you mentioned, it allows unprivileged users to access it and beingper-session.. I am happy to draft a patch for this as well. I think I have a working idea so far of where the necessary checks might go.However if you don’t mind, can you elaborate further on how the effect would be similar to enable_indexscan? I was thinking we could introduce a new GUC option called `disabled_indexes` and perform a check against in all places foreach index being considered with its OID via get_relname_relid through a helper function in the various places we needto prompt the planner to not use the index (like in indxpath.c as an example). Curious to learn if you have a different approach in mind perhaps? Thank you, Shayon > On Sep 23, 2024, at 11:14 AM, Peter Eisentraut <peter@eisentraut.org> wrote: > > On 09.09.24 23:38, Shayon Mukherjee wrote: >> *Problem*: >> Adding and removing indexes is a common operation in PostgreSQL. On larger databases, however, these operations can beresource-intensive. When evaluating the performance impact of one or more indexes, dropping them might not be ideal sinceas a user you may want a quicker way to test their effects without fully committing to removing & adding them back again.Which can be a time taking operation on larger tables. >> *Proposal*: >> I propose adding an ALTER INDEX command that allows for enabling or disabling an index globally. This could look somethinglike: >> ALTER INDEX index_name ENABLE; >> ALTER INDEX index_name DISABLE; >> A disabled index would still receive updates and enforce constraints as usual but would not be used for queries. Thisallows users to assess whether an index impacts query performance before deciding to drop it entirely. > > I think a better approach would be to make the list of disabled indexes a GUC setting, which would then internally havean effect similar to enable_indexscan, meaning it would make the listed indexes unattractive to the planner. > > This seems better than the proposed DDL command, because you'd be able to use this per-session, instead of forcing a globalstate, and even unprivileged users could use it. > > (I think we have had proposals like this before, but I can't find the discussion I'm thinking of right now.) >
I found an old thread here [0].
Also, a question: If we go with the GUC approach, how do we expect `pg_get_indexdef` to behave?
I suppose it would behave no differently than it otherwise would, because there's no new SQL grammar to support and, given its GUC status, it seems reasonable that `pg_get_indexdef` doesn’t reflect whether an index is enabled or not.
If so, then I wonder if using a dedicated `ALTER` command and keeping the state in `pg_index` would be better for consistency's sake?
Thank you
Shayon
On Sep 23, 2024, at 4:51 PM, Shayon Mukherjee <shayonj@gmail.com> wrote:That's a good point.
+1 for the idea of the GUC setting, especially since, as you mentioned, it allows unprivileged users to access it and being per-session..
I am happy to draft a patch for this as well. I think I have a working idea so far of where the necessary checks might go. However if you don’t mind, can you elaborate further on how the effect would be similar to enable_indexscan?
I was thinking we could introduce a new GUC option called `disabled_indexes` and perform a check against in all places for each index being considered with its OID via get_relname_relid through a helper function in the various places we need to prompt the planner to not use the index (like in indxpath.c as an example).
Curious to learn if you have a different approach in mind perhaps?
Thank you,
ShayonOn Sep 23, 2024, at 11:14 AM, Peter Eisentraut <peter@eisentraut.org> wrote:
On 09.09.24 23:38, Shayon Mukherjee wrote:*Problem*:
Adding and removing indexes is a common operation in PostgreSQL. On larger databases, however, these operations can be resource-intensive. When evaluating the performance impact of one or more indexes, dropping them might not be ideal since as a user you may want a quicker way to test their effects without fully committing to removing & adding them back again. Which can be a time taking operation on larger tables.
*Proposal*:
I propose adding an ALTER INDEX command that allows for enabling or disabling an index globally. This could look something like:
ALTER INDEX index_name ENABLE;
ALTER INDEX index_name DISABLE;
A disabled index would still receive updates and enforce constraints as usual but would not be used for queries. This allows users to assess whether an index impacts query performance before deciding to drop it entirely.
I think a better approach would be to make the list of disabled indexes a GUC setting, which would then internally have an effect similar to enable_indexscan, meaning it would make the listed indexes unattractive to the planner.
This seems better than the proposed DDL command, because you'd be able to use this per-session, instead of forcing a global state, and even unprivileged users could use it.
(I think we have had proposals like this before, but I can't find the discussion I'm thinking of right now.)
On Tue, 24 Sept 2024 at 03:14, Peter Eisentraut <peter@eisentraut.org> wrote: > > On 09.09.24 23:38, Shayon Mukherjee wrote: > > ALTER INDEX index_name ENABLE; > > ALTER INDEX index_name DISABLE; > > I think a better approach would be to make the list of disabled indexes > a GUC setting, which would then internally have an effect similar to > enable_indexscan, meaning it would make the listed indexes unattractive > to the planner. I understand the last discussion went down that route too. For me, it seems strange that adding some global variable is seen as cleaner than storing the property in the same location as all the other index properties. How would you ensure no cached plans are still using the index after changing the GUC? > This seems better than the proposed DDL command, because you'd be able > to use this per-session, instead of forcing a global state, and even > unprivileged users could use it. That's true. > (I think we have had proposals like this before, but I can't find the > discussion I'm thinking of right now.) I think it's the one that was already linked by Nathan. [1]? The GUC seems to have been first suggested on the same thread in [2]. David [1] https://www.postgresql.org/message-id/ed8c9ed7-bb5d-aaec-065b-ad4893645deb%402ndQuadrant.com [2] https://www.postgresql.org/message-id/29800.1529359024%40sss.pgh.pa.us
On Mon, Sep 23, 2024 at 8:31 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 24 Sept 2024 at 03:14, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 09.09.24 23:38, Shayon Mukherjee wrote:
> > ALTER INDEX index_name ENABLE;
> > ALTER INDEX index_name DISABLE;
>
> I think a better approach would be to make the list of disabled indexes
> a GUC setting, which would then internally have an effect similar to
> enable_indexscan, meaning it would make the listed indexes unattractive
> to the planner.
I understand the last discussion went down that route too. For me, it
seems strange that adding some global variable is seen as cleaner than
storing the property in the same location as all the other index
properties.
That was my first instinct as well. Although, being able to control this setting on a per session level and as an unprivileged user is somewhat attractive.
How would you ensure no cached plans are still using the index after
changing the GUC?
Could we call ResetPlanCache() to invalidate all plan caches from the assign_ hook on GUC when it's set (and doesn't match the old value). Something like this (assuming the GUC is called `disabled_indexes`)
void
assign_disabled_indexes(const char *newval, void *extra)
{
if (disabled_indexes != newval)
ResetPlanCache();
}
assign_disabled_indexes(const char *newval, void *extra)
{
if (disabled_indexes != newval)
ResetPlanCache();
}
A bit heavy-handed, but perhaps it's OK, since it's not meant to be used frequently also ?
> This seems better than the proposed DDL command, because you'd be able
> to use this per-session, instead of forcing a global state, and even
> unprivileged users could use it.
That's true.
> (I think we have had proposals like this before, but I can't find the
> discussion I'm thinking of right now.)
I think it's the one that was already linked by Nathan. [1]? The GUC
seems to have been first suggested on the same thread in [2].
David
[1] https://www.postgresql.org/message-id/ed8c9ed7-bb5d-aaec-065b-ad4893645deb%402ndQuadrant.com
[2] https://www.postgresql.org/message-id/29800.1529359024%40sss.pgh.pa.us
If one of the use cases is soft-dropping indexes, would a GUC approach still support that? ALTER TABLE?
On 23.09.24 22:51, Shayon Mukherjee wrote: > I am happy to draft a patch for this as well. I think I have a working > idea so far of where the necessary checks might go. However if you don’t > mind, can you elaborate further on how the effect would be similar to > enable_indexscan? Planner settings like enable_indexscan used to just add a large number (disable_cost) to the estimated plan node costs. It's a bit more sophisticated in PG17. But in any case, I imagine "disabling an index" could use the same mechanism. Or maybe not, maybe the setting would just completely ignore the index.
On 24.09.24 02:30, David Rowley wrote: > I understand the last discussion went down that route too. For me, it > seems strange that adding some global variable is seen as cleaner than > storing the property in the same location as all the other index > properties. It's arguably not actually a property of the index, it's a property of the user's session. Like, kind of, the search path is a session property, not a property of a schema. > How would you ensure no cached plans are still using the index after > changing the GUC? Something for the patch author to figure out. ;-)
Peter Eisentraut <peter@eisentraut.org> writes: > On 23.09.24 22:51, Shayon Mukherjee wrote: >> I am happy to draft a patch for this as well. I think I have a working >> idea so far of where the necessary checks might go. However if you don’t >> mind, can you elaborate further on how the effect would be similar to >> enable_indexscan? > Planner settings like enable_indexscan used to just add a large number > (disable_cost) to the estimated plan node costs. It's a bit more > sophisticated in PG17. But in any case, I imagine "disabling an index" > could use the same mechanism. Or maybe not, maybe the setting would > just completely ignore the index. Yeah, I'd be inclined to implement this by having create_index_paths just not make any paths for rejected indexes. Or you could back up another step and keep plancat.c from building IndexOptInfos for them. The latter might have additional effects, such as preventing the plan from relying on a uniqueness condition enforced by the index. Not clear to me if that's desirable or not. [ thinks... ] One good reason for implementing it in plancat.c is that you'd have the index relation open and be able to see its name for purposes of matching to the filter. Anywhere else, getting the name would involve additional overhead. regards, tom lane
Thank you for the historical context and working, I understand what you were referring to before now. Shayon > On Sep 24, 2024, at 2:08 PM, Peter Eisentraut <peter@eisentraut.org> wrote: > > On 23.09.24 22:51, Shayon Mukherjee wrote: >> I am happy to draft a patch for this as well. I think I have a working >> idea so far of where the necessary checks might go. However if you don’t >> mind, can you elaborate further on how the effect would be similar to >> enable_indexscan? > > Planner settings like enable_indexscan used to just add a large number (disable_cost) to the estimated plan node costs. It's a bit more sophisticated in PG17. But in any case, I imagine "disabling an index" could use the same mechanism. Or maybe not, maybe the setting would just completely ignore the index.
Hello,
Also added this as a post in Commit Fest [0]
Thank you
Shayon
On Sep 26, 2024, at 1:39 PM, Shayon Mukherjee <shayonj@gmail.com> wrote:Hello,
I am back with a PATCH :). Thanks to everyone in the threads for all the helpful discussions.
This proposal is for a PATCH to introduce a GUC variable to disable specific indexes during query planning.
This is an alternative approach to the previous PATCH I had proposed and is improved upon after some of the recent discussions in the thread. The PATCH contains the relevant changes, regression tests, and documentation.
I went with the GUC approach to introduce a way for a user to disable indexes during query planning over dedicated SQL Grammar and introducing the `isenabled` attribute in `pg_index` for the following reasons:
- Inspired by the discussions brought in earlier about this setting being something that unprivileged users can benefit from versus an ALTER statement.
- A GUC variable felt more closely aligned with the query tuning purpose, which this feature would serve, over index maintenance, the state of which is more closely reflected in `pg_index`.
Implementation details:
The patch introduces a new GUC parameter `disabled_indexes` that allows users to specify a comma-separated list of indexes to be ignored during query planning. Key aspects:
- Adds a new `isdisabled` attribute to the `IndexOptInfo` structure.
- Modifies `get_relation_info` in `plancat.c` to skip disabled indexes entirely, thus reducing the number of places we need to check if an index is disabled or not.
- Implements GUC hooks for parameter validation and assignment.
- Resets the plan cache when the `disabled_indexes` list is modified through `ResetPlanCache()`
I chose to modify the logic within `get_relation_info` as compared to, say, reducing the cost to make the planner not consider an index during planning, mostly to keep the number of changes being introduced to a minimum and also the logic itself being self-contained and easier to under perhaps (?).
As mentioned before, this does not impact the building of the index. That still happens.
I have added regression tests for:
- Basic single-column and multi-column indexes
- Partial indexes
- Expression indexes
- Join indexes
- GIN and GiST indexes
- Covering indexes
- Range indexes
- Unique indexes and constraints
I'd love to hear any feedback on the proposed PATCH and also the overall approach.
<v1-0001-Ability-to-enable-disable-indexes-through-GUC.patch>On Sep 24, 2024, at 9:19 AM, Shayon Mukherjee <shayonj@gmail.com> wrote:
--
Kind Regards,
Shayon Mukherjee
<v1-0001-Proof-of-Concept-Ability-to-enable-disable-indexe.patch>
On Mon, Sep 23, 2024 at 11:14 AM Peter Eisentraut <peter@eisentraut.org> wrote: > I think a better approach would be to make the list of disabled indexes > a GUC setting, which would then internally have an effect similar to > enable_indexscan, meaning it would make the listed indexes unattractive > to the planner. > > This seems better than the proposed DDL command, because you'd be able > to use this per-session, instead of forcing a global state, and even > unprivileged users could use it. > > (I think we have had proposals like this before, but I can't find the > discussion I'm thinking of right now.) I feel like a given user could want either one of these things. If you've discovered that a certain index is causing your production application to pick the wrong index, disabling it and thereby affecting all backends is what you want. If you're trying to experiment with different query plans without changing anything for other backends, being able to set some session-local state is better. I don't understand the argument that one of these is categorically better than the other. -- Robert Haas EDB: http://www.enterprisedb.com
> On Oct 7, 2024, at 4:52 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > On Mon, Sep 23, 2024 at 11:14 AM Peter Eisentraut <peter@eisentraut.org> wrote: >> I think a better approach would be to make the list of disabled indexes >> a GUC setting, which would then internally have an effect similar to >> enable_indexscan, meaning it would make the listed indexes unattractive >> to the planner. >> >> This seems better than the proposed DDL command, because you'd be able >> to use this per-session, instead of forcing a global state, and even >> unprivileged users could use it. >> >> (I think we have had proposals like this before, but I can't find the >> discussion I'm thinking of right now.) > > I feel like a given user could want either one of these things. If > you've discovered that a certain index is causing your production > application to pick the wrong index, disabling it and thereby > affecting all backends is what you want. If you're trying to > experiment with different query plans without changing anything for > other backends, being able to set some session-local state is better. > I don't understand the argument that one of these is categorically > better than the other. Makes sense to me and it’s something I am somewhat split on as well. I suppose with a GUC you can still do some thing like ALTER USER foobar SET disabled_indexes to ‘idx_test_table_id’ [thinking…] This way all new sessions will start to not consider the index when query planning. Of course it does not helpexisting sessions, so one may need to kill those backends, which could be heavy handed. Both these options clearly serve slightly different purposes with good pros and I am currently thinking if GUC is that goodmiddle ground solution. Curious if someone has a stronger opinion on which one of these might make more sense perhaps :-D. [thinking…] Unless - we try to do support both a GUC and the ALTER INDEX ENABLE/DISABLE grammar + isdisabled attribute onpg_index? I can see that both implementations (GUC and the new attribute on pg_index via ALTER) have the primary logic managed by `get_relation_info`in `plancat.c`. Here, we set `isdisabled` (new attribute) on `IndexOptInfo` and compare it against `disabled_indexes`in the GUC (from the previous GUC patch). Similarly, for `pg_index`, which is already open in `get_relation_info`,we can read from `pg_index.isdisabled` and accordingly update `IndexOptInfo.isdisabled`. [0] https://www.postgresql.org/message-id/6CE345C1-6FFD-4E4C-8775-45DA659C57CF@gmail.com Thanks Shayon
On Wed, 9 Oct 2024 at 20:07, Shayon Mukherjee <shayonj@gmail.com> wrote: > [thinking…] Unless - we try to do support both a GUC and the ALTER INDEX ENABLE/DISABLE grammar + isdisabled attributeon pg_index? I just wanted to explain my point of view on this. This is my opinion and is by no means authoritative. I was interested in this patch when you proposed it as an ALTER INDEX option. I know other committers seem interested, but I personally don't have any interest in the GUC option. I think the reason I dislike it is that it's yet another not even half-baked take on planner hints (the other one being enable* GUCs). I often thought that if we ever did planner hints that it would be great to have multiple ways to specify the hints. Ordinarily, I'd expect some special comment type as the primary method to specify hints, but equally, it would be nice to be able to specify them in other ways. e.g. a GUC to have them apply to more than just 1 query. Useful for things such as "don't use index X". Now, I'm not suggesting you go off and code up planner hints. That's a huge project. I'm just concerned that we've already got a fair bit of cruft that will be left remaining if we ever get core planner hints and a disabled_indexes GUC will just add to that. I don't feel like the ALTER INDEX method would be leftover cruft from us gaining core planner hints. Others might feel differently on that one. I feel the ALTER INDEX option is less controversial. I'll also stand by what I said earlier on this thread. If PeterG gets index skip scans done for PG18, then it's likely there's going to be lots of users considering if they still need a certain index or not after upgrading to PG18. David
On Wed, Oct 9, 2024 at 4:19 AM David Rowley <dgrowleyml@gmail.com> wrote: > On Wed, 9 Oct 2024 at 20:07, Shayon Mukherjee <shayonj@gmail.com> wrote: > > [thinking…] Unless - we try to do support both a GUC and the ALTER INDEX ENABLE/DISABLE grammar + isdisabled attributeon pg_index? > > I just wanted to explain my point of view on this. This is my opinion > and is by no means authoritative. > > I was interested in this patch when you proposed it as an ALTER INDEX > option. I know other committers seem interested, but I personally > don't have any interest in the GUC option. I think the reason I > dislike it is that it's yet another not even half-baked take on > planner hints (the other one being enable* GUCs). I often thought that > if we ever did planner hints that it would be great to have multiple > ways to specify the hints. Ordinarily, I'd expect some special comment > type as the primary method to specify hints, but equally, it would be > nice to be able to specify them in other ways. e.g. a GUC to have them > apply to more than just 1 query. Useful for things such as "don't use > index X". +1. A GUC can be done as a contrib module using existing hooks, and I think that's already been done outside of core, perhaps multiple times. That certainly doesn't mean we CAN'T add it as an in-core feature, but I do think "yet another not even half-baked take on planner hints" is a fair description. What I would personally like to see is for us to ship one or possibly more than one contrib module that let people do hint-like things in useful ways, and this could be a part of that. But I think we need better infrastructure for controlling the planner behavior first, hence the "allowing extensions to control planner behavior" thread. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Oct 9, 2024 at 1:41 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Oct 9, 2024 at 4:19 AM David Rowley <dgrowleyml@gmail.com> wrote:
> On Wed, 9 Oct 2024 at 20:07, Shayon Mukherjee <shayonj@gmail.com> wrote:
> > [thinking…] Unless - we try to do support both a GUC and the ALTER INDEX ENABLE/DISABLE grammar + isdisabled attribute on pg_index?
>
> I just wanted to explain my point of view on this. This is my opinion
> and is by no means authoritative.
>
> I was interested in this patch when you proposed it as an ALTER INDEX
> option. I know other committers seem interested, but I personally
> don't have any interest in the GUC option. I think the reason I
> dislike it is that it's yet another not even half-baked take on
> planner hints (the other one being enable* GUCs). I often thought that
> if we ever did planner hints that it would be great to have multiple
> ways to specify the hints. Ordinarily, I'd expect some special comment
> type as the primary method to specify hints, but equally, it would be
> nice to be able to specify them in other ways. e.g. a GUC to have them
> apply to more than just 1 query. Useful for things such as "don't use
> index X".
+1. A GUC can be done as a contrib module using existing hooks, and I
think that's already been done outside of core, perhaps multiple
times. That certainly doesn't mean we CAN'T add it as an in-core
feature, but I do think "yet another not even half-baked take on
planner hints" is a fair description. What I would personally like to
see is for us to ship one or possibly more than one contrib module
that let people do hint-like things in useful ways, and this could be
a part of that. But I think we need better infrastructure for
controlling the planner behavior first, hence the "allowing extensions
to control planner behavior" thread.
What's the strategy here in this discussion?
This topic is older than PostgreSQL itself - everytime WE talk about so-called "hints" we see procrastination in the name of trademarks.
Lack of definition of what can and can't be done with hooks and what is the infra-estructural code that is necessary to allow it from core.
Take for example the need of disabling an index. What does it mean practically and for which component of the code?
You are going to disable the index but not the update of it? Why? Does it imply that when you are going to re-enable it you are going to recreate it?
So that's more observed from the point of syntax and facilities.
Also distributed into partitions and why not the opposite: the creation of a global index for all partitions. Also in discussion elsewhere.
Otherwise it will appear that people will need hints and contribs to "outsource" the main role of what is strategic to the team to companies elsewhere.
Regards,
Vinícius
Hi David,
Answered below
Thank you so much this context, as someone new to psql-hackers, having this insight is super useful. Also getting a sense of how folks feel about controlling different behaviors like planner hints through GUC and SQL grammar.
On Oct 9, 2024, at 9:19 AM, David Rowley <dgrowleyml@gmail.com> wrote:On Wed, 9 Oct 2024 at 20:07, Shayon Mukherjee <shayonj@gmail.com> wrote:[thinking…] Unless - we try to do support both a GUC and the ALTER INDEX ENABLE/DISABLE grammar + isdisabled attribute on pg_index?
I just wanted to explain my point of view on this. This is my opinion
and is by no means authoritative.
I was interested in this patch when you proposed it as an ALTER INDEX
option. I know other committers seem interested, but I personally
don't have any interest in the GUC option. I think the reason I
dislike it is that it's yet another not even half-baked take on
planner hints (the other one being enable* GUCs). I often thought that
if we ever did planner hints that it would be great to have multiple
ways to specify the hints. Ordinarily, I'd expect some special comment
type as the primary method to specify hints, but equally, it would be
nice to be able to specify them in other ways. e.g. a GUC to have them
apply to more than just 1 query. Useful for things such as "don't use
index X".
For instance: I wasn’t quite able to figure out the how to properly distinguish + reason between the enable* GUCs and ALTER index for this case, and patches are per my limited understand of the historical context as well.
Now, I'm not suggesting you go off and code up planner hints. That's a
huge project. I'm just concerned that we've already got a fair bit of
cruft that will be left remaining if we ever get core planner hints
and a disabled_indexes GUC will just add to that. I don't feel like
the ALTER INDEX method would be leftover cruft from us gaining core
planner hints. Others might feel differently on that one. I feel the
ALTER INDEX option is less controversial.
I'll also stand by what I said earlier on this thread. If PeterG gets
index skip scans done for PG18, then it's likely there's going to be
lots of users considering if they still need a certain index or not
after upgrading to PG18.
Likewise, I personally feel that the ability to disable indexes quickly and reverse the disabling (also quickly) is super useful, especially from an operational POV (point of view). So, I am very keen on getting this landed and happy to iterate on as many patches as it takes. :D
At this point, I am indifferent to each of the approaches (GUC or SQL grammar) based on the pros/cons I shared earlier in the thread & discussions in the thread. However, I would like us to make progress on getting _something_ out since the topic of disabling indexes has come up many times on pgsql-hackers in the past years and there is no easy way to toggle this behavior yet.
“yet another not even half-baked take on planner hints" is a good way to put things about enable* GUCs, so I am very much on board with proposing an updated PATCH to support disabling of indexes through ALTER. The original PATCH was here for context [1].
I am also curious about supporting this ([1]) through the ALTER grammar and not having the planner consider indexes by updating `get_relation_info` in `plancat.c`. Basically, through `pg_index.isdisabled`, which is already open in `get_relation_info`, we can read from `pg_index.isdisabled` and accordingly update `IndexOptInfo.isdisabled`. So, I'm happy to explore that as well and share my findings.
Thanks
Shayon
On Oct 9, 2024, at 1:41 PM, Robert Haas <robertmhaas@gmail.com> wrote:On Wed, Oct 9, 2024 at 4:19 AM David Rowley <dgrowleyml@gmail.com> wrote:On Wed, 9 Oct 2024 at 20:07, Shayon Mukherjee <shayonj@gmail.com> wrote:[thinking…] Unless - we try to do support both a GUC and the ALTER INDEX ENABLE/DISABLE grammar + isdisabled attribute on pg_index?
I just wanted to explain my point of view on this. This is my opinion
and is by no means authoritative.
I was interested in this patch when you proposed it as an ALTER INDEX
option. I know other committers seem interested, but I personally
don't have any interest in the GUC option. I think the reason I
dislike it is that it's yet another not even half-baked take on
planner hints (the other one being enable* GUCs). I often thought that
if we ever did planner hints that it would be great to have multiple
ways to specify the hints. Ordinarily, I'd expect some special comment
type as the primary method to specify hints, but equally, it would be
nice to be able to specify them in other ways. e.g. a GUC to have them
apply to more than just 1 query. Useful for things such as "don't use
index X".
+1. A GUC can be done as a contrib module using existing hooks, and I
think that's already been done outside of core, perhaps multiple
times. That certainly doesn't mean we CAN'T add it as an in-core
feature, but I do think "yet another not even half-baked take on
planner hints" is a fair description. What I would personally like to
see is for us to ship one or possibly more than one contrib module
that let people do hint-like things in useful ways, and this could be
a part of that. But I think we need better infrastructure for
controlling the planner behavior first, hence the "allowing extensions
to control planner behavior" thread.
- Support disabling of indexes [1] through ALTER command
- While also building on "allowing extensions to control planner behavior” for the reasons above?
Thanks
Shayon
On Sat, Oct 12, 2024 at 5:56 AM Shayon Mukherjee <shayonj@gmail.com> wrote: > Thank you for sharing this Robert. I like the idea behind "allowing extensions to control planner behavior” overall andI think it does help towards a powerful extension ecosystem too. I wonder if there is a reality where we can achieve boththe outcomes here > > - Support disabling of indexes [1] through ALTER command > - While also building on "allowing extensions to control planner behavior” for the reasons above? > > [1] https://www.postgresql.org/message-id/ABD42A12-4DCF-4EE4-B903-4C657903CECE%40gmail.com Yes, I think we can do both things. -- Robert Haas EDB: http://www.enterprisedb.com
On Sat, 12 Oct 2024 at 22:41, Vinícius Abrahão <vinnix.bsd@gmail.com> wrote: > You are going to disable the index but not the update of it? Why? Does it imply that when you are going to re-enable ityou are going to recreate it? It might be worth you reading the discussion and proposed patches. I think either of those would answer your questions. I don't recall anyone ever proposing that re-enabling the index would result in it having to be rebuilt. If that was a requirement, then I'd say there does not seem much point in the feature. You might as well just drop the index and recreate it if you change your mind. David
On Wed, 16 Oct 2024 at 03:40, Robert Haas <robertmhaas@gmail.com> wrote: > > On Sat, Oct 12, 2024 at 5:56 AM Shayon Mukherjee <shayonj@gmail.com> wrote: > > Thank you for sharing this Robert. I like the idea behind "allowing extensions to control planner behavior” overall andI think it does help towards a powerful extension ecosystem too. I wonder if there is a reality where we can achieve boththe outcomes here > > > > - Support disabling of indexes [1] through ALTER command > > - While also building on "allowing extensions to control planner behavior” for the reasons above? > > > > [1] https://www.postgresql.org/message-id/ABD42A12-4DCF-4EE4-B903-4C657903CECE%40gmail.com > > Yes, I think we can do both things. I think so too. I imagine there'd be cases where even hints global to all queries running on the server wouldn't result in the index being completely disabled. For example, a physical replica might not be privy to the hints defined on the primary and it might just be the queries running on the physical replica that are getting the most use out of the given index. Having the change made in pg_index would mean physical replicas have the index disabled too. For the primary use case I have in mind (test disabling indexes you're considering dropping), having the disabledness replicate would be very useful. David
On 24.09.24 20:21, Tom Lane wrote: > Peter Eisentraut <peter@eisentraut.org> writes: >> On 23.09.24 22:51, Shayon Mukherjee wrote: >>> I am happy to draft a patch for this as well. I think I have a working >>> idea so far of where the necessary checks might go. However if you don’t >>> mind, can you elaborate further on how the effect would be similar to >>> enable_indexscan? > >> Planner settings like enable_indexscan used to just add a large number >> (disable_cost) to the estimated plan node costs. It's a bit more >> sophisticated in PG17. But in any case, I imagine "disabling an index" >> could use the same mechanism. Or maybe not, maybe the setting would >> just completely ignore the index. > > Yeah, I'd be inclined to implement this by having create_index_paths > just not make any paths for rejected indexes. Or you could back up > another step and keep plancat.c from building IndexOptInfos for them. > The latter might have additional effects, such as preventing the plan > from relying on a uniqueness condition enforced by the index. Not > clear to me if that's desirable or not. Yes, I think you'd want that, because one of the purposes of this feature would be to test whether it's okay to drop an index. So you don't want the planner to take any account of the index for its decisions.
On Wed, Oct 16, 2024 at 8:33 AM Peter Eisentraut <peter@eisentraut.org> wrote: > > Yeah, I'd be inclined to implement this by having create_index_paths > > just not make any paths for rejected indexes. Or you could back up > > another step and keep plancat.c from building IndexOptInfos for them. > > The latter might have additional effects, such as preventing the plan > > from relying on a uniqueness condition enforced by the index. Not > > clear to me if that's desirable or not. > > Yes, I think you'd want that, because one of the purposes of this > feature would be to test whether it's okay to drop an index. So you > don't want the planner to take any account of the index for its decisions. I think this is right. I think we want to avoid invalidating the index, so we still need to consider it in determining where HOT updates must be performed, but we don't want to "improve" the plan using the index if it's disabled. -- Robert Haas EDB: http://www.enterprisedb.com
On Oct 16, 2024, at 2:15 PM, Shayon Mukherjee <shayonj@gmail.com> wrote:On Oct 16, 2024, at 12:19 PM, Shayon Mukherjee <shayonj@gmail.com> wrote:- ALTER INDEX ... ENABLE/DISABLE performs an in-place update of the pg_indexcatalog to protect against indcheckxmin [2] (older unrelated thread).Performing the in place update of the pg_index row from ATExecEnableDisableIndex using systable_inplace_update_begin was failing in CI weirdly but not on my local MacBook when running spec suite. I am also wondering what is causing the requirement [1] to update the row in-place vs say using CatalogTupleUpdate since using the later is working well locally + CI?
I believe I somewhat understand why the issue might be occurring, where CI is failing the create_index regression test and crashing (signal 6 in postmaster logs). I suspect it might be due to a segmentation fault or a similar issue.
IsInplaceUpdateOid is used in systable_inplace_update_begin (recently introduced), but it doesn’t yet support pg_index. Based on check_lock_if_inplace_updateable_rel in heapam.c and IsInplaceUpdateOid in catalog.c, introducing support for in-place updates via this new mechanism might not be straightforward for pg_index. It would require updating the callers to handle in-place updates and locks, I presume?
I did confirm that, based on the v2 PATCH, when not doing in-place updates on pg_index - it means that the xmin for pg_index would increment. I suppose there’s a risk of indcheckxmin being marked as TRUE when xmin exceeds, potentially causing an index to be accidentally skipped ? (I am not 100% sure)
I'll take some time to think this through and familiarize myself with the new systable_inplace_update_begin. In the meantime, aside from the in-place updates on pg_index, I would love to learn any first impressions or feedback on the patch folks may have.
Thank you,
Shayon
Interesting idea.
This patch needs a rebase.
On Thu, 17 Oct 2024 at 15:59, Shayon Mukherjee <shayonj@gmail.com> wrote:
On Oct 16, 2024, at 6:01 PM, Shayon Mukherjee <shayonj@gmail.com> wrote:I'll take some time to think this through and familiarize myself with the new systable_inplace_update_begin. In the meantime, aside from the in-place updates on pg_index, I would love to learn any first impressions or feedback on the patch folks may have.My take away from whether or not an in-place update is needed on pg_index [1]- It’s unclear to me why it’s needed.- I understand the xmin would get incremented when using CatalogTupleUpdate to update indisenabled.- However, I haven’t been able to replicate any odd behavior locally or CI.- FWIW - REINDEX CONCURRENTLY (via index_swap), index_constraint_create and few other places perform CatalogTupleUpdate to update the relevant attributes as well.Based on the above summary and after my testing I would like to propose a v3 of the patch. The only difference between this and v1 [2] is that the update of pg_index row happens via CatalogTupleUpdate.Thank you for bearing with me on this :DShayon
Regards,
Rafia Sabih
Rafia Sabih
CYBERTEC PostgreSQL International GmbH
On Thu, Oct 17, 2024 at 9:59 AM Shayon Mukherjee <shayonj@gmail.com> wrote: > My take away from whether or not an in-place update is needed on pg_index [1] > > - It’s unclear to me why it’s needed. > - I understand the xmin would get incremented when using CatalogTupleUpdate to update indisenabled. > - However, I haven’t been able to replicate any odd behavior locally or CI. > - FWIW - REINDEX CONCURRENTLY (via index_swap), index_constraint_create and few other places perform CatalogTupleUpdateto update the relevant attributes as well. > > Based on the above summary and after my testing I would like to propose a v3 of the patch. The only difference betweenthis and v1 [2] is that the update of pg_index row happens via CatalogTupleUpdate. > > [1] https://www.postgresql.org/message-id/20180618215635.m5vrnxdxhxytvmcm@alap3.anarazel.de > [2] https://www.postgresql.org/message-id/EF2313B8-A017-4869-9B7F-A24EDD8795DE%40gmail.com In-place updates are generally bad news, so I think this patch shouldn't use them. However, someone will need to investigate whether that breaks the indcheckxmin thing that Andres mentions in your [1], and if it does, figure out what to do about it. Creating a test case to show the breakage would probably be a good first step, to frame the discussion. -- Robert Haas EDB: http://www.enterprisedb.com
> On Nov 5, 2024, at 10:55 AM, Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Oct 17, 2024 at 9:59 AM Shayon Mukherjee <shayonj@gmail.com> wrote: >> My take away from whether or not an in-place update is needed on pg_index [1] >> >> - It’s unclear to me why it’s needed. >> - I understand the xmin would get incremented when using CatalogTupleUpdate to update indisenabled. >> - However, I haven’t been able to replicate any odd behavior locally or CI. >> - FWIW - REINDEX CONCURRENTLY (via index_swap), index_constraint_create and few other places perform CatalogTupleUpdateto update the relevant attributes as well. >> >> Based on the above summary and after my testing I would like to propose a v3 of the patch. The only difference betweenthis and v1 [2] is that the update of pg_index row happens via CatalogTupleUpdate. >> >> [1] https://www.postgresql.org/message-id/20180618215635.m5vrnxdxhxytvmcm@alap3.anarazel.de >> [2] https://www.postgresql.org/message-id/EF2313B8-A017-4869-9B7F-A24EDD8795DE%40gmail.com > > In-place updates are generally bad news, so I think this patch > shouldn't use them. However, someone will need to investigate whether > that breaks the indcheckxmin thing that Andres mentions in your [1], > and if it does, figure out what to do about it. Creating a test case > to show the breakage would probably be a good first step, to frame the > discussion. Hello, Thank you for the guidance and tips. I was wondering if updating in-place is preferable or not, since my first instinct wasto avoid it. I did not notice any breakage last time, unless I was looking in the wrong place (possibly?). I did noticethe min update when a not-in-place update was performed, but I didn't notice any issues (as mentioned in [1]) fromit, via logs, index usage, or other common operations. Let me write up some more test cases to check if there is a riskof indexcheckxmin running out or other issues, and I'll get back here. Thank you Shayon [1] https://www.postgresql.org/message-id/20180618215635.m5vrnxdxhxytvmcm@alap3.anarazel.de > > -- > Robert Haas > EDB: http://www.enterprisedb.com
On Tue, 26 Nov 2024 at 11:34, Shayon Mukherjee <shayonj@gmail.com> wrote: > Thank you for the guidance and tips. I was wondering if updating in-place is preferable or not, since my first instinctwas to avoid it. I did not notice any breakage last time, unless I was looking in the wrong place (possibly?). Idid notice the min update when a not-in-place update was performed, but I didn't notice any issues (as mentioned in [1])from it, via logs, index usage, or other common operations. Let me write up some more test cases to check if there isa risk of indexcheckxmin running out or other issues, and I'll get back here. Another safer option could be to disallow the enable/disable ALTER TABLE if indcheckxmin is true. We do have and use ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE for these sorts of issues. There are other existing failure modes relating to indexes that can depend on what another session has done, e.g. MarkInheritDetached() can fail if another session detached an index concurrently. I could respect an argument that this one might be worse than that as its timing dependent rather than dependent on what another session has done. David