Proposal to Enable/Disable Index using ALTER INDEX
Hello hackers,
This is my first time posting here, and I’d like to propose a new feature
related to PostgreSQL indexes. If this idea resonates, I’d be happy to
follow up with a patch as well.
*Problem*:
Adding and removing indexes is a common operation in PostgreSQL. On larger
databases, however, these operations can be resource-intensive. When
evaluating the performance impact of one or more indexes, dropping them
might not be ideal since as a user you may want a quicker way to test their
effects without fully committing to removing & adding them back again.
Which can be a time taking operation on larger tables.
*Proposal*:
I propose adding an ALTER INDEX command that allows for enabling or
disabling an index globally. This could look something like:
ALTER INDEX index_name ENABLE;
ALTER INDEX index_name DISABLE;
A disabled index would still receive updates and enforce constraints as
usual but would not be used for queries. This allows users to assess
whether an index impacts query performance before deciding to drop it
entirely.
*Implementation*:
To keep this simple, I suggest toggling the indisvalid flag in pg_index
during the enable/disable operation.
*Additional Considerations*:
- Keeping the index up-to-date while it’s disabled seems preferable, as it
avoids the need to rebuild the index if it’s re-enabled later. The
alternative would be dropping and rebuilding the index upon re-enabling,
which I believe would introduce additional overhead in terms of application
logic & complexity.
- I am also proposing to reuse the existing indisvalid flag to avoid adding
new state and the maintenance that comes with it, but I’m open to feedback
if this approach has potential downsides.
- To keep the scope minimal for now, I propose that we only allow enabling
and disabling indexes globally, and not locally, by supporting it
exclusively in ALTER INDEX. I would love to know if this would break any
SQL grammar promises that I might be unaware of.
I would love to learn if this sounds like a good idea and how it can be
improved further. Accordingly, as a next step I would be very happy to
propose a patch as well.
Best regards,
Shayon Mukherjee
On Tue, 10 Sept 2024 at 09:39, Shayon Mukherjee <shayonj@gmail.com> wrote:
Adding and removing indexes is a common operation in PostgreSQL. On larger databases, however, these operations can be resource-intensive. When evaluating the performance impact of one or more indexes, dropping them might not be ideal since as a user you may want a quicker way to test their effects without fully committing to removing & adding them back again. Which can be a time taking operation on larger tables.
Proposal:
I propose adding an ALTER INDEX command that allows for enabling or disabling an index globally. This could look something like:ALTER INDEX index_name ENABLE;
ALTER INDEX index_name DISABLE;A disabled index would still receive updates and enforce constraints as usual but would not be used for queries. This allows users to assess whether an index impacts query performance before deciding to drop it entirely.
I personally think having some way to alter an index to stop it from
being used in query plans would be very useful for the reasons you
mentioned. I don't have any arguments against the syntax you've
proposed. We'd certainly have to clearly document that constraints
are still enforced. Perhaps there is some other syntax which would
self-document slightly better. I just can't think of it right now.
Implementation:
To keep this simple, I suggest toggling the indisvalid flag in pg_index during the enable/disable operation.
That's not a good idea as it would allow ALTER INDEX ... ENABLE; to be
used to make valid a failed concurrently created index. I think this
would need a new flag and everywhere in the planner would need to be
adjusted to ignore indexes when that flag is false.
Additional Considerations:
- Keeping the index up-to-date while it’s disabled seems preferable, as it avoids the need to rebuild the index if it’s re-enabled later. The alternative would be dropping and rebuilding the index upon re-enabling, which I believe would introduce additional overhead in terms of application logic & complexity.
I think the primary use case here is to assist in dropping useless
indexes in a way that can very quickly be undone if the index is more
useful than thought. If you didn't keep the index up-to-date then that
would make the feature useless for that purpose.
If we get the skip scan feature for PG18, then there's likely going to
be lots of people with indexes that they might want to consider
removing after upgrading. Maybe this is a good time to consider this
feature as it possibly won't ever be more useful than it will be after
we get skip scans.
David
Hi Shayon
Thank you for your work on this , I think it's great to have this
feature implemented ,I checked the doucment on other databases,It seems
both MySQL 8.0 and oracle supports it, sql server need to rebuild indexes
after disabled,It seems disable the index, it's equivalent to deleting
the index, except that the index's metadata is still retained:
https://docs.oracle.com/cd/E17952_01/mysql-8.0-en/invisible-indexes.html
https://learn.microsoft.com/en-us/sql/t-sql/statements/alter-index-transact-sql?view=sql-server-ver16
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/ALTER-INDEX.html
->A disabled index would still receive updates and enforce constraints as
usual but would not be used for queries. This allows users to assess ->
->whether an index impacts query performance before deciding to drop it
entirely.
MySQL 8.0 and oracle settings are not visible, index information is always
updated, I would then suggest that the statement be changed to set the
index invisible and visible.
Thanks
David Rowley <dgrowleyml@gmail.com> 于2024年9月10日周二 06:17写道:
Show quoted text
On Tue, 10 Sept 2024 at 09:39, Shayon Mukherjee <shayonj@gmail.com> wrote:
Adding and removing indexes is a common operation in PostgreSQL. On
larger databases, however, these operations can be resource-intensive. When
evaluating the performance impact of one or more indexes, dropping them
might not be ideal since as a user you may want a quicker way to test their
effects without fully committing to removing & adding them back again.
Which can be a time taking operation on larger tables.Proposal:
I propose adding an ALTER INDEX command that allows for enabling ordisabling an index globally. This could look something like:
ALTER INDEX index_name ENABLE;
ALTER INDEX index_name DISABLE;A disabled index would still receive updates and enforce constraints as
usual but would not be used for queries. This allows users to assess
whether an index impacts query performance before deciding to drop it
entirely.I personally think having some way to alter an index to stop it from
being used in query plans would be very useful for the reasons you
mentioned. I don't have any arguments against the syntax you've
proposed. We'd certainly have to clearly document that constraints
are still enforced. Perhaps there is some other syntax which would
self-document slightly better. I just can't think of it right now.Implementation:
To keep this simple, I suggest toggling the indisvalid flag in pg_indexduring the enable/disable operation.
That's not a good idea as it would allow ALTER INDEX ... ENABLE; to be
used to make valid a failed concurrently created index. I think this
would need a new flag and everywhere in the planner would need to be
adjusted to ignore indexes when that flag is false.Additional Considerations:
- Keeping the index up-to-date while it’s disabled seems preferable, asit avoids the need to rebuild the index if it’s re-enabled later. The
alternative would be dropping and rebuilding the index upon re-enabling,
which I believe would introduce additional overhead in terms of application
logic & complexity.I think the primary use case here is to assist in dropping useless
indexes in a way that can very quickly be undone if the index is more
useful than thought. If you didn't keep the index up-to-date then that
would make the feature useless for that purpose.If we get the skip scan feature for PG18, then there's likely going to
be lots of people with indexes that they might want to consider
removing after upgrading. Maybe this is a good time to consider this
feature as it possibly won't ever be more useful than it will be after
we get skip scans.David
Hi,
On Tue, Sep 10, 2024 at 10:16:34AM +1200, David Rowley wrote:
On Tue, 10 Sept 2024 at 09:39, Shayon Mukherjee <shayonj@gmail.com> wrote:
Adding and removing indexes is a common operation in PostgreSQL. On
larger databases, however, these operations can be
resource-intensive. When evaluating the performance impact of one or
more indexes, dropping them might not be ideal since as a user you
may want a quicker way to test their effects without fully
committing to removing & adding them back again. Which can be a time
taking operation on larger tables.Proposal:
I propose adding an ALTER INDEX command that allows for enabling or disabling an index globally. This could look something like:ALTER INDEX index_name ENABLE;
ALTER INDEX index_name DISABLE;A disabled index would still receive updates and enforce constraints
as usual but would not be used for queries. This allows users to
assess whether an index impacts query performance before deciding to
drop it entirely.I personally think having some way to alter an index to stop it from
being used in query plans would be very useful for the reasons you
mentioned. I don't have any arguments against the syntax you've
proposed. We'd certainly have to clearly document that constraints
are still enforced. Perhaps there is some other syntax which would
self-document slightly better. I just can't think of it right now.Implementation:
To keep this simple, I suggest toggling the indisvalid flag in
pg_index during the enable/disable operation.That's not a good idea as it would allow ALTER INDEX ... ENABLE; to be
used to make valid a failed concurrently created index. I think this
would need a new flag and everywhere in the planner would need to be
adjusted to ignore indexes when that flag is false.
How about the indislive flag instead? I haven't looked at the code, but
from the documentation ("If false, the index is in process of being
dropped, and
should be ignored for all purposes") it sounds like we made be able to
piggy-back on that instead?
Michael
On Tue, 10 Sept 2024 at 22:46, Michael Banck <mbanck@gmx.net> wrote:
How about the indislive flag instead? I haven't looked at the code, but
from the documentation ("If false, the index is in process of being
dropped, and
should be ignored for all purposes") it sounds like we made be able to
piggy-back on that instead?
Doing that could cause an UPDATE which would ordinarily not be
eligible for a HOT-update to become a HOT-update. That would cause
issues if the index is enabled again as the index wouldn't have been
updated during the UPDATE.
I don't see the big deal with adding a new flag. There's even a free
padding byte to put this flag in after indisreplident, so we don't
have to worry about using more memory.
David
+1 for the new flag as well, since it'd be nice to be able to
enable/disable indexes without having to worry about the missed updates /
having to rebuild it.
Shayon
On Tue, Sep 10, 2024 at 8:02 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 10 Sept 2024 at 22:46, Michael Banck <mbanck@gmx.net> wrote:
How about the indislive flag instead? I haven't looked at the code, but
from the documentation ("If false, the index is in process of being
dropped, and
should be ignored for all purposes") it sounds like we made be able to
piggy-back on that instead?Doing that could cause an UPDATE which would ordinarily not be
eligible for a HOT-update to become a HOT-update. That would cause
issues if the index is enabled again as the index wouldn't have been
updated during the UPDATE.I don't see the big deal with adding a new flag. There's even a free
padding byte to put this flag in after indisreplident, so we don't
have to worry about using more memory.David
--
Kind Regards,
Shayon Mukherjee
Hello,
Thank you for the detailed information and feedback David. Comments inline.
P.S Re-sending it to the mailing list, because I accidentally didn't select
reply-all on the last reply.
On Mon, Sep 9, 2024 at 6:16 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 10 Sept 2024 at 09:39, Shayon Mukherjee <shayonj@gmail.com> wrote:
Adding and removing indexes is a common operation in PostgreSQL. On
larger databases, however, these operations can be resource-intensive. When
evaluating the performance impact of one or more indexes, dropping them
might not be ideal since as a user you may want a quicker way to test their
effects without fully committing to removing & adding them back again.
Which can be a time taking operation on larger tables.Proposal:
I propose adding an ALTER INDEX command that allows for enabling ordisabling an index globally. This could look something like:
ALTER INDEX index_name ENABLE;
ALTER INDEX index_name DISABLE;A disabled index would still receive updates and enforce constraints as
usual but would not be used for queries. This allows users to assess
whether an index impacts query performance before deciding to drop it
entirely.I personally think having some way to alter an index to stop it from
being used in query plans would be very useful for the reasons you
mentioned. I don't have any arguments against the syntax you've
proposed. We'd certainly have to clearly document that constraints
are still enforced. Perhaps there is some other syntax which would
self-document slightly better. I just can't think of it right now.
Thank you and likewise. I was thinking of piggy backing off of VALID / NOT
VALID, but that might have similar issues (if not more confusion) to the
current proposed syntax. Will be sure to update the documentation.
Implementation:
To keep this simple, I suggest toggling the indisvalid flag in pg_indexduring the enable/disable operation.
That's not a good idea as it would allow ALTER INDEX ... ENABLE; to be
used to make valid a failed concurrently created index. I think this
would need a new flag and everywhere in the planner would need to be
adjusted to ignore indexes when that flag is false.
That is a great call and I wasn't thinking of the semantics with the
existing usage of concurrently created indexes.
Additional Considerations:
- Keeping the index up-to-date while it’s disabled seems preferable, asit avoids the need to rebuild the index if it’s re-enabled later. The
alternative would be dropping and rebuilding the index upon re-enabling,
which I believe would introduce additional overhead in terms of application
logic & complexity.I think the primary use case here is to assist in dropping useless
indexes in a way that can very quickly be undone if the index is more
useful than thought. If you didn't keep the index up-to-date then that
would make the feature useless for that purpose.
+1
If we get the skip scan feature for PG18, then there's likely going to
be lots of people with indexes that they might want to consider
removing after upgrading. Maybe this is a good time to consider this
feature as it possibly won't ever be more useful than it will be after
we get skip scans.David
Thank you for the feedback again, I will look into the changes required and
accordingly propose a PATCH.
--
Kind Regards,
Shayon Mukherjee
On Tue, Sep 10, 2024 at 10:16:34AM +1200, David Rowley wrote:
I think the primary use case here is to assist in dropping useless
indexes in a way that can very quickly be undone if the index is more
useful than thought. If you didn't keep the index up-to-date then that
would make the feature useless for that purpose.If we get the skip scan feature for PG18, then there's likely going to
be lots of people with indexes that they might want to consider
removing after upgrading. Maybe this is a good time to consider this
feature as it possibly won't ever be more useful than it will be after
we get skip scans.
+1, this is something I've wanted for some time. There was some past
discussion, too [0]/messages/by-id/flat/ed8c9ed7-bb5d-aaec-065b-ad4893645deb@2ndQuadrant.com.
[0]: /messages/by-id/flat/ed8c9ed7-bb5d-aaec-065b-ad4893645deb@2ndQuadrant.com
--
nathan
On Wed, 11 Sept 2024 at 03:12, Nathan Bossart <nathandbossart@gmail.com> wrote:
On Tue, Sep 10, 2024 at 10:16:34AM +1200, David Rowley wrote:
If we get the skip scan feature for PG18, then there's likely going to
be lots of people with indexes that they might want to consider
removing after upgrading. Maybe this is a good time to consider this
feature as it possibly won't ever be more useful than it will be after
we get skip scans.+1, this is something I've wanted for some time. There was some past
discussion, too [0].[0] /messages/by-id/flat/ed8c9ed7-bb5d-aaec-065b-ad4893645deb@2ndQuadrant.com
Thanks for digging that up. I'd forgotten about that. I see there was
pushback from having this last time, which is now over 6 years ago.
In the meantime, we still have nothing to make this easy for people.
I think the most important point I read in that thread is [1]/messages/by-id/20180618215635.m5vrnxdxhxytvmcm@alap3.anarazel.de. Maybe
what I mentioned in [2]/messages/by-id/CAKJS1f_L7y_BTGESp5Qd6BSRHXP0mj3x9O9C_U27GU478UwpBw@mail.gmail.com is a good workaround.
Additionally, I think there will need to be syntax in CREATE INDEX for
this. Without that pg_get_indexdef() might return SQL that does not
reflect the current state of the index. MySQL seems to use "CREATE
INDEX name ON table (col) [VISIBLE|INVISIBLE]".
David
[1]: /messages/by-id/20180618215635.m5vrnxdxhxytvmcm@alap3.anarazel.de
[2]: /messages/by-id/CAKJS1f_L7y_BTGESp5Qd6BSRHXP0mj3x9O9C_U27GU478UwpBw@mail.gmail.com
Hello,
Thank you for all the feedback and insights. Work was busy, so I didn't get
to follow up earlier.
This patch introduces the ability to enable or disable indexes using ALTER
INDEX
and CREATE INDEX commands.
Original motivation for the problem and proposal for a patch
can be found here[0]/messages/by-id/CANqtF-oXKe0M=0QOih6H+sZRjE2BWAbkW_1+9nMEAMLxUJg5jA@mail.gmail.com
This patch contains the relevant implementation details, new regression
tests and documentation.
It passes all the existing specs and the newly added regression tests. It
compiles, so the
patch can be applied for testing as well.
I have attached the patch in this email, and have also shared it on my
Github fork[1]https://github.com/shayonj/postgres/pull/1. Mostly so
that I can ensure the full CI passes.
*Implementation details:*
- New Grammar:
* ALTER INDEX ... ENABLE/DISABLE
* CREATE INDEX ... DISABLE
- Default state is enabled. Indexes are only disabled when explicitly
instructed via CREATE INDEX ... DISABLE or ALTER INDEX ... DISABLE.
- Primary Key and Unique constraint indexes are always enabled as well. The
ENABLE/DISABLE grammar is not supported for these types of indexes. They
can
be later disabled via ALTER INDEX ... ENABLE/DISABLE.
- ALTER INDEX ... ENABLE/DISABLE performs an in-place update of the
pg_index
catalog to protect against indcheckxmin.
- pg_get_indexdef() support for the new functionality and grammar. This
change is
reflected in \d output for tables and pg_dump. We show the DISABLE syntax
accordingly.
- Updated create_index.sql regression test to cover the new grammar and
verify
that disabled indexes are not used in queries.
- Modified get_index_paths() and build_index_paths() to exclude disabled
indexes from consideration during query planning.
- No changes are made to stop the index from getting rebuilt. This way we
ensure no
data miss or corruption when index is re-enabled.
- TOAST indexes are supported and enabled by default.
- REINDEX CONCURRENTLY is supported as well and the existing state of
pg_index.indisenabled
is carried over accordingly.
- catversion.h is updated with a new CATALOG_VERSION_NO to reflect change
in pg_index
schema.
- See the changes in create_index.sql to get an idea of the grammar and sql
statements.
- See the changes in create_index.out to get an idea of the catalogue
states and EXPLAIN
output to see when an index is getting used or isn't (when disabled).
I am looking forward to any and all feedback on this patch, including but
not limited to
code quality, tests, and fundamental logic.
Thank you for the reviews and feedback.
[0]: /messages/by-id/CANqtF-oXKe0M=0QOih6H+sZRjE2BWAbkW_1+9nMEAMLxUJg5jA@mail.gmail.com
/messages/by-id/CANqtF-oXKe0M=0QOih6H+sZRjE2BWAbkW_1+9nMEAMLxUJg5jA@mail.gmail.com
[1]: https://github.com/shayonj/postgres/pull/1
Best,
Shayon
On Tue, Sep 10, 2024 at 5:35 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 11 Sept 2024 at 03:12, Nathan Bossart <nathandbossart@gmail.com>
wrote:On Tue, Sep 10, 2024 at 10:16:34AM +1200, David Rowley wrote:
If we get the skip scan feature for PG18, then there's likely going to
be lots of people with indexes that they might want to consider
removing after upgrading. Maybe this is a good time to consider this
feature as it possibly won't ever be more useful than it will be after
we get skip scans.+1, this is something I've wanted for some time. There was some past
discussion, too [0].[0]
/messages/by-id/flat/ed8c9ed7-bb5d-aaec-065b-ad4893645deb@2ndQuadrant.com
Thanks for digging that up. I'd forgotten about that. I see there was
pushback from having this last time, which is now over 6 years ago.
In the meantime, we still have nothing to make this easy for people.I think the most important point I read in that thread is [1]. Maybe
what I mentioned in [2] is a good workaround.Additionally, I think there will need to be syntax in CREATE INDEX for
this. Without that pg_get_indexdef() might return SQL that does not
reflect the current state of the index. MySQL seems to use "CREATE
INDEX name ON table (col) [VISIBLE|INVISIBLE]".David
[1]
/messages/by-id/20180618215635.m5vrnxdxhxytvmcm@alap3.anarazel.de
[2]
/messages/by-id/CAKJS1f_L7y_BTGESp5Qd6BSRHXP0mj3x9O9C_U27GU478UwpBw@mail.gmail.com
--
Kind Regards,
Shayon Mukherjee
Attachments:
0001-Introduce-the-ability-to-enable-disable-indexes.patchapplication/octet-stream; name=0001-Introduce-the-ability-to-enable-disable-indexes.patchDownload+672-25
Hello,
I realized there were some white spaces in the diff and a compiler warning
error from CI, so I have fixed those and the updated patch with v2 is now
attached.
Shayon
On Sun, Sep 22, 2024 at 1:42 PM Shayon Mukherjee <shayonj@gmail.com> wrote:
Hello,
Thank you for all the feedback and insights. Work was busy, so I didn't
get to follow up earlier.This patch introduces the ability to enable or disable indexes using ALTER
INDEX
and CREATE INDEX commands.Original motivation for the problem and proposal for a patch
can be found here[0]This patch contains the relevant implementation details, new regression
tests and documentation.
It passes all the existing specs and the newly added regression tests. It
compiles, so the
patch can be applied for testing as well.I have attached the patch in this email, and have also shared it on my
Github fork[1]. Mostly so
that I can ensure the full CI passes.*Implementation details:*
- New Grammar:
* ALTER INDEX ... ENABLE/DISABLE
* CREATE INDEX ... DISABLE- Default state is enabled. Indexes are only disabled when explicitly
instructed via CREATE INDEX ... DISABLE or ALTER INDEX ... DISABLE.- Primary Key and Unique constraint indexes are always enabled as well.
The
ENABLE/DISABLE grammar is not supported for these types of indexes. They
can
be later disabled via ALTER INDEX ... ENABLE/DISABLE.- ALTER INDEX ... ENABLE/DISABLE performs an in-place update of the
pg_index
catalog to protect against indcheckxmin.- pg_get_indexdef() support for the new functionality and grammar. This
change is
reflected in \d output for tables and pg_dump. We show the DISABLE
syntax accordingly.- Updated create_index.sql regression test to cover the new grammar and
verify
that disabled indexes are not used in queries.- Modified get_index_paths() and build_index_paths() to exclude disabled
indexes from consideration during query planning.- No changes are made to stop the index from getting rebuilt. This way we
ensure no
data miss or corruption when index is re-enabled.- TOAST indexes are supported and enabled by default.
- REINDEX CONCURRENTLY is supported as well and the existing state of
pg_index.indisenabled
is carried over accordingly.- catversion.h is updated with a new CATALOG_VERSION_NO to reflect change
in pg_index
schema.- See the changes in create_index.sql to get an idea of the grammar and
sql statements.- See the changes in create_index.out to get an idea of the catalogue
states and EXPLAIN
output to see when an index is getting used or isn't (when disabled).I am looking forward to any and all feedback on this patch, including but
not limited to
code quality, tests, and fundamental logic.Thank you for the reviews and feedback.
[0]
/messages/by-id/CANqtF-oXKe0M=0QOih6H+sZRjE2BWAbkW_1+9nMEAMLxUJg5jA@mail.gmail.com
[1] https://github.com/shayonj/postgres/pull/1Best,
ShayonOn Tue, Sep 10, 2024 at 5:35 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 11 Sept 2024 at 03:12, Nathan Bossart <nathandbossart@gmail.com>
wrote:On Tue, Sep 10, 2024 at 10:16:34AM +1200, David Rowley wrote:
If we get the skip scan feature for PG18, then there's likely going to
be lots of people with indexes that they might want to consider
removing after upgrading. Maybe this is a good time to consider this
feature as it possibly won't ever be more useful than it will be after
we get skip scans.+1, this is something I've wanted for some time. There was some past
discussion, too [0].[0]
/messages/by-id/flat/ed8c9ed7-bb5d-aaec-065b-ad4893645deb@2ndQuadrant.com
Thanks for digging that up. I'd forgotten about that. I see there was
pushback from having this last time, which is now over 6 years ago.
In the meantime, we still have nothing to make this easy for people.I think the most important point I read in that thread is [1]. Maybe
what I mentioned in [2] is a good workaround.Additionally, I think there will need to be syntax in CREATE INDEX for
this. Without that pg_get_indexdef() might return SQL that does not
reflect the current state of the index. MySQL seems to use "CREATE
INDEX name ON table (col) [VISIBLE|INVISIBLE]".David
[1]
/messages/by-id/20180618215635.m5vrnxdxhxytvmcm@alap3.anarazel.de
[2]
/messages/by-id/CAKJS1f_L7y_BTGESp5Qd6BSRHXP0mj3x9O9C_U27GU478UwpBw@mail.gmail.com--
Kind Regards,
Shayon Mukherjee
--
Kind Regards,
Shayon Mukherjee
Attachments:
v2-0001-Introduce-the-ability-to-enable-disable-indexes.patchapplication/octet-stream; name=v2-0001-Introduce-the-ability-to-enable-disable-indexes.patchDownload+671-12
On Mon, 23 Sept 2024 at 05:43, Shayon Mukherjee <shayonj@gmail.com> wrote:
- Modified get_index_paths() and build_index_paths() to exclude disabled
indexes from consideration during query planning.
There are quite a large number of other places you also need to modify.
Here are 2 places where the index should be ignored but isn't:
1. expression indexes seem to still be used for statistical estimations:
create table b as select generate_series(1,1000)b;
create index on b((b%10));
analyze b;
explain select distinct b%10 from b;
-- HashAggregate (cost=23.00..23.12 rows=10 width=4)
alter index b_expr_idx disable;
explain select distinct b%10 from b;
-- HashAggregate (cost=23.00..23.12 rows=10 width=4) <-- should be 1000 rows
drop index b_expr_idx;
explain select distinct b%10 from b;
-- HashAggregate (cost=23.00..35.50 rows=1000 width=4)
2. Indexes seem to still be used for join removals.
create table c (c int primary key);
explain select c1.* from c c1 left join c c2 on c1.c=c2.c; --
correctly removes join.
alter index c_pkey disable;
explain select c1.* from c c1 left join c c2 on c1.c=c2.c; -- should
not remove join.
Please carefully look over all places that RelOptInfo.indexlist is
looked at and consider skipping disabled indexes. Please also take
time to find SQL that exercises each of those places so you can verify
that the behaviour is correct after your change. This is also a good
way to learn exactly all cases where indexes are used. Using this
method would have led you to find places like
rel_supports_distinctness(), where you should be skipping disabled
indexes.
The planner should not be making use of disabled indexes for any
optimisations at all.
- catversion.h is updated with a new CATALOG_VERSION_NO to reflect change in pg_index
schema.
Please leave that up to the committer. Patch authors doing this just
results in the patch no longer applying as soon as someone commits a
version bump.
Also, please get rid of these notices. The command tag serves that
purpose. It's not interesting that the index is already disabled.
# alter index a_pkey disable;
NOTICE: index "a_pkey" is now disabled
ALTER INDEX
# alter index a_pkey disable;
NOTICE: index "a_pkey" is already disabled
ALTER INDEX
I've only given the code a very quick glance. I don't quite understand
why you're checking the index is enabled in create_index_paths() and
get_index_paths(). I think the check should be done only in
create_index_paths(). Primarily, you'll find code such as "if
(index->indpred != NIL && !index->predOK)" in the locations you need
to consider skipping the disabled index. I think your new code should
be located very close to those places or perhaps within the same if
condition unless it makes it overly complex for the human reader.
I think the documents should also mention that disabling an index is a
useful way to verify an index is not being used before dropping it as
the index can be enabled again at the first sign that performance has
been effected. (It might also be good to mention that checking
pg_stat_user_indexes.idx_scan should be the first port of call when
checking for unused indexes)
David
Hi David,
Thank you so much for the review and pointers. I totally missed expression indexes. I am going to do another proper pass along with your feedback and follow up with an updated patch and any questions.
Excited to be learning so much about the internals.
Shayon
Show quoted text
On Sep 22, 2024, at 6:44 PM, David Rowley <dgrowleyml@gmail.com> wrote:
On Mon, 23 Sept 2024 at 05:43, Shayon Mukherjee <shayonj@gmail.com> wrote:
- Modified get_index_paths() and build_index_paths() to exclude disabled
indexes from consideration during query planning.There are quite a large number of other places you also need to modify.
Here are 2 places where the index should be ignored but isn't:
1. expression indexes seem to still be used for statistical estimations:
create table b as select generate_series(1,1000)b;
create index on b((b%10));
analyze b;
explain select distinct b%10 from b;
-- HashAggregate (cost=23.00..23.12 rows=10 width=4)alter index b_expr_idx disable;
explain select distinct b%10 from b;
-- HashAggregate (cost=23.00..23.12 rows=10 width=4) <-- should be 1000 rowsdrop index b_expr_idx;
explain select distinct b%10 from b;
-- HashAggregate (cost=23.00..35.50 rows=1000 width=4)2. Indexes seem to still be used for join removals.
create table c (c int primary key);
explain select c1.* from c c1 left join c c2 on c1.c=c2.c; --
correctly removes join.
alter index c_pkey disable;
explain select c1.* from c c1 left join c c2 on c1.c=c2.c; -- should
not remove join.Please carefully look over all places that RelOptInfo.indexlist is
looked at and consider skipping disabled indexes. Please also take
time to find SQL that exercises each of those places so you can verify
that the behaviour is correct after your change. This is also a good
way to learn exactly all cases where indexes are used. Using this
method would have led you to find places like
rel_supports_distinctness(), where you should be skipping disabled
indexes.The planner should not be making use of disabled indexes for any
optimisations at all.- catversion.h is updated with a new CATALOG_VERSION_NO to reflect change in pg_index
schema.Please leave that up to the committer. Patch authors doing this just
results in the patch no longer applying as soon as someone commits a
version bump.Also, please get rid of these notices. The command tag serves that
purpose. It's not interesting that the index is already disabled.# alter index a_pkey disable;
NOTICE: index "a_pkey" is now disabled
ALTER INDEX
# alter index a_pkey disable;
NOTICE: index "a_pkey" is already disabled
ALTER INDEXI've only given the code a very quick glance. I don't quite understand
why you're checking the index is enabled in create_index_paths() and
get_index_paths(). I think the check should be done only in
create_index_paths(). Primarily, you'll find code such as "if
(index->indpred != NIL && !index->predOK)" in the locations you need
to consider skipping the disabled index. I think your new code should
be located very close to those places or perhaps within the same if
condition unless it makes it overly complex for the human reader.I think the documents should also mention that disabling an index is a
useful way to verify an index is not being used before dropping it as
the index can be enabled again at the first sign that performance has
been effected. (It might also be good to mention that checking
pg_stat_user_indexes.idx_scan should be the first port of call when
checking for unused indexes)David
On 09.09.24 23:38, Shayon Mukherjee wrote:
*Problem*:
Adding and removing indexes is a common operation in PostgreSQL. On
larger databases, however, these operations can be resource-intensive.
When evaluating the performance impact of one or more indexes, dropping
them might not be ideal since as a user you may want a quicker way to
test their effects without fully committing to removing & adding them
back again. Which can be a time taking operation on larger tables.*Proposal*:
I propose adding an ALTER INDEX command that allows for enabling or
disabling an index globally. This could look something like:ALTER INDEX index_name ENABLE;
ALTER INDEX index_name DISABLE;A disabled index would still receive updates and enforce constraints as
usual but would not be used for queries. This allows users to assess
whether an index impacts query performance before deciding to drop it
entirely.
I think a better approach would be to make the list of disabled indexes
a GUC setting, which would then internally have an effect similar to
enable_indexscan, meaning it would make the listed indexes unattractive
to the planner.
This seems better than the proposed DDL command, because you'd be able
to use this per-session, instead of forcing a global state, and even
unprivileged users could use it.
(I think we have had proposals like this before, but I can't find the
discussion I'm thinking of right now.)
That's a good point.
+1 for the idea of the GUC setting, especially since, as you mentioned, it allows unprivileged users to access it and being per-session..
I am happy to draft a patch for this as well. I think I have a working idea so far of where the necessary checks might go. However if you don’t mind, can you elaborate further on how the effect would be similar to enable_indexscan?
I was thinking we could introduce a new GUC option called `disabled_indexes` and perform a check against in all places for each index being considered with its OID via get_relname_relid through a helper function in the various places we need to prompt the planner to not use the index (like in indxpath.c as an example).
Curious to learn if you have a different approach in mind perhaps?
Thank you,
Shayon
Show quoted text
On Sep 23, 2024, at 11:14 AM, Peter Eisentraut <peter@eisentraut.org> wrote:
On 09.09.24 23:38, Shayon Mukherjee wrote:
*Problem*:
Adding and removing indexes is a common operation in PostgreSQL. On larger databases, however, these operations can be resource-intensive. When evaluating the performance impact of one or more indexes, dropping them might not be ideal since as a user you may want a quicker way to test their effects without fully committing to removing & adding them back again. Which can be a time taking operation on larger tables.
*Proposal*:
I propose adding an ALTER INDEX command that allows for enabling or disabling an index globally. This could look something like:
ALTER INDEX index_name ENABLE;
ALTER INDEX index_name DISABLE;
A disabled index would still receive updates and enforce constraints as usual but would not be used for queries. This allows users to assess whether an index impacts query performance before deciding to drop it entirely.I think a better approach would be to make the list of disabled indexes a GUC setting, which would then internally have an effect similar to enable_indexscan, meaning it would make the listed indexes unattractive to the planner.
This seems better than the proposed DDL command, because you'd be able to use this per-session, instead of forcing a global state, and even unprivileged users could use it.
(I think we have had proposals like this before, but I can't find the discussion I'm thinking of right now.)
I found an old thread here [0]https://postgrespro.com/list/id/20151212.112536.1628974191058745674.t-ishii@sraoss.co.jp.
Also, a question: If we go with the GUC approach, how do we expect `pg_get_indexdef` to behave?
I suppose it would behave no differently than it otherwise would, because there's no new SQL grammar to support and, given its GUC status, it seems reasonable that `pg_get_indexdef` doesn’t reflect whether an index is enabled or not.
If so, then I wonder if using a dedicated `ALTER` command and keeping the state in `pg_index` would be better for consistency's sake?
[0]: https://postgrespro.com/list/id/20151212.112536.1628974191058745674.t-ishii@sraoss.co.jp
Thank you
Shayon
Show quoted text
On Sep 23, 2024, at 4:51 PM, Shayon Mukherjee <shayonj@gmail.com> wrote:
That's a good point.
+1 for the idea of the GUC setting, especially since, as you mentioned, it allows unprivileged users to access it and being per-session..
I am happy to draft a patch for this as well. I think I have a working idea so far of where the necessary checks might go. However if you don’t mind, can you elaborate further on how the effect would be similar to enable_indexscan?
I was thinking we could introduce a new GUC option called `disabled_indexes` and perform a check against in all places for each index being considered with its OID via get_relname_relid through a helper function in the various places we need to prompt the planner to not use the index (like in indxpath.c as an example).
Curious to learn if you have a different approach in mind perhaps?
Thank you,
ShayonOn Sep 23, 2024, at 11:14 AM, Peter Eisentraut <peter@eisentraut.org> wrote:
On 09.09.24 23:38, Shayon Mukherjee wrote:
*Problem*:
Adding and removing indexes is a common operation in PostgreSQL. On larger databases, however, these operations can be resource-intensive. When evaluating the performance impact of one or more indexes, dropping them might not be ideal since as a user you may want a quicker way to test their effects without fully committing to removing & adding them back again. Which can be a time taking operation on larger tables.
*Proposal*:
I propose adding an ALTER INDEX command that allows for enabling or disabling an index globally. This could look something like:
ALTER INDEX index_name ENABLE;
ALTER INDEX index_name DISABLE;
A disabled index would still receive updates and enforce constraints as usual but would not be used for queries. This allows users to assess whether an index impacts query performance before deciding to drop it entirely.I think a better approach would be to make the list of disabled indexes a GUC setting, which would then internally have an effect similar to enable_indexscan, meaning it would make the listed indexes unattractive to the planner.
This seems better than the proposed DDL command, because you'd be able to use this per-session, instead of forcing a global state, and even unprivileged users could use it.
(I think we have had proposals like this before, but I can't find the discussion I'm thinking of right now.)
On Tue, 24 Sept 2024 at 03:14, Peter Eisentraut <peter@eisentraut.org> wrote:
On 09.09.24 23:38, Shayon Mukherjee wrote:
ALTER INDEX index_name ENABLE;
ALTER INDEX index_name DISABLE;I think a better approach would be to make the list of disabled indexes
a GUC setting, which would then internally have an effect similar to
enable_indexscan, meaning it would make the listed indexes unattractive
to the planner.
I understand the last discussion went down that route too. For me, it
seems strange that adding some global variable is seen as cleaner than
storing the property in the same location as all the other index
properties.
How would you ensure no cached plans are still using the index after
changing the GUC?
This seems better than the proposed DDL command, because you'd be able
to use this per-session, instead of forcing a global state, and even
unprivileged users could use it.
That's true.
(I think we have had proposals like this before, but I can't find the
discussion I'm thinking of right now.)
I think it's the one that was already linked by Nathan. [1]/messages/by-id/ed8c9ed7-bb5d-aaec-065b-ad4893645deb@2ndQuadrant.com? The GUC
seems to have been first suggested on the same thread in [2]/messages/by-id/29800.1529359024@sss.pgh.pa.us.
David
[1]: /messages/by-id/ed8c9ed7-bb5d-aaec-065b-ad4893645deb@2ndQuadrant.com
[2]: /messages/by-id/29800.1529359024@sss.pgh.pa.us
On Mon, Sep 23, 2024 at 8:31 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 24 Sept 2024 at 03:14, Peter Eisentraut <peter@eisentraut.org>
wrote:On 09.09.24 23:38, Shayon Mukherjee wrote:
ALTER INDEX index_name ENABLE;
ALTER INDEX index_name DISABLE;I think a better approach would be to make the list of disabled indexes
a GUC setting, which would then internally have an effect similar to
enable_indexscan, meaning it would make the listed indexes unattractive
to the planner.I understand the last discussion went down that route too. For me, it
seems strange that adding some global variable is seen as cleaner than
storing the property in the same location as all the other index
properties.
That was my first instinct as well. Although, being able to control this
setting on a per session level and as an unprivileged user is somewhat
attractive.
How would you ensure no cached plans are still using the index after
changing the GUC?
Could we call ResetPlanCache() to invalidate all plan caches from the
assign_ hook on GUC when it's set (and doesn't match the old value).
Something like this (assuming the GUC is called `disabled_indexes`)
void
assign_disabled_indexes(const char *newval, void *extra)
{
if (disabled_indexes != newval)
ResetPlanCache();
}
A bit heavy-handed, but perhaps it's OK, since it's not meant to be used
frequently also ?
This seems better than the proposed DDL command, because you'd be able
to use this per-session, instead of forcing a global state, and even
unprivileged users could use it.That's true.
(I think we have had proposals like this before, but I can't find the
discussion I'm thinking of right now.)I think it's the one that was already linked by Nathan. [1]? The GUC
seems to have been first suggested on the same thread in [2].David
[1]
/messages/by-id/ed8c9ed7-bb5d-aaec-065b-ad4893645deb@2ndQuadrant.com
[2] /messages/by-id/29800.1529359024@sss.pgh.pa.us
Shayon
If one of the use cases is soft-dropping indexes, would a GUC approach
still support that? ALTER TABLE?
Hello,
Regarding GUC implementation for index disabling, I was imagining something
like the attached PATCH. The patch compiles and can be applied for testing.
It's not meant to be production ready, but I am sharing it as a way to get
a sense of the nuts and bolts. It requires more proper test cases and docs,
etc. Example towards the end of the email.
That said, I am still quite torn between GUC setting or having a dedicated
ALTER grammar. My additional thoughts which is mostly a summary of what
David and Peter have already very nicely raised earlier are:
- GUC allows a non-privileged user to disable one or more indexes per
session.
- If we think of the task of disabling indexes temporarily (without
stopping any updates to the index), then it feels more in the territory of
query tuning than index maintenance. In which case, a GUC setting makes
more sense and sits well with others in the team like enable_indexscan,
enable_indexonlyscan and so on.
- At the same time, as David pointed out earlier, GUC is also a global
setting and perhaps storing the state of whether or not an index is being
used is perhaps better situated along with other index properties in
pg_index.
- One of my original motivations for the proposal was also that we can
disable an index for _all_ sessions quickly without it impacting index
build and turn it back on quickly as well. To do so with GUC, we would need
to do something like the following, if I am not mistaken, in which case
that is not something an unprivileged user may be able to perform, so just
calling it out.
ALTER USER example_user SET disabled_indexes = 'idx_foo_bar';
- For an ALTER statement, I think an ALTER INDEX makes more sense than
ALTER TABLE, especially since we have the existing ALTER INDEX grammar and
functionality. But let me know if I am missing something here.
- Resetting plan cache could still be an open question for GUC. I was
wondering if we can reset the plan cache local to the session for GUC (like
the one in the PATCH attached) and if that is enough? This concern doesn't
apply with managing property in pg_index.
- With a GUC attribute, the state of an index being enabled/disabled won't
be captured in pg_get_indexdef(), and that is likely OK, but maybe that
would need to be made explicit through docs.
Example 1
CREATE TABLE b AS SELECT generate_series(1,1000) AS b;
CREATE INDEX ON b((b%10));
ANALYZE b;
EXPLAIN SELECT DISTINCT b%10 FROM b;
SET disabled_indexes = 'b_expr_idx';
EXPLAIN SELECT DISTINCT b%10 FROM b; -- HashAggregate rows=10000
Example 2
CREATE TABLE disabled_index_test(id int PRIMARY KEY, data text);
INSERT INTO disabled_index_test SELECT g, 'data ' || g FROM
generate_series(1, 1000) g;
CREATE INDEX disabled_index_idx1 ON disabled_index_test(data);
EXPLAIN (COSTS OFF) SELECT * FROM disabled_index_test WHERE data = 'data
500';
SET disabled_indexes = 'b_expr_idx, disabled_index_idx1';
EXPLAIN SELECT * FROM disabled_index_test WHERE data = 'data 500'; -- no
index is used
Wrapping up...
I am sure there are things I am missing or unintentionally overlooking.
Since this would be a nice feature to have, I'd love some guidance on which
approach seems like a good next step to take. I am happy to work
accordingly on the patch.
Thank you
Shayon
On Tue, Sep 24, 2024 at 12:38 AM Maciek Sakrejda <m.sakrejda@gmail.com>
wrote:
If one of the use cases is soft-dropping indexes, would a GUC approach
still support that? ALTER TABLE?
--
Kind Regards,
Shayon Mukherjee