Thread: Disallow redundant indexes

Disallow redundant indexes

From
Japin Li
Date:
Hi, hackers

Currently, PostgreSQL permits creating multiple indexes on the same columns
in the same order for a table, potentially leading to redundant indexes.
For example:

    CREATE INDEX ON t(id);
    CREATE INDEX ON t(id);

While permitted, this leads to:

- Increased storage consumption
- Performance degradation (for data modification)
- Maintenance overhead
- Potential query optimizer confusion

Oracle prevents this with an error like ORA-01408: such column list already
indexed [1].

I propose that PostgreSQL prevent redundant index creation by:

- Checking for identical existing indexes during CREATE INDEX.
- Failing with an error (like Oracle's ORA-01408) if a duplicate is found.
- Providing a GUC parameter (allow_redundant_indexes) to control this.

This change would:

- Prevent accidental redundancy
- Optimize storage
- Improve performance
- Simplify maintenance
- Enhance efficiency and user flexibility

I’d love to hear your feedback or suggestions for improvement.

[1] https://docs.oracle.com/en/error-help/db/ora-01408/?r=19c

--
Regrads,
Japin Li



Re: Disallow redundant indexes

From
David Rowley
Date:
On Thu, 24 Apr 2025 at 21:27, Japin Li <japinli@hotmail.com> wrote:
> I propose that PostgreSQL prevent redundant index creation by:
>
> - Checking for identical existing indexes during CREATE INDEX.
> - Failing with an error (like Oracle's ORA-01408) if a duplicate is found.
> - Providing a GUC parameter (allow_redundant_indexes) to control this.

> I’d love to hear your feedback or suggestions for improvement.

Sounds like pointless nannying to me. Also, we don't want GUCs that
change the way things work. It's too annoying for application
developers. I don't have a complete picture of the history of it, but
I believe we still have quite a mess as a result of some GUC-dependent
behaviour. Check the mess around backslash_quote,
standard_conforming_strings and escape_string_warning.

In any case, who are we to define what a duplicate index is? Would
creating a hash index on a column that's already part of a btree index
be disallowed? How about creating an index on (col1) when there's
already an index on (col1,col2)? One person might think that's a waste
of space, and someone else might think it's useful to have the (col1)
index to support faster index-only scans when only that column is
needed.

I get that we have REINDEX CONCURRENTLY now and there's less of a need
to recreate another duplicate index CONCURRENTLY before dropping the
old one, but aren't there still reasons to create a duplicate index
concurrently, e.g to move an index to another tablespace without
blocking queries.

If you want to do this, then maybe just write some query that looks at
pg_index to find duplicates and stick it on the wiki. Anyone who cares
enough about this can run that to check. Oh wait, someone did that
already, see [1].

David

[1] https://wiki.postgresql.org/wiki/Index_Maintenance#Duplicate_indexes



Re: Disallow redundant indexes

From
Greg Sabino Mullane
Date:
On Thu, Apr 24, 2025 at 7:31 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Thu, 24 Apr 2025 at 21:27, Japin Li <japinli@hotmail.com> wrote:
> I propose that PostgreSQL prevent redundant index creation by:
 
In any case, who are we to define what a duplicate index is?

I think this part is easier than you make it sound: everything (except the name) is exactly the same as an existing index. That's the 99% case we are trying to catch here.

I've had this idea before, and even wrote a quick POC at one point, but I had it simply throw a warning rather than an error. That avoids the need for any GUC, which I agree is not a good idea. And it still allows people to create a duplicate index if they really want to.

Cheers,
Greg

--
Enterprise Postgres Software Products & Tech Support

Re: Disallow redundant indexes

From
Japin Li
Date:
Hi, Greg and David

Thank you for your feedback.

On Thu, 24 Apr 2025 at 08:26, Greg Sabino Mullane <htamfids@gmail.com> wrote:
> On Thu, Apr 24, 2025 at 7:31 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
>  On Thu, 24 Apr 2025 at 21:27, Japin Li <japinli@hotmail.com> wrote:
>  > I propose that PostgreSQL prevent redundant index creation by:
>
>
>  In any case, who are we to define what a duplicate index is?
>

You're absolutely right. Defining a duplicate index is indeed simpler than
I initially described.

> I think this part is easier than you make it sound: everything (except the name) is exactly the same as an existing
> index. That's the 99% case we are trying to catch here.
>

As Greg pointed out, if everything except the name is identical to an existing
index, it should be considered a duplicate in most cases (the 99% case).
This is precisely the scenario I'm aiming to prevent.

> I've had this idea before, and even wrote a quick POC at one point, but I had it simply throw a warning rather than
an
> error. That avoids the need for any GUC, which I agree is not a good idea. And it still allows people to create a
> duplicate index if they really want to.
>

I also appreciate your suggestion regarding the GUC parameter.  You've
convinced me that a warning might be a more appropriate approach.  A warning
would still alert users to the potential issue of creating a redundant index,
while allowing them to proceed if they have a specific reason to do so.

--
Regrads,
Japin Li



Re: Disallow redundant indexes

From
Corey Huinker
Date:
> I've had this idea before, and even wrote a quick POC at one point, but I had it simply throw a warning rather than an
> error. That avoids the need for any GUC, which I agree is not a good idea. And it still allows people to create a
> duplicate index if they really want to.
>

I also appreciate your suggestion regarding the GUC parameter.  You've
convinced me that a warning might be a more appropriate approach.  A warning
would still alert users to the potential issue of creating a redundant index,
while allowing them to proceed if they have a specific reason to do so.

+1 to a warning.

One reason they might want to create a duplicate index is to reduce index bloat, especially on partial indexes. Granted, we've had REINDEX CONCURRENTLY since v12, but some reindexing workflows are older than that, and I've seen a few that put a "freshness date" into the index name as a hint to future DBAs about the urgency of the next reindexing. A canceled reindex concurrently could in the past (and maybe still can) leave an invalid index of indeterminate name, and therefore harder to clean up, so that's a possible reason to prefer duplicate-then-swap index creation over reindex.


 

Re: Disallow redundant indexes

From
Sami Imseih
Date:
> > > I've had this idea before, and even wrote a quick POC at one point, but I had it simply throw a warning rather
thanan
 
> > > error. That avoids the need for any GUC, which I agree is not a good idea. And it still allows people to create
a
> > > duplicate index if they really want to.
> > >
> >
> > I also appreciate your suggestion regarding the GUC parameter.  You've
> > convinced me that a warning might be a more appropriate approach.  A warning
> > would still alert users to the potential issue of creating a redundant index,
> > while allowing them to proceed if they have a specific reason to do so.

> +1 to a warning.

If I had a dollar for every time I asked a user to check for duplicate
indexes :) So
I think it's a good idea to provide some way to detect these ( besides
a query in the
Wiki ), but I don't think we should prevent it.

While a WARNING will be a good to have, it could easily go unnoticed,
but it's still good to have.

I also think we should either provide a psql shortcut to detect these indexes or
to add annotation to an index in the \d <table> command, or perhaps both of
these things.


--
Sami Imseih
Amazon Web Services (AWS)



Re: Disallow redundant indexes

From
Nathan Bossart
Date:
On Mon, Apr 28, 2025 at 01:12:18PM -0500, Sami Imseih wrote:
> If I had a dollar for every time I asked a user to check for duplicate
> indexes :) So
> I think it's a good idea to provide some way to detect these ( besides
> a query in the
> Wiki ), but I don't think we should prevent it.
> 
> While a WARNING will be a good to have, it could easily go unnoticed,
> but it's still good to have.

I don't doubt that this is a problem in the field, but I'm not sure I'm
convinced it's worthwhile to emit a warning that only appears for exact
duplicates and is very easily missed, not to mention that the user could be
intentionally creating the index.

-- 
nathan



Re: Disallow redundant indexes

From
"Euler Taveira"
Date:
On Mon, Apr 28, 2025, at 7:17 PM, Nathan Bossart wrote:
On Mon, Apr 28, 2025 at 01:12:18PM -0500, Sami Imseih wrote:
> If I had a dollar for every time I asked a user to check for duplicate
> indexes :) So
> I think it's a good idea to provide some way to detect these ( besides
> a query in the
> Wiki ), but I don't think we should prevent it.
> While a WARNING will be a good to have, it could easily go unnoticed,
> but it's still good to have.

I don't doubt that this is a problem in the field, but I'm not sure I'm
convinced it's worthwhile to emit a warning that only appears for exact
duplicates and is very easily missed, not to mention that the user could be
intentionally creating the index.

I don't like to idea of a new GUC to control an SQL command. The WARNING seems
fragile because it doesn't solve the problem. I also don't want to disallow
duplicate indexes; it can break old routines. If we want to do something about
it, I would say let's add a psql meta command or even a new view to show
duplicate indexes. I do think that official maintenance queries are a good to
step forward to make life easier for DBAs.


--
Euler Taveira

Re: Disallow redundant indexes

From
Sami Imseih
Date:
> On Mon, Apr 28, 2025 at 01:12:18PM -0500, Sami Imseih wrote:
> > If I had a dollar for every time I asked a user to check for duplicate
> > indexes :) So
> > I think it's a good idea to provide some way to detect these ( besides
> > a query in the
> > Wiki ), but I don't think we should prevent it.
> >
> > While a WARNING will be a good to have, it could easily go unnoticed,
> > but it's still good to have.
>
> I don't doubt that this is a problem in the field, but I'm not sure I'm
> convinced it's worthwhile to emit a warning that only appears for exact
> duplicates and is very easily missed, not to mention that the user could be
> intentionally creating the index.
>
>
> I don't like to idea of a new GUC to control an SQL command. The WARNING seems
> fragile because it doesn't solve the problem. I also don't want to disallow
> duplicate indexes; it can break old routines. If we want to do something about
> it, I would say let's add a psql meta command or even a new view to show
> duplicate indexes. I do think that official maintenance queries are a good to
> step forward to make life easier for DBAs.

I do think adding an annotation to the index in the \d[+] psql
meta-command will be
very useful when there are indexes that are the same except for the
name (and do
not enforce constraints?). This will at least help DBAs quickly spot
such indexes in
their tables.

--
Sami