Re: Bug with indexes on whole-row expressions - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Bug with indexes on whole-row expressions
Date
Msg-id CAExHW5sisnxbU_uXu+-7G3gKQ5oOJpDVn50=8+mcrdoa3+1peg@mail.gmail.com
Whole thread Raw
In response to Bug with indexes on whole-row expressions  (Laurenz Albe <laurenz.albe@cybertec.at>)
List pgsql-hackers
On Mon, Jun 29, 2020 at 3:43 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> CREATE TABLE boom (a integer, b integer);
>
> -- index on whole-row expression
> CREATE UNIQUE INDEX ON boom ((boom));
>
> INSERT INTO boom VALUES
>    (1, 2),
>    (1, 3);
>
> ALTER TABLE boom DROP b;
>
> TABLE boom;
>
>  a
> ---
>  1
>  1
> (2 rows)
>
> REINDEX TABLE boom;
> ERROR:  could not create unique index "boom_boom_idx"
> DETAIL:  Key ((boom.*))=((1)) is duplicated.
>
> The problem here is that there *is* a "pg_depend" entry for the
> index, but it only depends on the whole table, not on specific columns.
>
> I have been thinking what would be the right approach to fix this:
>
> 1. Don't fix it, because it is an artificial corner case.
>    (But I can imagine someone trying to exclude duplicate rows with
>    a unique index.)
>
> 2. Add code that checks if there is an index with a whole-row reference
>    in the definition before dropping a column.
>    That feels like a wart for a special case.

Do we need to do something about adding a new column or modifying an
existing one. Esp. if the later changes the uniqueness of row
expression.
>
> 3. Forbid indexes on whole-row expressions.
>    After all, you can do the same with an index on all the columns.
>    That would open the question what to do about upgrading old databases
>    that might have such indexes today.

This would be the best case. However, a whole row expression is not
necessarily the same as "all columns" at that moment. A whole row
expression means whatever column there are at any given point in time.
So creating an index on whole row expression is not the same as an
index on all the columns. However, I can't imagine a case where we
want an index on "whole row expression". An index containing all the
columns would always suffice and will be deterministic as well.

So this option looks best.

>
> 4. Add dependencies on all columns whenever a whole-row expression
>    is used in an index.
>    That would need special processing for pg_upgrade.

Again, that dependency needs to be maintained as and when the columns
are added and dropped.


--
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: "Andrey V. Lepikhov"
Date:
Subject: Re: POC and rebased patch for CSN based snapshots
Next
From: Ashutosh Bapat
Date:
Subject: Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away