Thread: Bogus collation version recording in recordMultipleDependencies

Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

14 April 2021, 17:18:07

I noticed some broken-looking logic in recordMultipleDependencies
concerning how it records collation versions.  It was a bit harder
than I expected to demonstrate the bugs, but I eventually succeeded
with

u8=# create function foo(varchar) returns bool language sql return false;
CREATE FUNCTION
u8=# create collation mycoll from "en_US";
CREATE COLLATION
u8=# CREATE DOMAIN d4 AS character varying(3) COLLATE "aa_DJ"
    CONSTRAINT yes_or_no_check CHECK (value = 'YES' collate mycoll or foo(value));
CREATE DOMAIN
u8=# select objid, pg_describe_object(classid,objid,objsubid) as obj,
pg_describe_object(refclassid,refobjid,refobjsubid)as ref, deptype, refobjversion from pg_depend where objid =
'd4'::regtype;
 objid |   obj   |        ref        | deptype | refobjversion
-------+---------+-------------------+---------+---------------
 37421 | type d4 | schema public     | n       |
 37421 | type d4 | collation "aa_DJ" | n       |
(2 rows)

u8=# select objid, pg_describe_object(classid,objid,objsubid) as obj,
pg_describe_object(refclassid,refobjid,refobjsubid)as ref, deptype, refobjversion from pg_depend where refobjid =
'd4'::regtype;
 objid |            obj             |   ref   | deptype | refobjversion
-------+----------------------------+---------+---------+---------------
 37420 | type d4[]                  | type d4 | i       |
 37422 | constraint yes_or_no_check | type d4 | a       |
(2 rows)

u8=# select objid, pg_describe_object(classid,objid,objsubid) as obj,
pg_describe_object(refclassid,refobjid,refobjsubid)as ref, deptype, refobjversion from pg_depend where objid = 37422; 
 objid |            obj             |               ref               | deptype | refobjversion
-------+----------------------------+---------------------------------+---------+---------------
 37422 | constraint yes_or_no_check | type d4                         | a       |
 37422 | constraint yes_or_no_check | collation mycoll                | n       | 2.28
 37422 | constraint yes_or_no_check | function foo(character varying) | n       | 2.28
 37422 | constraint yes_or_no_check | collation "default"             | n       |
(4 rows)

(This is in a glibc-based build, with C as the database's default
collation.)

One question here is whether it's correct that the domain's dependency
on collation "aa_DJ" is unversioned.  Maybe that's intentional, but it
seems worth asking.

Anyway, there are two pretty obvious bugs in the dependencies for the
domain's CHECK constraint: the version for collation mycoll leaks
into the entry for function foo, and an entirely useless (because
unversioned) dependency is recorded on the default collation.

... well, it's almost entirely useless.  If we fix things to not do that
(as per patch 0001 below), the results of the create_index regression
test become unstable, because there's two queries that inquire into the
dependencies of indexes, and their results change depending on whether
the default collation has a version or not.  I'd be inclined to just
take out the portions of that test that depend on that question, but
maybe somebody will complain that there's a loss of useful coverage.
I don't agree, but maybe I'll be overruled.

If we do feel we need to stay bug-compatible with that behavior, then
the alternate 0002 patch just fixes the version-leakage-across-entries
problem, while still removing the unnecessary assumption that C, POSIX,
and DEFAULT are the only pinned collations.

(To be clear: 0002 passes check-world as-is, while 0001 is not
committable without some regression-test fiddling.)

Thoughts?

            regards, tom lane

diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index 362db7fe91..1217c01b8a 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -73,7 +73,6 @@ recordMultipleDependencies(const ObjectAddress *depender,
                 max_slots,
                 slot_init_count,
                 slot_stored_count;
-    char       *version = NULL;

     if (nreferenced <= 0)
         return;                    /* nothing to do */
@@ -104,31 +103,22 @@ recordMultipleDependencies(const ObjectAddress *depender,
     slot_init_count = 0;
     for (i = 0; i < nreferenced; i++, referenced++)
     {
-        bool        ignore_systempin = false;
+        char       *version = NULL;

         if (record_version)
         {
             /* For now we only know how to deal with collations. */
             if (referenced->classId == CollationRelationId)
             {
-                /* C and POSIX don't need version tracking. */
+                /* These are unversioned, so don't waste cycles on them. */
                 if (referenced->objectId == C_COLLATION_OID ||
                     referenced->objectId == POSIX_COLLATION_OID)
                     continue;

                 version = get_collation_version_for_oid(referenced->objectId,
                                                         false);
-
-                /*
-                 * Default collation is pinned, so we need to force recording
-                 * the dependency to store the version.
-                 */
-                if (referenced->objectId == DEFAULT_COLLATION_OID)
-                    ignore_systempin = true;
             }
         }
-        else
-            Assert(!version);

         /*
          * If the referenced object is pinned by the system, there's no real
@@ -136,7 +126,7 @@ recordMultipleDependencies(const ObjectAddress *depender,
          * version.  This saves lots of space in pg_depend, so it's worth the
          * time taken to check.
          */
-        if (!ignore_systempin && isObjectPinned(referenced, dependDesc))
+        if (version == NULL && isObjectPinned(referenced, dependDesc))
             continue;

         if (slot_init_count < max_slots)
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index 362db7fe91..174ce4b7df 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -73,7 +73,6 @@ recordMultipleDependencies(const ObjectAddress *depender,
                 max_slots,
                 slot_init_count,
                 slot_stored_count;
-    char       *version = NULL;

     if (nreferenced <= 0)
         return;                    /* nothing to do */
@@ -104,6 +103,7 @@ recordMultipleDependencies(const ObjectAddress *depender,
     slot_init_count = 0;
     for (i = 0; i < nreferenced; i++, referenced++)
     {
+        char       *version = NULL;
         bool        ignore_systempin = false;

         if (record_version)
@@ -111,7 +111,7 @@ recordMultipleDependencies(const ObjectAddress *depender,
             /* For now we only know how to deal with collations. */
             if (referenced->classId == CollationRelationId)
             {
-                /* C and POSIX don't need version tracking. */
+                /* These are unversioned, so don't waste cycles on them. */
                 if (referenced->objectId == C_COLLATION_OID ||
                     referenced->objectId == POSIX_COLLATION_OID)
                     continue;
@@ -120,15 +120,16 @@ recordMultipleDependencies(const ObjectAddress *depender,
                                                         false);

                 /*
-                 * Default collation is pinned, so we need to force recording
-                 * the dependency to store the version.
+                 * If we have a version, make sure we record it even if the
+                 * collation is pinned.  Also force recording a dependency on
+                 * the "default" collation even if it hasn't got a version;
+                 * this is an ugly kluge to ensure stability of regression
+                 * test results.
                  */
-                if (referenced->objectId == DEFAULT_COLLATION_OID)
+                if (version || referenced->objectId == DEFAULT_COLLATION_OID)
                     ignore_systempin = true;
             }
         }
-        else
-            Assert(!version);

         /*
          * If the referenced object is pinned by the system, there's no real

Re: Bogus collation version recording in recordMultipleDependencies

From

Julien Rouhaud

Date:

15 April 2021, 10:56:47

On Wed, Apr 14, 2021 at 01:18:07PM -0400, Tom Lane wrote:
> 
> One question here is whether it's correct that the domain's dependency
> on collation "aa_DJ" is unversioned.  Maybe that's intentional, but it
> seems worth asking.

This is intentional I think, we should record collation version only for object
that might break if the collation version is updated.  So creating an index on
that domain would record the collation version.

> Anyway, there are two pretty obvious bugs in the dependencies for the
> domain's CHECK constraint: the version for collation mycoll leaks
> into the entry for function foo, and an entirely useless (because
> unversioned) dependency is recorded on the default collation.

Agreed.

> (To be clear: 0002 passes check-world as-is, while 0001 is not
> committable without some regression-test fiddling.)

I'm probably missing something obvious but both 0001 and 0002 pass check-world
for me, on a glibc box and --with-icu.

> Thoughts?

I think this is an open item, so I added one for now.

Re: Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

15 April 2021, 14:06:24

Julien Rouhaud <rjuju123@gmail.com> writes:
> On Wed, Apr 14, 2021 at 01:18:07PM -0400, Tom Lane wrote:
>> (To be clear: 0002 passes check-world as-is, while 0001 is not
>> committable without some regression-test fiddling.)

> I'm probably missing something obvious but both 0001 and 0002 pass check-world
> for me, on a glibc box and --with-icu.

0001 fails for me :-(.  I think that requires default collation to be C.

            regards, tom lane

Re: Bogus collation version recording in recordMultipleDependencies

From

Julien Rouhaud

Date:

16 April 2021, 02:56:05

On Thu, Apr 15, 2021 at 10:06:24AM -0400, Tom Lane wrote:
> Julien Rouhaud <rjuju123@gmail.com> writes:
> > On Wed, Apr 14, 2021 at 01:18:07PM -0400, Tom Lane wrote:
> >> (To be clear: 0002 passes check-world as-is, while 0001 is not
> >> committable without some regression-test fiddling.)
> 
> > I'm probably missing something obvious but both 0001 and 0002 pass check-world
> > for me, on a glibc box and --with-icu.
> 
> 0001 fails for me :-(.  I think that requires default collation to be C.

Oh right, adding --no-locale to the regress opts I see that create_index is
failing, and that's not the one I was expecting.

We could change create_index test to create c2 with a C collation, in order to
test that we don't track dependency on unversioned locales, and add an extra
test in collate.linux.utf8 to check that we do track a dependency on the
default collation as this test isn't run in the --no-locale case.  The only
case not tested would be default unversioned collation, but I'm not sure where
to properly test that.  Maybe a short leading test in collate.linux.utf8 that
would be run on linux in that case (when getdatabaseencoding() != 'UTF8')?  It
would require an extra alternate file but it wouldn't cause too much
maintenance problem as there should be only one test.

Re: Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

16 April 2021, 14:03:42

Julien Rouhaud <rjuju123@gmail.com> writes:
> On Thu, Apr 15, 2021 at 10:06:24AM -0400, Tom Lane wrote:
>> 0001 fails for me :-(.  I think that requires default collation to be C.

> Oh right, adding --no-locale to the regress opts I see that create_index is
> failing, and that's not the one I was expecting.

> We could change create_index test to create c2 with a C collation, in order to
> test that we don't track dependency on unversioned locales, and add an extra
> test in collate.linux.utf8 to check that we do track a dependency on the
> default collation as this test isn't run in the --no-locale case.  The only
> case not tested would be default unversioned collation, but I'm not sure where
> to properly test that.  Maybe a short leading test in collate.linux.utf8 that
> would be run on linux in that case (when getdatabaseencoding() != 'UTF8')?  It
> would require an extra alternate file but it wouldn't cause too much
> maintenance problem as there should be only one test.

Since the proposed patch removes the dependency code's special-case
handling of the default collation, I don't feel like we need to jump
through hoops to prove that the default collation is tracked the
same as other collations.  A regression test with alternative outputs
is a significant ongoing maintenance burden, and I do not see where
we're getting a commensurate improvement in test coverage.  Especially
since, AFAICS, the two alternative outputs would essentially have to
accept both the "it works" and "it doesn't work" outcomes.

So I propose that we do 0001 below, which is my first patch plus your
suggestion about fixing up create_index.sql.  This passes check-world
for me under both C and en_US.utf8 prevailing locales.

            regards, tom lane

diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index 362db7fe91..1217c01b8a 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -73,7 +73,6 @@ recordMultipleDependencies(const ObjectAddress *depender,
                 max_slots,
                 slot_init_count,
                 slot_stored_count;
-    char       *version = NULL;

     if (nreferenced <= 0)
         return;                    /* nothing to do */
@@ -104,31 +103,22 @@ recordMultipleDependencies(const ObjectAddress *depender,
     slot_init_count = 0;
     for (i = 0; i < nreferenced; i++, referenced++)
     {
-        bool        ignore_systempin = false;
+        char       *version = NULL;

         if (record_version)
         {
             /* For now we only know how to deal with collations. */
             if (referenced->classId == CollationRelationId)
             {
-                /* C and POSIX don't need version tracking. */
+                /* These are unversioned, so don't waste cycles on them. */
                 if (referenced->objectId == C_COLLATION_OID ||
                     referenced->objectId == POSIX_COLLATION_OID)
                     continue;

                 version = get_collation_version_for_oid(referenced->objectId,
                                                         false);
-
-                /*
-                 * Default collation is pinned, so we need to force recording
-                 * the dependency to store the version.
-                 */
-                if (referenced->objectId == DEFAULT_COLLATION_OID)
-                    ignore_systempin = true;
             }
         }
-        else
-            Assert(!version);

         /*
          * If the referenced object is pinned by the system, there's no real
@@ -136,7 +126,7 @@ recordMultipleDependencies(const ObjectAddress *depender,
          * version.  This saves lots of space in pg_depend, so it's worth the
          * time taken to check.
          */
-        if (!ignore_systempin && isObjectPinned(referenced, dependDesc))
+        if (version == NULL && isObjectPinned(referenced, dependDesc))
             continue;

         if (slot_init_count < max_slots)
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 830fdddf24..7f8f91b92c 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2026,10 +2026,10 @@ REINDEX TABLE concur_reindex_tab; -- notice
 NOTICE:  table "concur_reindex_tab" has no indexes to reindex
 REINDEX (CONCURRENTLY) TABLE concur_reindex_tab; -- notice
 NOTICE:  table "concur_reindex_tab" has no indexes that can be reindexed concurrently
-ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text COLLATE "C"; -- add toast index
 -- Normal index with integer column
 CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
--- Normal index with text column
+-- Normal index with text column (with unversioned collation)
 CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
 -- UNIQUE index with expression
 CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
@@ -2069,16 +2069,14 @@ WHERE classid = 'pg_class'::regclass AND
                    obj                    |                           objref                           | deptype
 ------------------------------------------+------------------------------------------------------------+---------
  index concur_reindex_ind1                | constraint concur_reindex_ind1 on table concur_reindex_tab | i
- index concur_reindex_ind2                | collation "default"                                        | n
  index concur_reindex_ind2                | column c2 of table concur_reindex_tab                      | a
  index concur_reindex_ind3                | column c1 of table concur_reindex_tab                      | a
  index concur_reindex_ind3                | table concur_reindex_tab                                   | a
- index concur_reindex_ind4                | collation "default"                                        | n
  index concur_reindex_ind4                | column c1 of table concur_reindex_tab                      | a
  index concur_reindex_ind4                | column c2 of table concur_reindex_tab                      | a
  materialized view concur_reindex_matview | schema public                                              | n
  table concur_reindex_tab                 | schema public                                              | n
-(10 rows)
+(8 rows)

 REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
 REINDEX TABLE CONCURRENTLY concur_reindex_tab;
@@ -2098,16 +2096,14 @@ WHERE classid = 'pg_class'::regclass AND
                    obj                    |                           objref                           | deptype
 ------------------------------------------+------------------------------------------------------------+---------
  index concur_reindex_ind1                | constraint concur_reindex_ind1 on table concur_reindex_tab | i
- index concur_reindex_ind2                | collation "default"                                        | n
  index concur_reindex_ind2                | column c2 of table concur_reindex_tab                      | a
  index concur_reindex_ind3                | column c1 of table concur_reindex_tab                      | a
  index concur_reindex_ind3                | table concur_reindex_tab                                   | a
- index concur_reindex_ind4                | collation "default"                                        | n
  index concur_reindex_ind4                | column c1 of table concur_reindex_tab                      | a
  index concur_reindex_ind4                | column c2 of table concur_reindex_tab                      | a
  materialized view concur_reindex_matview | schema public                                              | n
  table concur_reindex_tab                 | schema public                                              | n
-(10 rows)
+(8 rows)

 -- Check that comments are preserved
 CREATE TABLE testcomment (i int);
@@ -2487,7 +2483,7 @@ WARNING:  cannot reindex system catalogs concurrently, skipping all
  Column |  Type   | Collation | Nullable | Default
 --------+---------+-----------+----------+---------
  c1     | integer |           | not null |
- c2     | text    |           |          |
+ c2     | text    | C         |          |
 Indexes:
     "concur_reindex_ind1" PRIMARY KEY, btree (c1)
     "concur_reindex_ind2" btree (c2)
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 8bc76f7c6f..51c9a12151 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -797,10 +797,10 @@ CREATE TABLE concur_reindex_tab (c1 int);
 -- REINDEX
 REINDEX TABLE concur_reindex_tab; -- notice
 REINDEX (CONCURRENTLY) TABLE concur_reindex_tab; -- notice
-ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text COLLATE "C"; -- add toast index
 -- Normal index with integer column
 CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
--- Normal index with text column
+-- Normal index with text column (with unversioned collation)
 CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
 -- UNIQUE index with expression
 CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));

Re: Bogus collation version recording in recordMultipleDependencies

From

Julien Rouhaud

Date:

16 April 2021, 15:55:35

On Fri, Apr 16, 2021 at 10:03:42AM -0400, Tom Lane wrote:
> 
> Since the proposed patch removes the dependency code's special-case
> handling of the default collation, I don't feel like we need to jump
> through hoops to prove that the default collation is tracked the
> same as other collations.  A regression test with alternative outputs
> is a significant ongoing maintenance burden, and I do not see where
> we're getting a commensurate improvement in test coverage.  Especially
> since, AFAICS, the two alternative outputs would essentially have to
> accept both the "it works" and "it doesn't work" outcomes.

Fine by me, I was mentioning those if we wanted to keep some extra coverage for
that by I agree it doesn't add much value.

> So I propose that we do 0001 below, which is my first patch plus your
> suggestion about fixing up create_index.sql.  This passes check-world
> for me under both C and en_US.utf8 prevailing locales.

That's what I ended up with too, so LGTM!

Re: Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

16 April 2021, 16:27:56

Julien Rouhaud <rjuju123@gmail.com> writes:
> On Fri, Apr 16, 2021 at 10:03:42AM -0400, Tom Lane wrote:
>> So I propose that we do 0001 below, which is my first patch plus your
>> suggestion about fixing up create_index.sql.  This passes check-world
>> for me under both C and en_US.utf8 prevailing locales.

> That's what I ended up with too, so LGTM!

Pushed, thanks for review!  (and I'll update the open items list in a
sec)

            regards, tom lane

Re: Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

16 April 2021, 16:55:28

I wrote:
>> That's what I ended up with too, so LGTM!

> Pushed, thanks for review!  (and I'll update the open items list in a
> sec)

... or maybe not just yet.  Andres' buildfarm critters seem to have
a different opinion than my machine about what the output of
collate.icu.utf8 ought to be.  I wonder what the prevailing LANG
setting is for them, and which ICU version they're using.

            regards, tom lane

Re: Bogus collation version recording in recordMultipleDependencies

From

Andres Freund

Date:

16 April 2021, 17:04:16

Hi,

On 2021-04-16 12:55:28 -0400, Tom Lane wrote:
> I wrote:
> >> That's what I ended up with too, so LGTM!
> 
> > Pushed, thanks for review!  (and I'll update the open items list in a
> > sec)
> 
> ... or maybe not just yet.  Andres' buildfarm critters seem to have
> a different opinion than my machine about what the output of
> collate.icu.utf8 ought to be.  I wonder what the prevailing LANG
> setting is for them, and which ICU version they're using.

andres@andres-pg-buildfarm-valgrind:~/src/pgbuildfarm-client-stock$ grep calliph *.conf
build-farm-copyparse.conf:    animal => "calliphoridae",
build-farm-copyparse.conf:    build_root => '/mnt/resource/andres/bf/calliphoridae',

andres@andres-pg-buildfarm-valgrind:~/src/pgbuildfarm-client-stock$ dpkg -l|grep icu
ii  icu-devtools                         67.1-6                         amd64        Development utilities for
InternationalComponents for Unicode
 
ii  libicu-dev:amd64                     67.1-6                         amd64        Development files for
InternationalComponents for Unicode
 
ii  libicu67:amd64                       67.1-6                         amd64        International Components for
Unicode

andres@andres-pg-buildfarm-valgrind:~/src/pgbuildfarm-client-stock$ locale
LANG=C.UTF-8
LANGUAGE=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_PAPER="C.UTF-8"
LC_NAME="C.UTF-8"
LC_ADDRESS="C.UTF-8"
LC_TELEPHONE="C.UTF-8"
LC_MEASUREMENT="C.UTF-8"
LC_IDENTIFICATION="C.UTF-8"
LC_ALL=

Greetings,

Andres Freund

Re: Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

16 April 2021, 17:07:52

I wrote:
> ... or maybe not just yet.  Andres' buildfarm critters seem to have
> a different opinion than my machine about what the output of
> collate.icu.utf8 ought to be.  I wonder what the prevailing LANG
> setting is for them, and which ICU version they're using.

Oh, I bet it's "C.utf8", because I can reproduce the failure with that.
This crystallizes a nagging feeling I'd had that you were misdescribing
the collate.icu.utf8 test as not being run under --no-locale.  Actually,
it's only skipped if the encoding isn't UTF8, not the same thing.
I think we need to remove the default-collation cases from that test too.

            regards, tom lane

Re: Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

16 April 2021, 17:13:54

Andres Freund <andres@anarazel.de> writes:
> On 2021-04-16 12:55:28 -0400, Tom Lane wrote:
>> ... or maybe not just yet.  Andres' buildfarm critters seem to have
>> a different opinion than my machine about what the output of
>> collate.icu.utf8 ought to be.  I wonder what the prevailing LANG
>> setting is for them, and which ICU version they're using.

> LANG=C.UTF-8

I'd guessed that shortly later, but thanks for confirming.

            regards, tom lane

Re: Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

16 April 2021, 17:45:49

I wrote:
> Oh, I bet it's "C.utf8", because I can reproduce the failure with that.
> This crystallizes a nagging feeling I'd had that you were misdescribing
> the collate.icu.utf8 test as not being run under --no-locale.  Actually,
> it's only skipped if the encoding isn't UTF8, not the same thing.
> I think we need to remove the default-collation cases from that test too.

Hmm ... this is more subtle than it seemed.

I tried to figure out where the default-collation dependencies were coming
from, and it's quite non-obvious, at least for some of them.  Observe:

u8de=# create table t1 (f1 text collate "fr_FR");
CREATE TABLE
u8de=# create index on t1(f1) where f1 > 'foo';
CREATE INDEX
u8de=# SELECT objid::regclass, refobjid::regcollation, refobjversion
FROM pg_depend d
LEFT JOIN pg_class c ON c.oid = d.objid
WHERE refclassid = 'pg_collation'::regclass
AND coalesce(relkind, 'i') = 'i'
AND relname LIKE 't1_%';
   objid   | refobjid  | refobjversion 
-----------+-----------+---------------
 t1_f1_idx | "fr_FR"   | 2.28
 t1_f1_idx | "fr_FR"   | 2.28
 t1_f1_idx | "default" | 2.28
(3 rows)

(The "default" item doesn't show up if default collation is C,
which is what's causing the buildfarm instability.)

Now, it certainly looks like that index definition ought to only
have fr_FR dependencies.  I dug into it and discovered that the
reason we're coming up with a dependency on "default" is that
the WHERE clause looks like

             {OPEXPR 
             :opno 666 
             :opfuncid 742 
             :opresulttype 16 
             :opretset false 
             :opcollid 0 
             :inputcollid 14484 
             :args (
                {VAR 
                :varno 1 
                :varattno 1 
                :vartype 25 
                :vartypmod -1 
                :varcollid 14484 
                :varlevelsup 0 
                :varnosyn 1 
                :varattnosyn 1 
                :location 23
                }
                {CONST 
                :consttype 25 
                :consttypmod -1 
                :constcollid 100 
                :constlen -1 
                :constbyval false 
                :constisnull false 
                :location 28 
                :constvalue 7 [ 28 0 0 0 102 111 111 ]
                }
             )
             :location 26
             }

So sure enough, the comparison operator's inputcollid is
fr_FR, but the 'foo' constant has constcollid = "default".
That will have exactly zero impact on the semantics of the
expression, but dependency.c doesn't realize that and
reports it as a dependency anyway.

I feel like this is telling us that there's a fundamental
misunderstanding in find_expr_references_walker about which
collation dependencies to report.  It's reporting all the
leaf-node collations, and ignoring the ones that actually
count semantically, that is the inputcollid fields of
function and operator nodes.

Not sure what's the best thing to do here.  Redesigning
this post-feature-freeze doesn't seem terribly appetizing,
but on the other hand, this index collation recording
feature has put a premium on not overstating the collation
dependencies of an expression.  We don't want to tell users
that an index is broken when it isn't really.

            regards, tom lane

Re: Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

16 April 2021, 20:39:39

I wrote:
> I feel like this is telling us that there's a fundamental
> misunderstanding in find_expr_references_walker about which
> collation dependencies to report.  It's reporting all the
> leaf-node collations, and ignoring the ones that actually
> count semantically, that is the inputcollid fields of
> function and operator nodes.
> Not sure what's the best thing to do here.  Redesigning
> this post-feature-freeze doesn't seem terribly appetizing,
> but on the other hand, this index collation recording
> feature has put a premium on not overstating the collation
> dependencies of an expression.  We don't want to tell users
> that an index is broken when it isn't really.

I felt less hesitant to modify find_expr_references_walker's
behavior w.r.t. collations after realizing that most of it
was not of long standing, but came in with 257836a75.
So here's a draft patch that redesigns it as suggested above.
Along the way I discovered that GetTypeCollations was quite
broken for ranges and arrays, so this fixes that too.

Per the changes in collate.icu.utf8.out, this gets rid of
a lot of imaginary collation dependencies, but it also gets
rid of some arguably-real ones.  In particular, calls of
record_eq and its siblings will be considered not to have
any collation dependencies, although we know that internally
those will look up per-column collations of their input types.
We could imagine special-casing record_eq etc here, but that
sure seems like a hack.

I"m starting to have a bad feeling about 257836a75 overall.
As I think I've complained before, I do not like anything about
what it's done to pg_depend; it's forcing that relation to serve
two masters, neither one well.  We now see that the same remark
applies to find_expr_references(), because the semantics of
"which collations does this expression's behavior depend on" aren't
identical to "which collations need to be recorded as direct
dependencies of this expression", especially not if you'd prefer
to minimize either list.  (Which is important.)  Moreover, for all
the complexity it's introducing, it's next door to useless for
glibc collations --- we might as well tell people "reindex
everything when your glibc version changes", which could be done
with a heck of a lot less infrastructure.  The situation on Windows
looks pretty user-unfriendly as well, per the other thread.

So I wonder if, rather than continuing to pursue this right now,
we shouldn't revert 257836a75 and try again later with a new design
that doesn't try to commandeer the existing dependency infrastructure.
We might have a better idea about what to do on Windows by the time
that's done, too.

            regards, tom lane

diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 8d8e926c21..41093ea6ae 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1835,8 +1835,17 @@ recordDependencyOnSingleRelExpr(const ObjectAddress *depender,
  * the datatype.  However we do need a type dependency if there is no such
  * indirect dependency, as for example in Const and CoerceToDomain nodes.
  *
- * Similarly, we don't need to create dependencies on collations except where
- * the collation is being freshly introduced to the expression.
+ * Collations are handled primarily by recording the inputcollid's of node
+ * types that have them, as those are the ones that are semantically
+ * significant during expression evaluation.  We also record the collation of
+ * CollateExpr nodes, since those will be needed to print such nodes even if
+ * they don't really affect semantics.  Collations of leaf nodes such as Vars
+ * can be ignored on the grounds that if they're not default, they came from
+ * the referenced object (e.g., a table column), so the dependency on that
+ * object is enough.  (Note: in a post-const-folding expression tree, a
+ * CollateExpr's collation could have been absorbed into a Const or
+ * RelabelType node.  While ruleutils.c prints such collations for clarity,
+ * we may ignore them here as they have no semantic effect.)
  */
 static bool
 find_expr_references_walker(Node *node,
@@ -1876,29 +1885,6 @@ find_expr_references_walker(Node *node,
             /* If it's a plain relation, reference this column */
             add_object_address(OCLASS_CLASS, rte->relid, var->varattno,
                                context->addrs);
-
-            /* Top-level collation if valid */
-            if (OidIsValid(var->varcollid))
-                add_object_address(OCLASS_COLLATION, var->varcollid, 0,
-                                   context->addrs);
-            /* Otherwise, it may be a type with internal collations */
-            else if (var->vartype >= FirstNormalObjectId)
-            {
-                List       *collations;
-                ListCell   *lc;
-
-                collations = GetTypeCollations(var->vartype);
-
-                foreach(lc, collations)
-                {
-                    Oid            coll = lfirst_oid(lc);
-
-                    if (OidIsValid(coll))
-                        add_object_address(OCLASS_COLLATION,
-                                           lfirst_oid(lc), 0,
-                                           context->addrs);
-                }
-            }
         }

         /*
@@ -1920,15 +1906,6 @@ find_expr_references_walker(Node *node,
         add_object_address(OCLASS_TYPE, con->consttype, 0,
                            context->addrs);

-        /*
-         * We must also depend on the constant's collation: it could be
-         * different from the datatype's, if a CollateExpr was const-folded to
-         * a simple constant.
-         */
-        if (OidIsValid(con->constcollid))
-            add_object_address(OCLASS_COLLATION, con->constcollid, 0,
-                               context->addrs);
-
         /*
          * If it's a regclass or similar literal referring to an existing
          * object, add a reference to that object.  (Currently, only the
@@ -2013,10 +1990,6 @@ find_expr_references_walker(Node *node,
         /* A parameter must depend on the parameter's datatype */
         add_object_address(OCLASS_TYPE, param->paramtype, 0,
                            context->addrs);
-        /* and its collation, just as for Consts */
-        if (OidIsValid(param->paramcollid))
-            add_object_address(OCLASS_COLLATION, param->paramcollid, 0,
-                               context->addrs);
     }
     else if (IsA(node, FuncExpr))
     {
@@ -2024,6 +1997,9 @@ find_expr_references_walker(Node *node,

         add_object_address(OCLASS_PROC, funcexpr->funcid, 0,
                            context->addrs);
+        if (OidIsValid(funcexpr->inputcollid))
+            add_object_address(OCLASS_COLLATION, funcexpr->inputcollid, 0,
+                               context->addrs);
         /* fall through to examine arguments */
     }
     else if (IsA(node, OpExpr))
@@ -2032,6 +2008,9 @@ find_expr_references_walker(Node *node,

         add_object_address(OCLASS_OPERATOR, opexpr->opno, 0,
                            context->addrs);
+        if (OidIsValid(opexpr->inputcollid))
+            add_object_address(OCLASS_COLLATION, opexpr->inputcollid, 0,
+                               context->addrs);
         /* fall through to examine arguments */
     }
     else if (IsA(node, DistinctExpr))
@@ -2040,6 +2019,9 @@ find_expr_references_walker(Node *node,

         add_object_address(OCLASS_OPERATOR, distinctexpr->opno, 0,
                            context->addrs);
+        if (OidIsValid(distinctexpr->inputcollid))
+            add_object_address(OCLASS_COLLATION, distinctexpr->inputcollid, 0,
+                               context->addrs);
         /* fall through to examine arguments */
     }
     else if (IsA(node, NullIfExpr))
@@ -2048,6 +2030,9 @@ find_expr_references_walker(Node *node,

         add_object_address(OCLASS_OPERATOR, nullifexpr->opno, 0,
                            context->addrs);
+        if (OidIsValid(nullifexpr->inputcollid))
+            add_object_address(OCLASS_COLLATION, nullifexpr->inputcollid, 0,
+                               context->addrs);
         /* fall through to examine arguments */
     }
     else if (IsA(node, ScalarArrayOpExpr))
@@ -2056,6 +2041,9 @@ find_expr_references_walker(Node *node,

         add_object_address(OCLASS_OPERATOR, opexpr->opno, 0,
                            context->addrs);
+        if (OidIsValid(opexpr->inputcollid))
+            add_object_address(OCLASS_COLLATION, opexpr->inputcollid, 0,
+                               context->addrs);
         /* fall through to examine arguments */
     }
     else if (IsA(node, Aggref))
@@ -2064,6 +2052,9 @@ find_expr_references_walker(Node *node,

         add_object_address(OCLASS_PROC, aggref->aggfnoid, 0,
                            context->addrs);
+        if (OidIsValid(aggref->inputcollid))
+            add_object_address(OCLASS_COLLATION, aggref->inputcollid, 0,
+                               context->addrs);
         /* fall through to examine arguments */
     }
     else if (IsA(node, WindowFunc))
@@ -2072,6 +2063,9 @@ find_expr_references_walker(Node *node,

         add_object_address(OCLASS_PROC, wfunc->winfnoid, 0,
                            context->addrs);
+        if (OidIsValid(wfunc->inputcollid))
+            add_object_address(OCLASS_COLLATION, wfunc->inputcollid, 0,
+                               context->addrs);
         /* fall through to examine arguments */
     }
     else if (IsA(node, SubscriptingRef))
@@ -2116,10 +2110,6 @@ find_expr_references_walker(Node *node,
         else
             add_object_address(OCLASS_TYPE, fselect->resulttype, 0,
                                context->addrs);
-        /* the collation might not be referenced anywhere else, either */
-        if (OidIsValid(fselect->resultcollid))
-            add_object_address(OCLASS_COLLATION, fselect->resultcollid, 0,
-                               context->addrs);
     }
     else if (IsA(node, FieldStore))
     {
@@ -2146,10 +2136,6 @@ find_expr_references_walker(Node *node,
         /* since there is no function dependency, need to depend on type */
         add_object_address(OCLASS_TYPE, relab->resulttype, 0,
                            context->addrs);
-        /* the collation might not be referenced anywhere else, either */
-        if (OidIsValid(relab->resultcollid))
-            add_object_address(OCLASS_COLLATION, relab->resultcollid, 0,
-                               context->addrs);
     }
     else if (IsA(node, CoerceViaIO))
     {
@@ -2158,10 +2144,6 @@ find_expr_references_walker(Node *node,
         /* since there is no exposed function, need to depend on type */
         add_object_address(OCLASS_TYPE, iocoerce->resulttype, 0,
                            context->addrs);
-        /* the collation might not be referenced anywhere else, either */
-        if (OidIsValid(iocoerce->resultcollid))
-            add_object_address(OCLASS_COLLATION, iocoerce->resultcollid, 0,
-                               context->addrs);
     }
     else if (IsA(node, ArrayCoerceExpr))
     {
@@ -2170,10 +2152,6 @@ find_expr_references_walker(Node *node,
         /* as above, depend on type */
         add_object_address(OCLASS_TYPE, acoerce->resulttype, 0,
                            context->addrs);
-        /* the collation might not be referenced anywhere else, either */
-        if (OidIsValid(acoerce->resultcollid))
-            add_object_address(OCLASS_COLLATION, acoerce->resultcollid, 0,
-                               context->addrs);
         /* fall through to examine arguments */
     }
     else if (IsA(node, ConvertRowtypeExpr))
@@ -2213,6 +2191,24 @@ find_expr_references_walker(Node *node,
             add_object_address(OCLASS_OPFAMILY, lfirst_oid(l), 0,
                                context->addrs);
         }
+        foreach(l, rcexpr->inputcollids)
+        {
+            Oid            inputcollid = lfirst_oid(l);
+
+            if (OidIsValid(inputcollid))
+                add_object_address(OCLASS_COLLATION, inputcollid, 0,
+                                   context->addrs);
+        }
+        /* fall through to examine arguments */
+    }
+    else if (IsA(node, MinMaxExpr))
+    {
+        MinMaxExpr *mmexpr = (MinMaxExpr *) node;
+
+        /* minmaxtype will match one of the inputs, so no need to record it */
+        if (OidIsValid(mmexpr->inputcollid))
+            add_object_address(OCLASS_COLLATION, mmexpr->inputcollid, 0,
+                               context->addrs);
         /* fall through to examine arguments */
     }
     else if (IsA(node, CoerceToDomain))
diff --git a/src/backend/catalog/pg_type.c b/src/backend/catalog/pg_type.c
index ec5d122432..22f289d9f8 100644
--- a/src/backend/catalog/pg_type.c
+++ b/src/backend/catalog/pg_type.c
@@ -564,13 +564,19 @@ GetTypeCollations(Oid typeoid)
     }
     else if (typeTup->typtype == TYPTYPE_RANGE)
     {
-        Oid            rangeid = get_range_subtype(typeTup->oid);
+        Oid            rangecoll = get_range_collation(typeTup->oid);

-        Assert(OidIsValid(rangeid));
+        if (OidIsValid(rangecoll))
+            result = list_append_unique_oid(result, rangecoll);
+        else
+        {
+            Oid            rangeid = get_range_subtype(typeTup->oid);

-        result = list_concat_unique_oid(result, GetTypeCollations(rangeid));
+            Assert(OidIsValid(rangeid));
+            result = list_concat_unique_oid(result, GetTypeCollations(rangeid));
+        }
     }
-    else if (OidIsValid(typeTup->typelem))
+    else if (IsTrueArrayType(typeTup))
         result = list_concat_unique_oid(result,
                                         GetTypeCollations(typeTup->typelem));

diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 779405ef32..faf99f76b5 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1958,7 +1958,7 @@ CREATE DOMAIN d_custom AS t_custom;
 CREATE COLLATION custom (
     LOCALE = 'fr-x-icu', PROVIDER = 'icu'
 );
-CREATE TYPE myrange AS range (subtype = text);
+CREATE TYPE myrange AS range (subtype = text, collation = "POSIX");
 CREATE TYPE myrange_en_fr_ga AS range(subtype = t_en_fr_ga);
 CREATE TABLE collate_test (
     id integer,
@@ -2054,36 +2054,17 @@ ORDER BY 1, 2, 3;
  icuidx05_d_en_fr_ga_arr           | "en-x-icu" | up to date
  icuidx05_d_en_fr_ga_arr           | "fr-x-icu" | up to date
  icuidx05_d_en_fr_ga_arr           | "ga-x-icu" | up to date
- icuidx06_d_en_fr_ga               | "default"  | XXX
- icuidx06_d_en_fr_ga               | "en-x-icu" | up to date
  icuidx06_d_en_fr_ga               | "fr-x-icu" | up to date
- icuidx06_d_en_fr_ga               | "ga-x-icu" | up to date
- icuidx07_d_en_fr_ga               | "default"  | XXX
- icuidx07_d_en_fr_ga               | "en-x-icu" | up to date
- icuidx07_d_en_fr_ga               | "fr-x-icu" | up to date
  icuidx07_d_en_fr_ga               | "ga-x-icu" | up to date
- icuidx08_d_en_fr_ga               | "en-x-icu" | up to date
- icuidx08_d_en_fr_ga               | "fr-x-icu" | up to date
- icuidx08_d_en_fr_ga               | "ga-x-icu" | up to date
- icuidx09_d_en_fr_ga               | "en-x-icu" | up to date
- icuidx09_d_en_fr_ga               | "fr-x-icu" | up to date
- icuidx09_d_en_fr_ga               | "ga-x-icu" | up to date
- icuidx10_d_en_fr_ga_es            | "en-x-icu" | up to date
  icuidx10_d_en_fr_ga_es            | "es-x-icu" | up to date
- icuidx10_d_en_fr_ga_es            | "fr-x-icu" | up to date
- icuidx10_d_en_fr_ga_es            | "ga-x-icu" | up to date
- icuidx11_d_es                     | "default"  | XXX
  icuidx11_d_es                     | "es-x-icu" | up to date
- icuidx12_custom                   | "default"  | XXX
  icuidx12_custom                   | custom     | up to date
- icuidx13_custom                   | "default"  | XXX
  icuidx13_custom                   | custom     | up to date
- icuidx14_myrange                  | "default"  | XXX
  icuidx15_myrange_en_fr_ga         | "en-x-icu" | up to date
  icuidx15_myrange_en_fr_ga         | "fr-x-icu" | up to date
  icuidx15_myrange_en_fr_ga         | "ga-x-icu" | up to date
  icuidx17_part                     | "en-x-icu" | up to date
-(57 rows)
+(38 rows)

 -- Validate that REINDEX will update the stored version.
 UPDATE pg_depend SET refobjversion = 'not a version'
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 7f40c56039..4c71f4d249 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -755,7 +755,7 @@ CREATE COLLATION custom (
     LOCALE = 'fr-x-icu', PROVIDER = 'icu'
 );

-CREATE TYPE myrange AS range (subtype = text);
+CREATE TYPE myrange AS range (subtype = text, collation = "POSIX");
 CREATE TYPE myrange_en_fr_ga AS range(subtype = t_en_fr_ga);

 CREATE TABLE collate_test (

Re: Bogus collation version recording in recordMultipleDependencies

From

Thomas Munro

Date:

16 April 2021, 22:01:53

On Sat, Apr 17, 2021 at 8:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Per the changes in collate.icu.utf8.out, this gets rid of
> a lot of imaginary collation dependencies, but it also gets
> rid of some arguably-real ones.  In particular, calls of
> record_eq and its siblings will be considered not to have
> any collation dependencies, although we know that internally
> those will look up per-column collations of their input types.
> We could imagine special-casing record_eq etc here, but that
> sure seems like a hack.

Thanks for looking into all this.  Hmm.

> I"m starting to have a bad feeling about 257836a75 overall.
> As I think I've complained before, I do not like anything about
> what it's done to pg_depend; it's forcing that relation to serve
> two masters, neither one well. ...

We did worry about (essentially) this question quite a bit in the
discussion thread, but we figured that you'd otherwise have to create
a parallel infrastructure that would look almost identical (for
example [1]).

> ...  We now see that the same remark
> applies to find_expr_references(), because the semantics of
> "which collations does this expression's behavior depend on" aren't
> identical to "which collations need to be recorded as direct
> dependencies of this expression", especially not if you'd prefer
> to minimize either list.  (Which is important.) ...

Bugs in the current analyser code aside, if we had a second catalog
and a second analyser for this stuff, then you'd still have the union
of both minimised sets in total, with some extra duplication because
you'd have some rows in both places that are currently handled by one
row, no?

> ... Moreover, for all
> the complexity it's introducing, it's next door to useless for
> glibc collations --- we might as well tell people "reindex
> everything when your glibc version changes", which could be done
> with a heck of a lot less infrastructure. ...

You do gain reliable tracking of which indexes remain to be rebuilt,
and warnings for common hazards like hot standbys with mismatched
glibc, so I think it's pretty useful.  As for the poverty of
information from glibc, I don't see why it should hold ICU, Windows,
FreeBSD users back.  In fact I am rather hoping that by shipping this,
glibc developers will receive encouragement to add the trivial
interface we need to do better.

> ... The situation on Windows
> looks pretty user-unfriendly as well, per the other thread.

That is unfortunate, it seems like such a stupid problem.  Restating
here for the sake of the list:  initdb just needs to figure out how to
ask for the current environment's locale in BCP 47 format ("en-US")
when setting the default for your template databases, not the
traditional format ("English_United States.1252") that Microsoft
explicitly tells us not to store in databases and that doesn't work in
the versioning API, but since we're mostly all Unix hackers we don't
know how.

> So I wonder if, rather than continuing to pursue this right now,
> we shouldn't revert 257836a75 and try again later with a new design
> that doesn't try to commandeer the existing dependency infrastructure.
> We might have a better idea about what to do on Windows by the time
> that's done, too.

It seems to me that there are two things that would be needed to
salvage this for PG14: (1) deciding that we're unlikely to come up
with a better idea than using pg_depend for this (following the
argument that it'd only create duplication to have a parallel
dedicated catalog), (2) fixing any remaining flaws in the dependency
analyser code.  I'll look into the details some more on Monday.

[1] https://www.postgresql.org/message-id/e9e22c5e-c018-f4ea-24c8-5b6d6fdacf30%402ndquadrant.com

Re: Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

16 April 2021, 22:47:04

Thomas Munro <thomas.munro@gmail.com> writes:
> I'll look into the details some more on Monday.

Fair enough.

Although there are only a few buildfarm members complaining, I don't
really want to leave them red all weekend.  I could either commit the
patch I just presented, or revert ef387bed8 ... got a preference?

            regards, tom lane

Re: Bogus collation version recording in recordMultipleDependencies

From

Thomas Munro

Date:

17 April 2021, 01:34:28

On Sat, Apr 17, 2021 at 10:47 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > I'll look into the details some more on Monday.
>
> Fair enough.
>
> Although there are only a few buildfarm members complaining, I don't
> really want to leave them red all weekend.  I could either commit the
> patch I just presented, or revert ef387bed8 ... got a preference?

+1 for committing the new patch for now.  I will look into to the
record problem.  More in a couple of days.

Re: Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

17 April 2021, 02:24:21

Thomas Munro <thomas.munro@gmail.com> writes:
> On Sat, Apr 17, 2021 at 10:47 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Although there are only a few buildfarm members complaining, I don't
>> really want to leave them red all weekend.  I could either commit the
>> patch I just presented, or revert ef387bed8 ... got a preference?

> +1 for committing the new patch for now.  I will look into to the
> record problem.  More in a couple of days.

OK, done.

            regards, tom lane

Re: Bogus collation version recording in recordMultipleDependencies

From

Julien Rouhaud

Date:

17 April 2021, 09:23:09

On Fri, Apr 16, 2021 at 01:07:52PM -0400, Tom Lane wrote:
> I wrote:
> > ... or maybe not just yet.  Andres' buildfarm critters seem to have
> > a different opinion than my machine about what the output of
> > collate.icu.utf8 ought to be.  I wonder what the prevailing LANG
> > setting is for them, and which ICU version they're using.
> 
> Oh, I bet it's "C.utf8", because I can reproduce the failure with that.
> This crystallizes a nagging feeling I'd had that you were misdescribing
> the collate.icu.utf8 test as not being run under --no-locale.  Actually,
> it's only skipped if the encoding isn't UTF8, not the same thing.
> I think we need to remove the default-collation cases from that test too.

IIUC pg_regress --no-locale will call initdb --no-locale which force the locale
to C, and in that case pg_get_encoding_from_locale() does force SQL_ASCII as
encoding.  But yes I clearly didn't think at all that you could set the various
env variables to C.utf8 which can then run the collate.icu.utf8 or linux.utf8
:(

Re: Bogus collation version recording in recordMultipleDependencies

From

Julien Rouhaud

Date:

17 April 2021, 09:24:43

On Fri, Apr 16, 2021 at 10:24:21PM -0400, Tom Lane wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > On Sat, Apr 17, 2021 at 10:47 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Although there are only a few buildfarm members complaining, I don't
> >> really want to leave them red all weekend.  I could either commit the
> >> patch I just presented, or revert ef387bed8 ... got a preference?
> 
> > +1 for committing the new patch for now.  I will look into to the
> > record problem.  More in a couple of days.
> 
> OK, done.

Thanks for the fixes!  I'll also look at the problem.

Re: Bogus collation version recording in recordMultipleDependencies

From

Julien Rouhaud

Date:

18 April 2021, 11:23:33

On Sat, Apr 17, 2021 at 10:01:53AM +1200, Thomas Munro wrote:
> On Sat, Apr 17, 2021 at 8:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Per the changes in collate.icu.utf8.out, this gets rid of
> > a lot of imaginary collation dependencies, but it also gets
> > rid of some arguably-real ones.  In particular, calls of
> > record_eq and its siblings will be considered not to have
> > any collation dependencies, although we know that internally
> > those will look up per-column collations of their input types.
> > We could imagine special-casing record_eq etc here, but that
> > sure seems like a hack.
> 
> [...]
> 
> > So I wonder if, rather than continuing to pursue this right now,
> > we shouldn't revert 257836a75 and try again later with a new design
> > that doesn't try to commandeer the existing dependency infrastructure.
> > We might have a better idea about what to do on Windows by the time
> > that's done, too.
> 
> It seems to me that there are two things that would be needed to
> salvage this for PG14: (1) deciding that we're unlikely to come up
> with a better idea than using pg_depend for this (following the
> argument that it'd only create duplication to have a parallel
> dedicated catalog), (2) fixing any remaining flaws in the dependency
> analyser code.  I'll look into the details some more on Monday.

So IIUC the issue here is that the code could previously record useless
collation version dependencies in somes cases, which could lead to false
positive possible corruption messages (and of course additional bloat on
pg_depend).  False positive messages can't be avoided anyway, as a collation
version update may not corrupt the actually indexed set of data, especially for
glibc.  But with the infrastructure as-is advanced user can look into the new
version changes and choose to ignore changes for a specific set of collation,
which is way easier to do with the recorded dependencies.

The new situation is now that the code can record too few version dependencies
leading to false negative detection, which is way more problematic.

This was previously discussed around [1].  Quoting Thomas:

> To state more explicitly what's happening here, we're searching the
> expression trees for subexpresions that have a collation as part of
> their static type.  We don't know which functions or operators are
> actually affected by the collation, though.  For example, if an
> expression says "x IS NOT NULL" and x happens to be a subexpression of
> a type with a particular collation, we don't now that this
> expression's value can't possibly be affected by the collation version
> changing.  So, the system will nag you to rebuild an index just
> because you mentioned it, even though the index can't be corrupted.
> To do better than that, I suppose we'd need declarations in the
> catalog to say which functions/operators are collation sensitive.

We agreed that having possible false positive dependencies was acceptable for
the initial implementation and that we will improve it in later versions, as
otherwise the alternative is to reindex everything without getting any warning,
which clearly isn't better anyway.

FTR was had the same agreement to not handle specific AMs that don't care about
collation (like hash or bloom) in [2], even though I provided a patch to handle
that case ([3]) which was dropped later on ([4]).

Properly and correctly handling collation version dependency in expressions is
a hard problem and will definitely require additional fields in pg_proc, so we
clearly can't add that in pg14.  So yes we have to decide whether we want to
keep the feature in pg14 with the known limitations (and in that case probably
revert f24b15699, possibly improving documentation on the possibility of false
positive) or revert it entirely.

Unsurprisingly, I think that the feature as-is is already a significant
improvement, which can be easily improved, so my vote is to keep it in pg14.
And just to be clear I'm volunteering to work on the expression problem and all
other related improvements for the next version, whether the current feature is
reverted or not.

[1]: https://www.postgresql.org/message-id/CA%2BhUKGK8CwBcTcXWL2kUjpHT%2B6t2hEFCzkcZ-Z7xXbz%3DC4NLCQ%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/13b0c950-80f9-4c10-7e0f-f59feac56a98%402ndquadrant.com
[3]: https://www.postgresql.org/message-id/20200908144507.GA57691%40nol
[4]:
https://www.postgresql.org/message-id/CA%2BhUKGKHj4aYmmwKZdZjkD%3DCWRmn%3De6UsS7S%2Bu6oLrrp0orgsw%40mail.gmail.com

Re: Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

18 April 2021, 15:29:42

Julien Rouhaud <rjuju123@gmail.com> writes:
> On Sat, Apr 17, 2021 at 10:01:53AM +1200, Thomas Munro wrote:
>> It seems to me that there are two things that would be needed to
>> salvage this for PG14: (1) deciding that we're unlikely to come up
>> with a better idea than using pg_depend for this (following the
>> argument that it'd only create duplication to have a parallel
>> dedicated catalog), (2) fixing any remaining flaws in the dependency
>> analyser code.  I'll look into the details some more on Monday.

> So IIUC the issue here is that the code could previously record useless
> collation version dependencies in somes cases, ...
> The new situation is now that the code can record too few version dependencies
> leading to false negative detection, which is way more problematic.

I'm not sure that an error in this direction is all that much more
problematic than the other direction.  If it's okay to claim that
indexes need to be rebuilt when they don't really, then we could just
drop this entire overcomplicated infrastructure and report that all
indexes need to be rebuilt after any collation version change.

But in any case you're oversimplifying tremendously.  The previous code is
just as capable of errors of omission, because it was inquiring into the
wrong composite types, ie those of leaf expression nodes.  The ones we'd
need to look at are the immediate inputs of record_eq and siblings.  Here
are a couple of examples where the leaf types are unhelpful:

... where row(a,b,c)::composite_type < row(d,e,f)::composite_type;
... where function_returning_composite(...) < function_returning_composite(...);

And even if we do this, we're not entirely in the clear in an abstract
sense, because this only covers cases in which an immediate input is
of a known named composite type.  Cases dealing in anonymous RECORD
types simply can't be resolved statically.  It might be that that
can't occur in the specific situation of CREATE INDEX expressions,
but I'm not 100% sure of it.  The apparent counterexample of

... where row(a,b) < row(a,c)

isn't one because we parse that as RowCompareExpr not an application
of record_lt.

> We agreed that having possible false positive dependencies was acceptable for
> the initial implementation and that we will improve it in later versions, as
> otherwise the alternative is to reindex everything without getting any warning,
> which clearly isn't better anyway.

[ shrug... ] You have both false positives and false negatives in the
thing as it stood before f24b15699.  I'm not convinced that it's possible
to completely avoid either issue via static analysis.  I'm inclined to
think that false negatives around record_eq-like functions are not such a
problem for real index definitions, and we'd be better off with fewer
false positives.  But it's all judgment calls.

            regards, tom lane

Re: Bogus collation version recording in recordMultipleDependencies

From

Andres Freund

Date:

19 April 2021, 17:36:32

Hi,

On 2021-04-18 11:29:42 -0400, Tom Lane wrote:
> I'm not sure that an error in this direction is all that much more
> problematic than the other direction.  If it's okay to claim that
> indexes need to be rebuilt when they don't really, then we could just
> drop this entire overcomplicated infrastructure and report that all
> indexes need to be rebuilt after any collation version change.

That doesn't ring true to me. There's a huge difference between needing
to rebuild all indexes, especially primary key indexes which often are
over int8 etc, and unnecessarily needing to rebuild indexes doing
comparatively rare things.

Greetings,

Andres Freund

Re: Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

19 April 2021, 17:52:59

Andres Freund <andres@anarazel.de> writes:
> On 2021-04-18 11:29:42 -0400, Tom Lane wrote:
>> I'm not sure that an error in this direction is all that much more
>> problematic than the other direction.  If it's okay to claim that
>> indexes need to be rebuilt when they don't really, then we could just
>> drop this entire overcomplicated infrastructure and report that all
>> indexes need to be rebuilt after any collation version change.

> That doesn't ring true to me. There's a huge difference between needing
> to rebuild all indexes, especially primary key indexes which often are
> over int8 etc, and unnecessarily needing to rebuild indexes doing
> comparatively rare things.

It would not be that hard to exclude indexes on int8, or other cases
that clearly have zero collation dependencies.  And I think I might
have some faith in such a solution.  Right now I have zero faith
that the patch as it stands gives trustworthy answers.

I think that the real fundamental bug is supposing that static analysis
can give 100% correct answers.  Even if it did do so in a given state
of the database, consider this counterexample:

create type myrow as (f1 int, f2 int);
create table mytable (id bigint, r1 myrow, r2 myrow);
create index myindex on mytable(id) where r1 < r2;
alter type myrow add attribute f3 text;

myindex is recorded as having no collation dependency, but that is
now wrong.

            regards, tom lane

Re: Bogus collation version recording in recordMultipleDependencies

From

Peter Geoghegan

Date:

19 April 2021, 18:04:26

On Mon, Apr 19, 2021 at 10:53 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think that the real fundamental bug is supposing that static analysis
> can give 100% correct answers.  Even if it did do so in a given state
> of the database, consider this counterexample:
>
> create type myrow as (f1 int, f2 int);
> create table mytable (id bigint, r1 myrow, r2 myrow);
> create index myindex on mytable(id) where r1 < r2;
> alter type myrow add attribute f3 text;
>
> myindex is recorded as having no collation dependency, but that is
> now wrong.

Is it really the case that static analysis of the kind that you'd need
to make this 100% robust is fundamentally impossible? I find that
proposition hard to believe.

I'm not sure that you were making a totally general statement, rather
than a statement about the patch/implementation, so perhaps I just
missed the point.

-- 
Peter Geoghegan

Re: Bogus collation version recording in recordMultipleDependencies

From

Peter Geoghegan

Date:

19 April 2021, 18:13:37

On Sun, Apr 18, 2021 at 4:23 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> So IIUC the issue here is that the code could previously record useless
> collation version dependencies in somes cases, which could lead to false
> positive possible corruption messages (and of course additional bloat on
> pg_depend).  False positive messages can't be avoided anyway, as a collation
> version update may not corrupt the actually indexed set of data, especially for
> glibc.

This argument seems completely absurd to me.

-- 
Peter Geoghegan

Re: Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

19 April 2021, 18:49:05

Peter Geoghegan <pg@bowt.ie> writes:
> On Mon, Apr 19, 2021 at 10:53 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think that the real fundamental bug is supposing that static analysis
>> can give 100% correct answers.

> Is it really the case that static analysis of the kind that you'd need
> to make this 100% robust is fundamentally impossible? I find that
> proposition hard to believe.

I didn't mean to imply that it's necessarily theoretically impossible,
but given our lack of visibility into what a function or operator
will do, plus the way that the collation feature was bolted on
with minimal system-level redesign, it's sure pretty darn hard.
Code like record_eq is doing a lot at runtime that we can't really
see from static analysis.

Anyway, given the ALTER TYPE ADD ATTRIBUTE counterexample, I'm
definitely starting to lean towards "revert and try again in v15".
I feel we'd be best off to consider functions/operators that
operate on container types to be "maybe"s rather than certainly
safe or certainly not safe.  I think that such things appear
sufficiently rarely in index specifications that it's not worth it
to try to do an exact analysis of them, even if we were sure we
could get that 100% right.  But that doesn't seem to be an idea that
can trivially be added to the current design.

            regards, tom lane

Re: Bogus collation version recording in recordMultipleDependencies

From

Peter Geoghegan

Date:

19 April 2021, 19:38:27

On Mon, Apr 19, 2021 at 11:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I didn't mean to imply that it's necessarily theoretically impossible,
> but given our lack of visibility into what a function or operator
> will do, plus the way that the collation feature was bolted on
> with minimal system-level redesign, it's sure pretty darn hard.
> Code like record_eq is doing a lot at runtime that we can't really
> see from static analysis.

It's worth pointing out that code like record_eq is not (or at least
should not be) fundamentally unpredictable and unruly. The fact that
record_eq does typecache lookups and whatnot seems to me to be an
implementation detail. What record_eq is entitled to assume about
collations could be formalized by some general high-level
specification. It ought to be possible to do this, just as it ought to
be possible for us to statically determine if a composite type is safe
to use with B-Tree deduplication.

Whether or not it's worth the trouble is another matter, but it might
be if a single effort solved a bunch of related problems, not just the
collation dependency problem.

> Anyway, given the ALTER TYPE ADD ATTRIBUTE counterexample, I'm
> definitely starting to lean towards "revert and try again in v15".

The counterexample concerns me because it seems to indicate a lack of
sophistication in how dependencies are managed with corner cases -- I
don't think that it's okay to leave the behavior unspecified in a
stable release. But I also think that we should consider if code like
record_eq is in fact the real problem (or just the lack of any general
specification that constrains code like it in useful ways, perhaps).
This probably won't affect whether or not the patch gets reverted now,
but it still matters.

-- 
Peter Geoghegan

Re: Bogus collation version recording in recordMultipleDependencies

From

Thomas Munro

Date:

19 April 2021, 19:42:42

On Tue, Apr 20, 2021 at 5:53 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think that the real fundamental bug is supposing that static analysis
> can give 100% correct answers.  ...

Well, the goal was to perform analysis to the extent possible
statically since that would cover the vast majority of cases and is
practically all you can do.  Clearly there is always going to be a
category of invisible dependencies inside procedural code in general
(halting problem).  We did think about the idea of using new
declarations about functions/operators to know which ones actually
care about collation, rather than assuming that they all do (bugs
aside), as an optimisation, and then that mechanism could in theory
also be used to say that functions that don't appear to depend on
collations actually do internal, but that all seemed like vast
overkill, so we left it for possible later improvements.  The question
on my mind is whether reverting the feature and trying again for 15
could produce anything fundamentally better at a design level, or
would just fix problems in the analyser code that we could fix right
now.  For example, if you think there actually is a potential better
plan than using pg_depend for this, that'd definitely be good to know
about.

> ... Even if it did do so in a given state
> of the database, consider this counterexample:
>
> create type myrow as (f1 int, f2 int);
> create table mytable (id bigint, r1 myrow, r2 myrow);
> create index myindex on mytable(id) where r1 < r2;
> alter type myrow add attribute f3 text;
>
> myindex is recorded as having no collation dependency, but that is
> now wrong.

Hrmph.  Yeah.  We didn't consider types that change later like this,
and handling those correctly does seem to warrant some more thought
and work than we perhaps have time for.  My first thought is that we'd
need to teach it to trigger reanalysis.

Re: Bogus collation version recording in recordMultipleDependencies

From

Tom Lane

Date:

19 April 2021, 20:21:17

Thomas Munro <thomas.munro@gmail.com> writes:
> ... The question
> on my mind is whether reverting the feature and trying again for 15
> could produce anything fundamentally better at a design level, or
> would just fix problems in the analyser code that we could fix right
> now.

Well, as I said, I think what we ought to do is treat any record-accepting
functions/operators as "don't know, better assume it's collation
dependent".  What's not clear to me is how that concept could be
shoehorned into the existing design.

> For example, if you think there actually is a potential better
> plan than using pg_depend for this, that'd definitely be good to know
> about.

I really dislike using pg_depend, for a couple of reasons:

* You've broken the invariant that dependencies on pinned objects
are never recorded.  Now, some of them exist, for reasons having
nothing to do with the primary goals of pg_depend.  If that's not
a sign of bad relational design, I don't know what is.  I didn't
look at the code, but I wonder if you didn't have to lobotomize
some error checks in dependency.c because of that.  (Perhaps
some sort of special-case representation for the default
collation would help here?)

* pg_depend used to always be all-not-null.  Now, most rows in it
will need a nulls bitmap, adding 8 bytes per row (on maxalign=8
hardware) to what had been fairly narrow rows.  By my arithmetic
that's 13.3% bloat in what is already one of our largest
catalogs.  That's quite unpleasant.  (It would actually be
cheaper to store an empty-string refobjversion for non-collation
entries; a single-byte string would fit into the pad space
after deptype, adding nothing to the row width.)

> Hrmph.  Yeah.  We didn't consider types that change later like this,
> and handling those correctly does seem to warrant some more thought
> and work than we perhaps have time for.  My first thought is that we'd
> need to teach it to trigger reanalysis.

That seems like a nonstarter, even before you think about race
conditions.

            regards, tom lane

Re: Bogus collation version recording in recordMultipleDependencies

From

Thomas Munro

Date:

20 April 2021, 00:05:27

On Tue, Apr 20, 2021 at 8:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:>
> Thomas Munro <thomas.munro@gmail.com> writes:
> > For example, if you think there actually is a potential better
> > plan than using pg_depend for this, that'd definitely be good to know
> > about.
>
> I really dislike using pg_depend, for a couple of reasons:
>
> * You've broken the invariant that dependencies on pinned objects
> are never recorded.  Now, some of them exist, for reasons having
> nothing to do with the primary goals of pg_depend.  If that's not
> a sign of bad relational design, I don't know what is.  I didn't
> look at the code, but I wonder if you didn't have to lobotomize
> some error checks in dependency.c because of that.  (Perhaps
> some sort of special-case representation for the default
> collation would help here?)

Hmm, OK, thanks, that's something to go back and think about.

> * pg_depend used to always be all-not-null.  Now, most rows in it
> will need a nulls bitmap, adding 8 bytes per row (on maxalign=8
> hardware) to what had been fairly narrow rows.  By my arithmetic
> that's 13.3% bloat in what is already one of our largest
> catalogs.  That's quite unpleasant.  (It would actually be
> cheaper to store an empty-string refobjversion for non-collation
> entries; a single-byte string would fit into the pad space
> after deptype, adding nothing to the row width.)

That seems like a good idea.

> > Hrmph.  Yeah.  We didn't consider types that change later like this,
> > and handling those correctly does seem to warrant some more thought
> > and work than we perhaps have time for.  My first thought is that we'd
> > need to teach it to trigger reanalysis.
>
> That seems like a nonstarter, even before you think about race
> conditions.

Yeah, that runs directly into non-trivial locking problems.  I felt
like some of the other complaints could conceivably be addressed in
time, including dumb stuff like Windows default locale string format
and hopefully some expression analysis problems, but not this.  I'll
hold off reverting for a few more days to see if anyone has any other
thoughts on that, because there doesn't seem to be any advantage in
being too hasty about it.

Re: Bogus collation version recording in recordMultipleDependencies

From

Julien Rouhaud

Date:

20 April 2021, 01:46:07

On Mon, Apr 19, 2021 at 11:13:37AM -0700, Peter Geoghegan wrote:
> On Sun, Apr 18, 2021 at 4:23 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > So IIUC the issue here is that the code could previously record useless
> > collation version dependencies in somes cases, which could lead to false
> > positive possible corruption messages (and of course additional bloat on
> > pg_depend).  False positive messages can't be avoided anyway, as a collation
> > version update may not corrupt the actually indexed set of data, especially for
> > glibc.
> 
> This argument seems completely absurd to me.

I'm not sure why?  For glibc at least, I don't see how we could not end up
raising false positive as you have a single glibc version for all its
collations.  If a user has say en_US and fr_FR, or any quite stable collation,
most of the glibc upgrades (except 2.28 of course) won't corrupt your indexes.

Re: Bogus collation version recording in recordMultipleDependencies

From

Julien Rouhaud

Date:

20 April 2021, 01:49:08

On Tue, Apr 20, 2021 at 12:05:27PM +1200, Thomas Munro wrote:
> 
> Yeah, that runs directly into non-trivial locking problems.  I felt
> like some of the other complaints could conceivably be addressed in
> time, including dumb stuff like Windows default locale string format
> and hopefully some expression analysis problems, but not this.  I'll
> hold off reverting for a few more days to see if anyone has any other
> thoughts on that, because there doesn't seem to be any advantage in
> being too hasty about it.

I also feel that the ALTER TYPE example Tom showed earlier isn't something
trivial to fix and cannot be done in pg14 :(

Re: Bogus collation version recording in recordMultipleDependencies

From

Peter Geoghegan

Date:

20 April 2021, 02:27:24

On Mon, Apr 19, 2021 at 6:45 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > This argument seems completely absurd to me.
>
> I'm not sure why?  For glibc at least, I don't see how we could not end up
> raising false positive as you have a single glibc version for all its
> collations.  If a user has say en_US and fr_FR, or any quite stable collation,
> most of the glibc upgrades (except 2.28 of course) won't corrupt your indexes.

If the versions differ and your index happens to not be corrupt
because it just so happened to not depend on any of the rules that
have changed, then a complaint about the collation versions changing
is not what I'd call a false positive. You can call it that if you
want, I suppose -- it's just a question of semantics. But I don't
think you should conflate two very different things. You seem to be
suggesting that they're equivalent just because you can refer to both
of them using the same term.

It's obvious that you could have an absence of index corruption even
in the presence of a collation incompatibility. Especially when there
is only 1 tuple in the index, say -- obviously the core idea is to
manage the dependency on versioned collations, which isn't magic. Do
you really think that's equivalent to having incorrect version
dependencies?

-- 
Peter Geoghegan

Re: Bogus collation version recording in recordMultipleDependencies

From

Julien Rouhaud

Date:

20 April 2021, 03:02:31

On Mon, Apr 19, 2021 at 07:27:24PM -0700, Peter Geoghegan wrote:
> On Mon, Apr 19, 2021 at 6:45 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > > This argument seems completely absurd to me.
> >
> > I'm not sure why?  For glibc at least, I don't see how we could not end up
> > raising false positive as you have a single glibc version for all its
> > collations.  If a user has say en_US and fr_FR, or any quite stable collation,
> > most of the glibc upgrades (except 2.28 of course) won't corrupt your indexes.
> 
> If the versions differ and your index happens to not be corrupt
> because it just so happened to not depend on any of the rules that
> have changed, then a complaint about the collation versions changing
> is not what I'd call a false positive. You can call it that if you
> want, I suppose -- it's just a question of semantics. But I don't
> think you should conflate two very different things. You seem to be
> suggesting that they're equivalent just because you can refer to both
> of them using the same term.
> 
> It's obvious that you could have an absence of index corruption even
> in the presence of a collation incompatibility. Especially when there
> is only 1 tuple in the index, say 

Yes, and technically you could still have corruption on indexes containing 1 or
even 0 rows in case of collation provider upgrade, eg if you have a WHERE
clause on the index that does depend on a collation.

> -- obviously the core idea is to
> manage the dependency on versioned collations, which isn't magic. Do
> you really think that's equivalent to having incorrect version
> dependencies?

No I don't think that's equivalent.  What I wanted to say that it's impossible
to raise a WARNING only if the index can really be corrupted (corner cases like
empty tables or similar apart) for instance because of how glibc report
versions, so raising WARNING in some limited corner cases that definitely can't
be corrupted (like because the index expression itself doesn't depend on the
ordering), which clearly isn't the same thing,  was in my opinion an acceptable
trade-off in a first version.  Sorry if that was (or still is) poorly worded.

In any case it was proven that the current approach has way bigger deficiencies
so it's probably not relevant anymore.

Re: Bogus collation version recording in recordMultipleDependencies

From

Thomas Munro

Date:

20 April 2021, 03:07:16

On Tue, Apr 20, 2021 at 1:48 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> On Tue, Apr 20, 2021 at 12:05:27PM +1200, Thomas Munro wrote:
> > Yeah, that runs directly into non-trivial locking problems.  I felt
> > like some of the other complaints could conceivably be addressed in
> > time, including dumb stuff like Windows default locale string format
> > and hopefully some expression analysis problems, but not this.  I'll
> > hold off reverting for a few more days to see if anyone has any other
> > thoughts on that, because there doesn't seem to be any advantage in
> > being too hasty about it.
>
> I also feel that the ALTER TYPE example Tom showed earlier isn't something
> trivial to fix and cannot be done in pg14 :(

Just an idea:  It might be possible to come up with a scheme where
ALTER TYPE ADD ATTRIBUTE records versions somewhere at column add
time, and index_check_collation_versions() finds and checks those when
they aren't superseded by index->collation versions created by
REINDEX, or already present due to other dependencies on the same
collation.  Of course, the opposite problem applies when you ALTER
TYPE DROP ATTRIBUTE: you might have some zombie refobjversions you
don't need anymore, but that would seem to be the least of your
worries if you drop attributes from composite types used in indexes:

create type myrow as (f1 int, f2 int);
create table mytable (r1 myrow primary key);
insert into mytable
select row(generate_series(1, 10), generate_series(10, 1, -1))::myrow;
select * from mytable;
alter type myrow drop attribute f1;
select * from mytable;
select * from mytable where r1 = row(6); -- !!!
reindex table mytable;
select * from mytable where r1 = row(6);

Re: Bogus collation version recording in recordMultipleDependencies

From

Andres Freund

Date:

21 April 2021, 20:28:08

Hi,

On 2021-04-20 12:05:27 +1200, Thomas Munro wrote:
> I'll hold off reverting for a few more days to see if anyone has any
> other thoughts on that, because there doesn't seem to be any advantage
> in being too hasty about it.

I'm not really convinced that this is warranted, and that it isn't
better addressed by reducing the scope of the feature:

When using index collation versions to decide whether to reindex
individual indexes it is important to not have any false negatives -
otherwise the feature could trigger corruption.

However, the feature has a second, IMO more crucial, aspect: Preventing
silent corruption due to collation changes. There are regular reports of
people corrupting their indexes (and subsequently constraints) due to
collation changes (or collation differences between primary/replica).
To be effective detecting such cases it is not required to catch 100% of
all dangerous cases, just that a high fraction of cases is caught.

And handling the composite type case doesn't seem like it'd impact the
percentage of detected collation issues all that much. For one, indexes
on composite types aren't all that common, and additing new columns to
those composite types is likely even rarer. For another, I'd expect that
nearly all databases that have indexes on composite types also have
indexes on non-composite text columns - which'd be likely to catch the
issue.

Given that this is a regularly occurring source of corruption for users,
and not one just negligent operators run into (we want people to upgrade
OS versions), I think we ought to factor that into our decision what to
do.

Greetings,

Andres Freund

Re: Bogus collation version recording in recordMultipleDependencies

From

Andrew Dunstan

Date:

05 May 2021, 20:58:08

On 4/21/21 4:28 PM, Andres Freund wrote:
> Hi,
>
> On 2021-04-20 12:05:27 +1200, Thomas Munro wrote:
>> I'll hold off reverting for a few more days to see if anyone has any
>> other thoughts on that, because there doesn't seem to be any advantage
>> in being too hasty about it.
> I'm not really convinced that this is warranted, and that it isn't
> better addressed by reducing the scope of the feature:
>
> When using index collation versions to decide whether to reindex
> individual indexes it is important to not have any false negatives -
> otherwise the feature could trigger corruption.
>
> However, the feature has a second, IMO more crucial, aspect: Preventing
> silent corruption due to collation changes. There are regular reports of
> people corrupting their indexes (and subsequently constraints) due to
> collation changes (or collation differences between primary/replica).
> To be effective detecting such cases it is not required to catch 100% of
> all dangerous cases, just that a high fraction of cases is caught.
>
> And handling the composite type case doesn't seem like it'd impact the
> percentage of detected collation issues all that much. For one, indexes
> on composite types aren't all that common, and additing new columns to
> those composite types is likely even rarer. For another, I'd expect that
> nearly all databases that have indexes on composite types also have
> indexes on non-composite text columns - which'd be likely to catch the
> issue.
>
> Given that this is a regularly occurring source of corruption for users,
> and not one just negligent operators run into (we want people to upgrade
> OS versions), I think we ought to factor that into our decision what to
> do.
>


Hi,


this is an open item for release 14 . The discussion seems to have gone
silent for a couple of weeks. Are we in a position to make any
decisions? I hear what Andres says, but is anyone acting on it?


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Bogus collation version recording in recordMultipleDependencies

From

Thomas Munro

Date:

05 May 2021, 21:12:18

On Thu, May 6, 2021 at 8:58 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> this is an open item for release 14 . The discussion seems to have gone
> silent for a couple of weeks. Are we in a position to make any
> decisions? I hear what Andres says, but is anyone acting on it?

I'm going to revert this and resubmit for 15.  That'll give proper
time to reconsider the question of whether pg_depend is right for
this, and come up with a non-rushed response to the composite type
problem etc.

Re: Bogus collation version recording in recordMultipleDependencies

From

Andrew Dunstan

Date:

05 May 2021, 21:23:16

On 5/5/21 5:12 PM, Thomas Munro wrote:
> On Thu, May 6, 2021 at 8:58 AM Andrew Dunstan <andrew@dunslane.net> wrote:
>> this is an open item for release 14 . The discussion seems to have gone
>> silent for a couple of weeks. Are we in a position to make any
>> decisions? I hear what Andres says, but is anyone acting on it?
> I'm going to revert this and resubmit for 15.  That'll give proper
> time to reconsider the question of whether pg_depend is right for
> this, and come up with a non-rushed response to the composite type
> problem etc.


OK, thanks.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Bogus collation version recording in recordMultipleDependencies

From

Thomas Munro

Date:

07 May 2021, 10:01:40

On Thu, May 6, 2021 at 9:23 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> On 5/5/21 5:12 PM, Thomas Munro wrote:
> > On Thu, May 6, 2021 at 8:58 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> >> this is an open item for release 14 . The discussion seems to have gone
> >> silent for a couple of weeks. Are we in a position to make any
> >> decisions? I hear what Andres says, but is anyone acting on it?
> > I'm going to revert this and resubmit for 15.  That'll give proper
> > time to reconsider the question of whether pg_depend is right for
> > this, and come up with a non-rushed response to the composite type
> > problem etc.
>
> OK, thanks.

Reverted.  Rebasing notes:

1.  Commit b4c9695e moved toast table declarations so I adapted to the
new scheme, but commit 0cc99327 had taken the OIDs that pg_collation
was previously using, so I had to pick some new ones from the
temporary range for later reassignment.

2.  It took me quite a while to figure out that the collversion column
now needs BKI_DEFAULT(_null_), or the perl script wouldn't accept the
contents of pg_collation.dat.

3.  In a separate commit, I rescued a few sentences of text from the
documentation about libc collation versions and reinstated them in the
most obvious place, because although the per-index tracking has been
reverted, the per-collation version tracking (limited as it is) is now
back and works on more OSes than before.