Re: [HACKERS] sketchy partcollation handling - Mailing list pgsql-hackers

From Amit Langote
Subject Re: [HACKERS] sketchy partcollation handling
Date
Msg-id 08b96204-2eea-d083-b6b4-eaf184e1f059@lab.ntt.co.jp
Whole thread Raw
In response to [HACKERS] sketchy partcollation handling  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] sketchy partcollation handling
List pgsql-hackers
On 2017/06/03 1:31, Robert Haas wrote:
> If you create a partitioned table in the obvious way, partcollation ends up 0:
> 
> rhaas=# create table foo (a int, b text) partition by list (a);
> CREATE TABLE
> rhaas=# select * from pg_partitioned_table;
>  partrelid | partstrat | partnatts | partattrs | partclass |
> partcollation | partexprs
> -----------+-----------+-----------+-----------+-----------+---------------+-----------
>      16420 | l         |         1 | 1         | 1978      | 0             |
> (1 row)
> 
> You could argue that 0 is an OK value there; offhand, I'm not sure
> about that.

If you create index on an integer column, you'll get a 0 in indcollation:

select indcollation from pg_index where indrelid = 'foo'::regclass;
 indcollation
--------------
 0
(1 row)


> But there's nothing in
> https://www.postgresql.org/docs/10/static/catalog-pg-partitioned-table.html
> which indicates that some entries can be 0 rather than a valid
> collation OID.

Yeah, it might be better to explain that.  BTW, pg_index.indcollation's
description lacks a note about this too, so maybe, we should fix both.
OTOH, pg_attribute.attcollation's description mentions it:

       The defined collation of the column, or zero if the column is
       not of a collatable data type.

It might be a good idea to update partcollation's and indcollation's
description along the same lines.

> And this is definitely not OK:
> 
> rhaas=# select * from pg_depend where objid = 16420;
>  classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
> ---------+-------+----------+------------+----------+-------------+---------
>     1259 | 16420 |        0 |       2615 |     2200 |           0 | n
>     1259 | 16420 |        0 |       3456 |        0 |           0 | n
> (2 rows)
> 
> We shouldn't be storing a dependency on non-existing collation 0.
>
> I'm not sure whether the bug here is that we should have a valid
> collation OID there rather than 0, or whether the bug is that we
> shouldn't be recording a dependency on anything other than a real
> collation OID, but something about this is definitely not right.

I think we can call it a bug of StorePartitionKey().  I looked at the
similar code in index_create() (which actually I had originally looked at
for reference when writing the partitioning code in question) and looks
like it doesn't store the dependency for collation 0 and for the default
collation of the database.  I think the partitioning code should do the
same.  Attached find a patch for the same (which also updates the
documentation as mentioned above); with the patch:

create table p (a int) partition by range (a);
select partcollation from pg_partitioned_table;
 partcollation
---------------
 0
(1 row)


-- no collation dependency stored for invalid collation
select * from pg_depend where objid = 'p'::regclass;
 classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
---------+-------+----------+------------+----------+-------------+---------
    1259 | 16434 |        0 |       2615 |     2200 |           0 | n
(1 row)


create table p (a text) partition by range (a);
select partcollation from pg_partitioned_table;
 partcollation
---------------
 100
(1 row)

-- no collation dependency stored for the default collation
select * from pg_depend where objid = 'p'::regclass;
 classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
---------+-------+----------+------------+----------+-------------+---------
    1259 | 16407 |        0 |       2615 |     2200 |           0 | n
(1 row)


create table p (a text) partition by range (a collate "en_US");
select partcollation from pg_partitioned_table;
 partcollation
---------------
 12513
(1 row)

-- dependency on non-default collation is stored
select * from pg_depend where objid = 'p'::regclass;
 classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
---------+-------+----------+------------+----------+-------------+---------
    1259 | 16410 |        0 |       2615 |     2200 |           0 | n
    1259 | 16410 |        0 |       3456 |    12513 |           0 | n
(2 rows)


BTW, the places which check whether the collation to store a dependency
for is the database default collation don't need to do that.  I mean the
following code block in all of these places:

       /* The default collation is pinned, so don't bother recording it */
       if (OidIsValid(attr->attcollation) &&
           attr->attcollation != DEFAULT_COLLATION_OID)
       {
           referenced.classId = CollationRelationId;
           referenced.objectId = attr->attcollation;
           referenced.objectSubId = 0;
           recordDependencyOn(&myself, &referenced, DEPENDENCY_NORMAL);
       }

That's because the default collation is pinned and the dependency code
checks isObjectPinned() and does not create pg_depend entries if so.
Those places are:

AddNewAttributeTuples
StorePartitionKey
index_create
GenerateTypeDependencies
add_column_collation_dependency

Removing that check still passes `make check-world`.

Thanks,
Amit

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Mark Kirkwood
Date:
Subject: Re: [HACKERS] logical replication - still unstable after all thesemonths
Next
From: Petr Jelinek
Date:
Subject: Re: [HACKERS] logical replication - still unstable after all thesemonths