Thread: [HACKERS] sketchy partcollation handling

[HACKERS] sketchy partcollation handling

From
Robert Haas
Date:
If you create a partitioned table in the obvious way, partcollation ends up 0:

rhaas=# create table foo (a int, b text) partition by list (a);
CREATE TABLE
rhaas=# select * from pg_partitioned_table;partrelid | partstrat | partnatts | partattrs | partclass |
partcollation | partexprs
-----------+-----------+-----------+-----------+-----------+---------------+-----------    16420 | l         |
1| 1         | 1978      | 0             |
 
(1 row)

You could argue that 0 is an OK value there; offhand, I'm not sure
about that.  But there's nothing in
https://www.postgresql.org/docs/10/static/catalog-pg-partitioned-table.html
which indicates that some entries can be 0 rather than a valid
collation OID.  And this is definitely not OK:

rhaas=# select * from pg_depend where objid = 16420;classid | objid | objsubid | refclassid | refobjid | refobjsubid |
deptype
---------+-------+----------+------------+----------+-------------+---------   1259 | 16420 |        0 |       2615 |
 2200 |           0 | n   1259 | 16420 |        0 |       3456 |        0 |           0 | n
 
(2 rows)

We shouldn't be storing a dependency on non-existing collation 0.

I'm not sure whether the bug here is that we should have a valid
collation OID there rather than 0, or whether the bug is that we
shouldn't be recording a dependency on anything other than a real
collation OID, but something about this is definitely not right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sketchy partcollation handling

From
Amit Langote
Date:
On 2017/06/03 1:31, Robert Haas wrote:
> If you create a partitioned table in the obvious way, partcollation ends up 0:
> 
> rhaas=# create table foo (a int, b text) partition by list (a);
> CREATE TABLE
> rhaas=# select * from pg_partitioned_table;
>  partrelid | partstrat | partnatts | partattrs | partclass |
> partcollation | partexprs
> -----------+-----------+-----------+-----------+-----------+---------------+-----------
>      16420 | l         |         1 | 1         | 1978      | 0             |
> (1 row)
> 
> You could argue that 0 is an OK value there; offhand, I'm not sure
> about that.

If you create index on an integer column, you'll get a 0 in indcollation:

select indcollation from pg_index where indrelid = 'foo'::regclass;
 indcollation
--------------
 0
(1 row)


> But there's nothing in
> https://www.postgresql.org/docs/10/static/catalog-pg-partitioned-table.html
> which indicates that some entries can be 0 rather than a valid
> collation OID.

Yeah, it might be better to explain that.  BTW, pg_index.indcollation's
description lacks a note about this too, so maybe, we should fix both.
OTOH, pg_attribute.attcollation's description mentions it:

       The defined collation of the column, or zero if the column is
       not of a collatable data type.

It might be a good idea to update partcollation's and indcollation's
description along the same lines.

> And this is definitely not OK:
> 
> rhaas=# select * from pg_depend where objid = 16420;
>  classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
> ---------+-------+----------+------------+----------+-------------+---------
>     1259 | 16420 |        0 |       2615 |     2200 |           0 | n
>     1259 | 16420 |        0 |       3456 |        0 |           0 | n
> (2 rows)
> 
> We shouldn't be storing a dependency on non-existing collation 0.
>
> I'm not sure whether the bug here is that we should have a valid
> collation OID there rather than 0, or whether the bug is that we
> shouldn't be recording a dependency on anything other than a real
> collation OID, but something about this is definitely not right.

I think we can call it a bug of StorePartitionKey().  I looked at the
similar code in index_create() (which actually I had originally looked at
for reference when writing the partitioning code in question) and looks
like it doesn't store the dependency for collation 0 and for the default
collation of the database.  I think the partitioning code should do the
same.  Attached find a patch for the same (which also updates the
documentation as mentioned above); with the patch:

create table p (a int) partition by range (a);
select partcollation from pg_partitioned_table;
 partcollation
---------------
 0
(1 row)


-- no collation dependency stored for invalid collation
select * from pg_depend where objid = 'p'::regclass;
 classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
---------+-------+----------+------------+----------+-------------+---------
    1259 | 16434 |        0 |       2615 |     2200 |           0 | n
(1 row)


create table p (a text) partition by range (a);
select partcollation from pg_partitioned_table;
 partcollation
---------------
 100
(1 row)

-- no collation dependency stored for the default collation
select * from pg_depend where objid = 'p'::regclass;
 classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
---------+-------+----------+------------+----------+-------------+---------
    1259 | 16407 |        0 |       2615 |     2200 |           0 | n
(1 row)


create table p (a text) partition by range (a collate "en_US");
select partcollation from pg_partitioned_table;
 partcollation
---------------
 12513
(1 row)

-- dependency on non-default collation is stored
select * from pg_depend where objid = 'p'::regclass;
 classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
---------+-------+----------+------------+----------+-------------+---------
    1259 | 16410 |        0 |       2615 |     2200 |           0 | n
    1259 | 16410 |        0 |       3456 |    12513 |           0 | n
(2 rows)


BTW, the places which check whether the collation to store a dependency
for is the database default collation don't need to do that.  I mean the
following code block in all of these places:

       /* The default collation is pinned, so don't bother recording it */
       if (OidIsValid(attr->attcollation) &&
           attr->attcollation != DEFAULT_COLLATION_OID)
       {
           referenced.classId = CollationRelationId;
           referenced.objectId = attr->attcollation;
           referenced.objectSubId = 0;
           recordDependencyOn(&myself, &referenced, DEPENDENCY_NORMAL);
       }

That's because the default collation is pinned and the dependency code
checks isObjectPinned() and does not create pg_depend entries if so.
Those places are:

AddNewAttributeTuples
StorePartitionKey
index_create
GenerateTypeDependencies
add_column_collation_dependency

Removing that check still passes `make check-world`.

Thanks,
Amit

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] sketchy partcollation handling

From
Robert Haas
Date:
On Sun, Jun 4, 2017 at 10:18 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> I think we can call it a bug of StorePartitionKey().  I looked at the
> similar code in index_create() (which actually I had originally looked at
> for reference when writing the partitioning code in question) and looks
> like it doesn't store the dependency for collation 0 and for the default
> collation of the database.  I think the partitioning code should do the
> same.  Attached find a patch for the same (which also updates the
> documentation as mentioned above); with the patch:

Thanks.  Committed.

> BTW, the places which check whether the collation to store a dependency
> for is the database default collation don't need to do that.  I mean the
> following code block in all of these places:
>
>        /* The default collation is pinned, so don't bother recording it */
>        if (OidIsValid(attr->attcollation) &&
>            attr->attcollation != DEFAULT_COLLATION_OID)
>        {
>            referenced.classId = CollationRelationId;
>            referenced.objectId = attr->attcollation;
>            referenced.objectSubId = 0;
>            recordDependencyOn(&myself, &referenced, DEPENDENCY_NORMAL);
>        }
>
> That's because the default collation is pinned and the dependency code
> checks isObjectPinned() and does not create pg_depend entries if so.
> Those places are:
>
> AddNewAttributeTuples
> StorePartitionKey
> index_create
> GenerateTypeDependencies
> add_column_collation_dependency

We could go change them all, but I guess I don't particularly see the point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sketchy partcollation handling

From
Kevin Hale Boyes
Date:


On 6 June 2017 at 09:19, Robert Haas <robertmhaas@gmail.com> wrote:

Thanks.  Committed.

The changes to catalogs.sgml has introduced a double "the" in this part of the sentence "this contains the OID of the the collation".
The other section already had the double "the".

Re: [HACKERS] sketchy partcollation handling

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Jun 4, 2017 at 10:18 PM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> BTW, the places which check whether the collation to store a dependency
>> for is the database default collation don't need to do that.  I mean the
>> following code block in all of these places:
>> 
>> /* The default collation is pinned, so don't bother recording it */
>> if (OidIsValid(attr->attcollation) &&
>> attr->attcollation != DEFAULT_COLLATION_OID)

> We could go change them all, but I guess I don't particularly see the point.

That's an intentional measure to save the catalog activity involved in
finding out that the default collation is pinned.  It's not *necessary*,
sure, but it's a useful and easy optimization.
        regards, tom lane



Re: [HACKERS] sketchy partcollation handling

From
Robert Haas
Date:
On Tue, Jun 6, 2017 at 11:37 AM, Kevin Hale Boyes <kcboyes@gmail.com> wrote:
> On 6 June 2017 at 09:19, Robert Haas <robertmhaas@gmail.com> wrote:
>> Thanks.  Committed.
>
> The changes to catalogs.sgml has introduced a double "the" in this part of
> the sentence "this contains the OID of the the collation".
> The other section already had the double "the".

Well, that's not a good thing for the
the patch to have done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sketchy partcollation handling

From
Robert Haas
Date:
On Tue, Jun 6, 2017 at 11:37 AM, Kevin Hale Boyes <kcboyes@gmail.com> wrote:
> On 6 June 2017 at 09:19, Robert Haas <robertmhaas@gmail.com> wrote:
>> Thanks.  Committed.
>
> The changes to catalogs.sgml has introduced a double "the" in this part of
> the sentence "this contains the OID of the the collation".
> The other section already had the double "the".

Uggh!  It was just pointed out to me off-list that I credited the
wrong Kevin in the commit message for this fix.  My apologies.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sketchy partcollation handling

From
Amit Langote
Date:
On 2017/06/07 1:08, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sun, Jun 4, 2017 at 10:18 PM, Amit Langote
>> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>> BTW, the places which check whether the collation to store a dependency
>>> for is the database default collation don't need to do that.  I mean the
>>> following code block in all of these places:
>>>
>>> /* The default collation is pinned, so don't bother recording it */
>>> if (OidIsValid(attr->attcollation) &&
>>> attr->attcollation != DEFAULT_COLLATION_OID)
> 
>> We could go change them all, but I guess I don't particularly see the point.
> 
> That's an intentional measure to save the catalog activity involved in
> finding out that the default collation is pinned.  It's not *necessary*,
> sure, but it's a useful and easy optimization.

I see.  Thanks for explaining.

Regards,
Amit




Re: [HACKERS] sketchy partcollation handling

From
Amit Langote
Date:
On 2017/06/07 0:19, Robert Haas wrote:
> On Sun, Jun 4, 2017 at 10:18 PM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> I think we can call it a bug of StorePartitionKey().  I looked at the
>> similar code in index_create() (which actually I had originally looked at
>> for reference when writing the partitioning code in question) and looks
>> like it doesn't store the dependency for collation 0 and for the default
>> collation of the database.  I think the partitioning code should do the
>> same.  Attached find a patch for the same (which also updates the
>> documentation as mentioned above); with the patch:
> 
> Thanks.  Committed.

Thank you.

Regards,
Amit