Thread: [HACKERS] pg_class.relpartbound definition overly brittle

[HACKERS] pg_class.relpartbound definition overly brittle

From

Mark Dilger

Date:

01 June 2017, 01:40:22

Hackers,

recent changes have introduced the :location field to the partboundspec
in pg_catalog.pg_class.  This means that if two identical tables with
identical partitioning scheme are created, but one is done before a change
to gram.y, and the other after a change to gram.y, the relpartbound fields
for those two tables could show up as different.

Can we perhaps remove the :location field here?  If not, could somebody
please defend why this belongs in the catalog entries for the table?  Sorry
if I am missing something obvious...

Thanks,

Mark Dilger

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Robert Haas

Date:

01 June 2017, 01:50:39

On Wed, May 31, 2017 at 3:40 PM, Mark Dilger <hornschnorter@gmail.com> wrote:
> recent changes have introduced the :location field to the partboundspec
> in pg_catalog.pg_class.  This means that if two identical tables with
> identical partitioning scheme are created, but one is done before a change
> to gram.y, and the other after a change to gram.y, the relpartbound fields
> for those two tables could show up as different.
>
> Can we perhaps remove the :location field here?  If not, could somebody
> please defend why this belongs in the catalog entries for the table?  Sorry
> if I am missing something obvious...

Yeah, that probably wasn't a good decision, although calling a
decision might be giving it too much credit.  I think the easiest
thing to do here would be to change transformPartitionBound() to set
result_spec->location to -1, although maybe a better idea would be to
have two different structures -- one that represents the partition
bound specification *before* parse analysis and another that
represents it *after* parse analysis, rather than reusing the same
structure for both.  Then again, maybe making two different node types
for this is overkill.  Not sure.  Opinions?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Alvaro Herrera

Date:

01 June 2017, 03:49:38

Mark Dilger wrote:
> Hackers,
> 
> recent changes have introduced the :location field to the partboundspec
> in pg_catalog.pg_class.  This means that if two identical tables with
> identical partitioning scheme are created, but one is done before a change
> to gram.y, and the other after a change to gram.y, the relpartbound fields
> for those two tables could show up as different.

Actually, if you look at equalfuncs.c, you'll note that locations are
not really compared:

/* Compare a parse location field (this is a no-op, per note above) */
#define COMPARE_LOCATION_FIELD(fldname) \((void) 0)

where the referenced note is:
* NOTE: it is intentional that parse location fields (in nodes that have* one) are not compared.  This is because we
want,for example, a variable* "x" to be considered equal() to another reference to "x" in the query.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Mark Dilger

Date:

01 June 2017, 04:06:06

> On May 31, 2017, at 2:49 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> Mark Dilger wrote:
>> Hackers,
>>
>> recent changes have introduced the :location field to the partboundspec
>> in pg_catalog.pg_class.  This means that if two identical tables with
>> identical partitioning scheme are created, but one is done before a change
>> to gram.y, and the other after a change to gram.y, the relpartbound fields
>> for those two tables could show up as different.
>
> Actually, if you look at equalfuncs.c, you'll note that locations are
> not really compared:
>
> /* Compare a parse location field (this is a no-op, per note above) */
> #define COMPARE_LOCATION_FIELD(fldname) \
>     ((void) 0)
>
> where the referenced note is:
>
> * NOTE: it is intentional that parse location fields (in nodes that have
> * one) are not compared.  This is because we want, for example, a variable
> * "x" to be considered equal() to another reference to "x" in the query.

That's cold comfort, given that most users will be looking at the pg_class
table and not writing C code that compares Node objects.  I wrote a bit of
regression test logic that checks, and sure enough the relpartbound field
shows up as unequal:
              relpartbound                                                -------------------------------------------- 
SELECT a.relpartbound, b.relpartbound, a.relpartbound = b.relpartbound, a.relpartbound::text = b.relpartbound::text
FROMpg_class a, pg_class b   WHERE a.relname = 'acct_partitioned_1'     AND b.relname = 'acct_partitioned_2';
                                                                                                       relpartbound
                                                                                                              |
                                                                                                          relpartbound
                                                                                                                 |
?column?| ?column? 

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+----------{PARTITIONBOUNDSPEC
:strategyl :listdatums ({CONST :consttype 23000 :consttypmod -1 :constcollid 0 :constlen 2 :constbyval true
:constisnullfalse :location -1 :constvalue 2 [ 0 0 0 0 0 0 0 0 ]}) :lowerdatums <> :upperdatums <> :location 82} |
{PARTITIONBOUNDSPEC:strategy l :listdatums ({CONST :consttype 23000 :consttypmod -1 :constcollid 0 :constlen 2
:constbyvaltrue :constisnull false :location -1 :constvalue 2 [ 0 0 0 0 0 0 0 0 ]}) :lowerdatums <> :upperdatums <>
:location73} | f        | f       
(1 row)

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Andres Freund

Date:

01 June 2017, 04:17:45

On 2017-05-31 15:06:06 -0700, Mark Dilger wrote:
> That's cold comfort, given that most users will be looking at the pg_class
> table and not writing C code that compares Node objects.  I wrote a bit of
> regression test logic that checks, and sure enough the relpartbound field
> shows up as unequal:
>
relpartbound                                               --------------------------------------------
 
> SELECT a.relpartbound, b.relpartbound, a.relpartbound = b.relpartbound, a.relpartbound::text = b.relpartbound::text
>     FROM pg_class a, pg_class b
>     WHERE a.relname = 'acct_partitioned_1'
>       AND b.relname = 'acct_partitioned_2';
>
relpartbound
      |
  relpartbound
         | ?column? | ?column?
 
>
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+----------
>  {PARTITIONBOUNDSPEC :strategy l :listdatums ({CONST :consttype 23000 :consttypmod -1 :constcollid 0 :constlen 2
:constbyvaltrue :constisnull false :location -1 :constvalue 2 [ 0 0 0 0 0 0 0 0 ]}) :lowerdatums <> :upperdatums <>
:location82} | {PARTITIONBOUNDSPEC :strategy l :listdatums ({CONST :consttype 23000 :consttypmod -1 :constcollid 0
:constlen2 :constbyval true :constisnull false :location -1 :constvalue 2 [ 0 0 0 0 0 0 0 0 ]}) :lowerdatums <>
:upperdatums<> :location 73} | f        | f      
 
> (1 row)

Normal users aren't going to make sense of node trees in the first
place.  You should use pg_get_expr for it:
postgres[3008][1]=# SELECT pg_get_expr(relpartbound, oid) FROM pg_class WHERE relpartbound IS NOT NULL;
┌──────────────────────┐
│     pg_get_expr      │
├──────────────────────┤
│ FOR VALUES IN (1, 2) │
└──────────────────────┘
(1 row)

- Andres

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Mark Dilger

Date:

01 June 2017, 04:38:58

> On May 31, 2017, at 3:17 PM, Andres Freund <andres@anarazel.de> wrote:
>
> On 2017-05-31 15:06:06 -0700, Mark Dilger wrote:
>> That's cold comfort, given that most users will be looking at the pg_class
>> table and not writing C code that compares Node objects.  I wrote a bit of
>> regression test logic that checks, and sure enough the relpartbound field
>> shows up as unequal:
>>
relpartbound                                               -------------------------------------------- 
>> SELECT a.relpartbound, b.relpartbound, a.relpartbound = b.relpartbound, a.relpartbound::text = b.relpartbound::text
>>    FROM pg_class a, pg_class b
>>    WHERE a.relname = 'acct_partitioned_1'
>>      AND b.relname = 'acct_partitioned_2';
>>
relpartbound                                                            |
                                                                    relpartbound
                  | ?column? | ?column? 
>>
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+----------
>> {PARTITIONBOUNDSPEC :strategy l :listdatums ({CONST :consttype 23000 :consttypmod -1 :constcollid 0 :constlen 2
:constbyvaltrue :constisnull false :location -1 :constvalue 2 [ 0 0 0 0 0 0 0 0 ]}) :lowerdatums <> :upperdatums <>
:location82} | {PARTITIONBOUNDSPEC :strategy l :listdatums ({CONST :consttype 23000 :consttypmod -1 :constcollid 0
:constlen2 :constbyval true :constisnull false :location -1 :constvalue 2 [ 0 0 0 0 0 0 0 0 ]}) :lowerdatums <>
:upperdatums<> :location 73} | f        | f       
>> (1 row)
>
> Normal users aren't going to make sense of node trees in the first
> place.  You should use pg_get_expr for it:
> postgres[3008][1]=# SELECT pg_get_expr(relpartbound, oid) FROM pg_class WHERE relpartbound IS NOT NULL;
> ┌──────────────────────┐
> │     pg_get_expr      │
> ├──────────────────────┤
> │ FOR VALUES IN (1, 2) │
> └──────────────────────┘
> (1 row)

I concede that mitigates the problem somewhat, though I still think a user may look
at pg_class, see there is a column that appears to show the partition boundaries,
and then decide to check whether two tables have the same partition boundaries
by comparing those fields, without passing them first through pg_get_expr(), a
function they may never have heard of.

To me, it seems odd to immortalize a SQL parsing field in the catalog definition of
the relation, but perhaps that's just my peculiar sensibilities.  If the community is more
on your side, I'm not going to argue it.

Mark Dilger

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Andres Freund

Date:

01 June 2017, 04:42:47

On 2017-05-31 15:38:58 -0700, Mark Dilger wrote:
> > Normal users aren't going to make sense of node trees in the first
> > place.  You should use pg_get_expr for it:
> > postgres[3008][1]=# SELECT pg_get_expr(relpartbound, oid) FROM pg_class WHERE relpartbound IS NOT NULL;
> > ┌──────────────────────┐
> > │     pg_get_expr      │
> > ├──────────────────────┤
> > │ FOR VALUES IN (1, 2) │
> > └──────────────────────┘
> > (1 row)
> 
> I concede that mitigates the problem somewhat, though I still think a user may look
> at pg_class, see there is a column that appears to show the partition boundaries,
> and then decide to check whether two tables have the same partition boundaries
> by comparing those fields, without passing them first through pg_get_expr(), a
> function they may never have heard of.
> 
> To me, it seems odd to immortalize a SQL parsing field in the catalog definition of
> the relation, but perhaps that's just my peculiar sensibilities.  If the community is more
> on your side, I'm not going to argue it.

Given that various other node trees stored in the catalog also have
location fields, about which I recall no complaints, I don't think this
is a significant issue.  Nor something that I think makes sense to solve
in isolation, without tackling all stored expressions.

- Andres

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Mark Dilger

Date:

01 June 2017, 04:46:06

> On May 31, 2017, at 3:42 PM, Andres Freund <andres@anarazel.de> wrote:
>
> On 2017-05-31 15:38:58 -0700, Mark Dilger wrote:
>>> Normal users aren't going to make sense of node trees in the first
>>> place.  You should use pg_get_expr for it:
>>> postgres[3008][1]=# SELECT pg_get_expr(relpartbound, oid) FROM pg_class WHERE relpartbound IS NOT NULL;
>>> ┌──────────────────────┐
>>> │     pg_get_expr      │
>>> ├──────────────────────┤
>>> │ FOR VALUES IN (1, 2) │
>>> └──────────────────────┘
>>> (1 row)
>>
>> I concede that mitigates the problem somewhat, though I still think a user may look
>> at pg_class, see there is a column that appears to show the partition boundaries,
>> and then decide to check whether two tables have the same partition boundaries
>> by comparing those fields, without passing them first through pg_get_expr(), a
>> function they may never have heard of.
>>
>> To me, it seems odd to immortalize a SQL parsing field in the catalog definition of
>> the relation, but perhaps that's just my peculiar sensibilities.  If the community is more
>> on your side, I'm not going to argue it.
>
> Given that various other node trees stored in the catalog also have
> location fields, about which I recall no complaints, I don't think this
> is a significant issue.  Nor something that I think makes sense to solve
> in isolation, without tackling all stored expressions.

Ok.  Just to clarify, I didn't come up with this problem as part of some general
code review.  I merged the recent changes into my tree, and my regression
tests broke.  That's fine; that's what they are there to do.  Break, and in so doing,
alert me to the fact that the project has changed things that will require me to
make modifications.  It just seemed strange to me that I should be changing stuff
to try to get the :location field to be equal in my code to the :location field in the
public sources, since it seems to serve no purpose.

Mark Dilger

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Alvaro Herrera

Date:

01 June 2017, 05:00:46

Mark Dilger wrote:

> >>
relpartbound                                                            |
                                                                    relpartbound
                  | ?column? | ?column?
 
> >>
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+----------
> >> {PARTITIONBOUNDSPEC :strategy l :listdatums ({CONST :consttype 23000 :consttypmod -1 :constcollid 0 :constlen 2
:constbyvaltrue :constisnull false :location -1 :constvalue 2 [ 0 0 0 0 0 0 0 0 ]}) :lowerdatums <> :upperdatums <>
:location82} | {PARTITIONBOUNDSPEC :strategy l :listdatums ({CONST :consttype 23000 :consttypmod -1 :constcollid 0
:constlen2 :constbyval true :constisnull false :location -1 :constvalue 2 [ 0 0 0 0 0 0 0 0 ]}) :lowerdatums <>
:upperdatums<> :location 73} | f        | f      
 
> >> (1 row)

> I concede that mitigates the problem somewhat, though I still think a user may look
> at pg_class, see there is a column that appears to show the partition boundaries,
> and then decide to check whether two tables have the same partition boundaries
> by comparing those fields, without passing them first through pg_get_expr(), a
> function they may never have heard of.

How do you expect that the used would actually interpret the above
weird-looking value?  There's no way to figure out what operations are
being done in that ugly blob of text.

I suspect it would be better to reduce the location field to -1 as
proposed, though, since the location has no actual meaning once the node
is stored standalone rather than as part of a larger command.  However
it's clear that we're not consistent about doing this elsewhere.

> To me, it seems odd to immortalize a SQL parsing field in the catalog definition of
> the relation, but perhaps that's just my peculiar sensibilities.  If the community is more
> on your side, I'm not going to argue it.


-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Mark Dilger

Date:

01 June 2017, 05:11:58

> On May 31, 2017, at 4:00 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> Mark Dilger wrote:
>
>>>>
relpartbound                                                            |
                                                                    relpartbound
                  | ?column? | ?column? 
>>>>
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+----------
>>>> {PARTITIONBOUNDSPEC :strategy l :listdatums ({CONST :consttype 23000 :consttypmod -1 :constcollid 0 :constlen 2
:constbyvaltrue :constisnull false :location -1 :constvalue 2 [ 0 0 0 0 0 0 0 0 ]}) :lowerdatums <> :upperdatums <>
:location82} | {PARTITIONBOUNDSPEC :strategy l :listdatums ({CONST :consttype 23000 :consttypmod -1 :constcollid 0
:constlen2 :constbyval true :constisnull false :location -1 :constvalue 2 [ 0 0 0 0 0 0 0 0 ]}) :lowerdatums <>
:upperdatums<> :location 73} | f        | f       
>>>> (1 row)
>
>> I concede that mitigates the problem somewhat, though I still think a user may look
>> at pg_class, see there is a column that appears to show the partition boundaries,
>> and then decide to check whether two tables have the same partition boundaries
>> by comparing those fields, without passing them first through pg_get_expr(), a
>> function they may never have heard of.
>
> How do you expect that the used would actually interpret the above
> weird-looking value?  There's no way to figure out what operations are
> being done in that ugly blob of text.

I don't know how the average user would use it.  I can only tell you what
I am doing, which may be on the fringe of what users do typically.

I have modified the system to add a number of catalog tables not normally
present in postgres.  A few of those catalog tables are partitioned.  Since
they have to be set up at bootstrap time, I can't use the CREATE TABLE
syntax in some src/backend/catalog/*.sql file to create them, I have to
create them with header files, genbki.pl and friends.  There is no support
for this, so I created my own.  That puts me at risk of getting out of sync
with the public sources implementation of partitioning.  As such, I have
regression tests that create at run time partitioned tables that have all the
same columns and boundaries as my catalog tables, and I compare the
pg_class entries against each other to make sure there are no inconsistencies.
When you guys commit changes that impact partitioning, I notice, and change
my code to match.  But in this case, it seemed to me the change that got
committed was not thought through, and it might benefit the community for
me to point it out, rather than quietly make my code behave the same as
what got committed.

I doubt too many people will follow in my footsteps here, but other people
may hit this :location peculiarity for other reasons.

Once again, this is just to give you context about why I said anything in the
first place.

> I suspect it would be better to reduce the location field to -1 as
> proposed, though, since the location has no actual meaning once the node
> is stored standalone rather than as part of a larger command.  However
> it's clear that we're not consistent about doing this elsewhere.

I have no opinion about whether this should be done for 10.0, given the
release schedule.  Changing it for 10.0 or 11.0 seems reasonable to me.

Mark Dilger

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Amit Langote

Date:

01 June 2017, 07:32:35

On 2017/06/01 4:50, Robert Haas wrote:
> On Wed, May 31, 2017 at 3:40 PM, Mark Dilger <hornschnorter@gmail.com> wrote:
>> recent changes have introduced the :location field to the partboundspec
>> in pg_catalog.pg_class.  This means that if two identical tables with
>> identical partitioning scheme are created, but one is done before a change
>> to gram.y, and the other after a change to gram.y, the relpartbound fields
>> for those two tables could show up as different.
>>
>> Can we perhaps remove the :location field here?  If not, could somebody
>> please defend why this belongs in the catalog entries for the table?  Sorry
>> if I am missing something obvious...

I now think it's kind of annoying too, although not exactly unprecedented
as others have pointed out.  As you well might know, the location field in
the parse node is to position the error cursor at the correct point in the
erring command text.  Although, the node structure that helps do that and
one that we need to store into the catalog do not have to be same, as
Robert suggests in his reply.  Doing that division will again have to
break the catalog though.

By the way, didn't you first have to come across that?  The commit
(80f583ffe93) that introduced location field to partboundspec also bumped
up the catalog version, so you would have to reinitialize the database
directory anyway.

> Yeah, that probably wasn't a good decision, although calling a
> decision might be giving it too much credit.  I think the easiest
> thing to do here would be to change transformPartitionBound() to set
> result_spec->location to -1, although maybe a better idea would be to
> have two different structures -- one that represents the partition
> bound specification *before* parse analysis and another that
> represents it *after* parse analysis, rather than reusing the same
> structure for both.  Then again, maybe making two different node types
> for this is overkill.  Not sure.  Opinions?

I find the two node structures idea interesting, though it would require
breaking the catalog again.

Thanks,
Amit

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Mark Dilger

Date:

01 June 2017, 08:13:11

> On May 31, 2017, at 6:32 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> On 2017/06/01 4:50, Robert Haas wrote:
>> On Wed, May 31, 2017 at 3:40 PM, Mark Dilger <hornschnorter@gmail.com> wrote:
>>> recent changes have introduced the :location field to the partboundspec
>>> in pg_catalog.pg_class.  This means that if two identical tables with
>>> identical partitioning scheme are created, but one is done before a change
>>> to gram.y, and the other after a change to gram.y, the relpartbound fields
>>> for those two tables could show up as different.
>>>
>>> Can we perhaps remove the :location field here?  If not, could somebody
>>> please defend why this belongs in the catalog entries for the table?  Sorry
>>> if I am missing something obvious...
>
> I now think it's kind of annoying too, although not exactly unprecedented
> as others have pointed out.  As you well might know, the location field in
> the parse node is to position the error cursor at the correct point in the
> erring command text.

I knew about the location field already, but not that it was already occurring
elsewhere in the catalogs.  I just never paid attention to any of the columns
that are doing so.  Alvaro's criticisms of my complaint were quite informative.
(Thanks, Alvaro.)  I think standardizing the location field to -1 at some point
in all such places would be a good idea, though I am not sufficiently qualified
to say if that should be in 10 or 11, nor whether doing so might cause
backwards compatibility issues, nor whether those would be too much cost
to justify the changes.

> By the way, didn't you first have to come across that?  The commit
> (80f583ffe93) that introduced location field to partboundspec also bumped
> up the catalog version, so you would have to reinitialize the database
> directory anyway.

I don't have my work in production yet.  Each time I merge changes into
my repository, I run "make check-world", and that of course re-initializes
the data directories for the test databases.

Mark Dilger

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Amit Langote

Date:

01 June 2017, 08:50:34

On 2017/06/01 11:13, Mark Dilger wrote:
>> On May 31, 2017, at 6:32 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>
>> On 2017/06/01 4:50, Robert Haas wrote:
>>> On Wed, May 31, 2017 at 3:40 PM, Mark Dilger <hornschnorter@gmail.com> wrote:
>>>> recent changes have introduced the :location field to the partboundspec
>>>> in pg_catalog.pg_class.  This means that if two identical tables with
>>>> identical partitioning scheme are created, but one is done before a change
>>>> to gram.y, and the other after a change to gram.y, the relpartbound fields
>>>> for those two tables could show up as different.
>>>>
>>>> Can we perhaps remove the :location field here?  If not, could somebody
>>>> please defend why this belongs in the catalog entries for the table?  Sorry
>>>> if I am missing something obvious...
>>
>> I now think it's kind of annoying too, although not exactly unprecedented
>> as others have pointed out.  As you well might know, the location field in
>> the parse node is to position the error cursor at the correct point in the
>> erring command text. 
> 
> I knew about the location field already, but not that it was already occurring
> elsewhere in the catalogs.  I just never paid attention to any of the columns
> that are doing so.  Alvaro's criticisms of my complaint were quite informative.
> (Thanks, Alvaro.)  I think standardizing the location field to -1 at some point
> in all such places would be a good idea, though I am not sufficiently qualified
> to say if that should be in 10 or 11, nor whether doing so might cause
> backwards compatibility issues, nor whether those would be too much cost
> to justify the changes.

Setting the location field of parse nodes to -1 before writing into the
catalog for *all* node types might defeat the purpose of writing the
actual value, which as written in the header comment of readfuncs.c is:
*    Parse location fields are written out by outfuncs.c, but only for*    possible debugging use.  When reading a
locationfield, we discard*    the stored value and set the location field to -1 (ie, "unknown").
 

So apparently, seeing the actual location value might be useful to some.
There may not be that many users who *do* use it for anything though.

Thanks,
Amit

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Tom Lane

Date:

01 June 2017, 19:31:19

Mark Dilger <hornschnorter@gmail.com> writes:
> When you guys commit changes that impact partitioning, I notice, and change
> my code to match.  But in this case, it seemed to me the change that got
> committed was not thought through, and it might benefit the community for
> me to point it out, rather than quietly make my code behave the same as
> what got committed.

Let me explain the project standards in words of one syllable: user code
should not examine the contents of node trees.  That's what pg_get_expr
is for.  There is not, never has been, and never will be any guarantee
that we won't whack those structures around in completely arbitrary ways,
as long as we do a catversion bump along with it.
        regards, tom lane

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Mark Dilger

Date:

01 June 2017, 19:51:29

> On Jun 1, 2017, at 6:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Mark Dilger <hornschnorter@gmail.com> writes:
>> When you guys commit changes that impact partitioning, I notice, and change
>> my code to match.  But in this case, it seemed to me the change that got
>> committed was not thought through, and it might benefit the community for
>> me to point it out, rather than quietly make my code behave the same as
>> what got committed.
>
> Let me explain the project standards in words of one syllable: user code
> should not examine the contents of node trees.  That's what pg_get_expr
> is for.  There is not, never has been, and never will be any guarantee
> that we won't whack those structures around in completely arbitrary ways,
> as long as we do a catversion bump along with it.

I have already dealt with this peculiarity and don't care much at this point, but
I think your characterization of my position is inaccurate:

I'm really not talking about examining the contents of node trees.  I'm only
talking about comparing two of them for equality, and getting false as an answer,
when they are actually equal.  Asking "Is A == B" is not "examining the contents",
it is asking the project supplied comparator function to examine the contents,
which is on par with asking the project supplied pg_get_expr function to do so.
There is currently only one production in gram.y in the public sources that
creates partitions, so you don't notice the relpartbound being dependent on
which grammar production defined the table.  I notice, and think it is an odd
thing to encode in the the relpartbound field.  Bumping the catversion is irrelevant
if you have two or more different productions in gram.y that give rise to these
field values.  Apparently, you don't care.  Fine by me.  It just seemed to me an
oversight when I first encountered it, rather than something anybody would
intentionally commit to the sources.

Mark Dilger

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Robert Haas

Date:

02 June 2017, 22:48:35

On Thu, Jun 1, 2017 at 9:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Let me explain the project standards in words of one syllable: user code
> should not examine the contents of node trees.  That's what pg_get_expr
> is for.  There is not, never has been, and never will be any guarantee
> that we won't whack those structures around in completely arbitrary ways,
> as long as we do a catversion bump along with it.

Many of those words have more than one syllable.

Also, you're attacking a straw man. Accidentally storing a meaningless
parse location in the catalog isn't a feature, and we shouldn't
pretend otherwise.  I agree that what Mark's doing is a bit unusual
and doesn't necessarily need to work, and he seems to agree with that,
too.  That doesn't mean that the status quo is some brilliant piece of
engineering.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Tom Lane

Date:

02 June 2017, 22:59:45

Robert Haas <robertmhaas@gmail.com> writes:
> Also, you're attacking a straw man. Accidentally storing a meaningless
> parse location in the catalog isn't a feature, and we shouldn't
> pretend otherwise.

It's not meaningless, and it is a feature, useful for debugging.
Otherwise you'd have a hard time telling one occurrence of e.g. "+" from
another when you're trying to make sense of a stored tree.  We worked out
all this behavior ages ago for other expression node trees that are stored
in the catalogs (default expressions, index expressions, check
constraints, etc etc) and I don't see a reason for partition expressions
to be different.

I agree that this means you can't just strcmp a couple of stored node tree
strings to decide if they're equal, but you couldn't anyway --- a moment's
perusal of equalfuncs.c will show you other special cases, each with their
own rationale.  Maybe you'd like to complain that every one of those
rationales is wrongheaded but I do not think you will get far.

I think the best advice for Mark is to look at pg_get_expr() output and
see if that matches.
        regards, tom lane

Re: [HACKERS] pg_class.relpartbound definition overly brittle

From

Mark Dilger

Date:

03 June 2017, 00:16:46

> On Jun 2, 2017, at 9:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> Robert Haas <robertmhaas@gmail.com> writes:
>> Also, you're attacking a straw man. Accidentally storing a meaningless
>> parse location in the catalog isn't a feature, and we shouldn't
>> pretend otherwise.
> 
> It's not meaningless, and it is a feature, useful for debugging.
> Otherwise you'd have a hard time telling one occurrence of e.g. "+" from
> another when you're trying to make sense of a stored tree.  We worked out
> all this behavior ages ago for other expression node trees that are stored
> in the catalogs (default expressions, index expressions, check
> constraints, etc etc) and I don't see a reason for partition expressions
> to be different.

Ok, that makes more sense now.  Thanks for clarifying the history of how
and why the location field from parsing is getting stored.

> I agree that this means you can't just strcmp a couple of stored node tree
> strings to decide if they're equal, but you couldn't anyway --- a moment's
> perusal of equalfuncs.c will show you other special cases, each with their
> own rationale.  Maybe you'd like to complain that every one of those
> rationales is wrongheaded but I do not think you will get far.
> 
> I think the best advice for Mark is to look at pg_get_expr() output and
> see if that matches.

Yes, I'm already doing that based on the discussion so far.

Mark Dilger