Thread: remove pg_class.relhaspkey

remove pg_class.relhaspkey

From
Peter Eisentraut
Date:
pg_class.relhaspkey doesn't seem to be used or useful for anything, so
can we remove it?  See attached patch.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: remove pg_class.relhaspkey

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> pg_class.relhaspkey doesn't seem to be used or useful for anything, so
> can we remove it?  See attached patch.

We've discussed that at least twice before, and not pulled the trigger
for fear of breaking client code.

https://www.postgresql.org/message-id/flat/CAA-aLv7sszmU%2BFz-xLo6cOiASUiX0mCRwAMF2FB%3D2j-5MKqb%2BA%40mail.gmail.com

https://www.postgresql.org/message-id/flat/20140317185255.20724.49675%40wrigleys.postgresql.org

Not sure that the situation has changed any since then ... it still
comes down to whether we want an API break for client code.

            regards, tom lane


Re: remove pg_class.relhaspkey

From
Michael Paquier
Date:
On Sat, Feb 24, 2018 at 10:21:44PM -0500, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> > pg_class.relhaspkey doesn't seem to be used or useful for anything, so
> > can we remove it?  See attached patch.
>
> We've discussed that at least twice before, and not pulled the trigger
> for fear of breaking client code.
>
> https://www.postgresql.org/message-id/flat/CAA-aLv7sszmU%2BFz-xLo6cOiASUiX0mCRwAMF2FB%3D2j-5MKqb%2BA%40mail.gmail.com
> https://www.postgresql.org/message-id/flat/20140317185255.20724.49675%40wrigleys.postgresql.org
>
> Not sure that the situation has changed any since then ... it still
> comes down to whether we want an API break for client code.

I would be of the opinion to drop them.  Even if some client code rely
on this information, it means that they also rely on some information
which may be incorrect if those are not updated when they should be for
certain DDL combinations.  And in my opinion that's bad.  It seems to me
that if we would want to do some cleanup, this should happen for all the
flags that are updated in a lazy fashion.  For most of them, it would be
easy enough to replace them by subqueries which scan other catalog
information for correct and actually up-to-date information.  So in this
category enter as well relhassubclass, relhastriggers and relhasrules.

Speaking of which, I have looked at where relhaspkey is being used.  And
there are a couple of things using it:
- Greenplum has a consistency checker tool using it.
- https://github.com/no0p/pgsampler
- https://searchcode.com/codesearch/view/54937539/
- http://codegist.net/code/postgres%20drop%20tables/
-
https://hackage.haskell.org/package/relational-schemas-0.1.3.4/src/src/Database/Relational/Schema/PgCatalog/PgClass.hs
So the answer is yes, there would be code breakages if those get
removed.  Still I would imagine that such applications ought to use
subquery patterns and not rely on the existing flags, as this causes in
the worst case extra round trips between the server and the client as
the client is unsure if the information is up-to-date or not.
--
Michael

Attachment

Re: remove pg_class.relhaspkey

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> On Sat, Feb 24, 2018 at 10:21:44PM -0500, Tom Lane wrote:
>> We've discussed that at least twice before, and not pulled the trigger
>> for fear of breaking client code.

> Speaking of which, I have looked at where relhaspkey is being used.  And
> there are a couple of things using it:
> - Greenplum has a consistency checker tool using it.
> - https://github.com/no0p/pgsampler
> - https://searchcode.com/codesearch/view/54937539/
> - http://codegist.net/code/postgres%20drop%20tables/
> -
https://hackage.haskell.org/package/relational-schemas-0.1.3.4/src/src/Database/Relational/Schema/PgCatalog/PgClass.hs

Thanks for poking around.  Did you happen to notice how many of these
clients are taking pains to deal with the existing inaccuracy of
relhaspkey (ie, that it remains set after the pkey has been dropped)?

I think there's possibly an argument that relhaspkey should be dropped
because it's an attractive nuisance, encouraging clients to assume
more than they should about what it means.  But we don't have a lot
of evidence for such an argument right now.

            regards, tom lane


Re: remove pg_class.relhaspkey

From
Michael Paquier
Date:
On Mon, Feb 26, 2018 at 12:45:48AM -0500, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
> > On Sat, Feb 24, 2018 at 10:21:44PM -0500, Tom Lane wrote:
> >> We've discussed that at least twice before, and not pulled the trigger
> >> for fear of breaking client code.
>
> > Speaking of which, I have looked at where relhaspkey is being used.  And
> > there are a couple of things using it:
> > - Greenplum has a consistency checker tool using it.
> > - https://github.com/no0p/pgsampler
> > - https://searchcode.com/codesearch/view/54937539/
> > - http://codegist.net/code/postgres%20drop%20tables/
> > -
https://hackage.haskell.org/package/relational-schemas-0.1.3.4/src/src/Database/Relational/Schema/PgCatalog/PgClass.hs
>
> Thanks for poking around.  Did you happen to notice how many of these
> clients are taking pains to deal with the existing inaccuracy of
> relhaspkey (ie, that it remains set after the pkey has been dropped)?

As far as I looked at things.  Those clients rely on how optimistic
relhaspkey is.  In short, if it is set to true, there can be primary
key definitions.  If set to false, then it is sure that no primary key
definition can be found.  If the flag is true, then those clients just
do an extra lookup on pg_index with indisprimary.  I think that this
just complicates the code involved though.  If looking for primary keys
it is way better to just scan directly pg_index.

> I think there's possibly an argument that relhaspkey should be dropped
> because it's an attractive nuisance, encouraging clients to assume
> more than they should about what it means.  But we don't have a lot
> of evidence for such an argument right now.

The only breakage I could imagine here is an application which thinks
relhaspkey set to true implies that a primary key *has* to be present.
I have to admit that I have not found such a case.  Still I would not be
surprised if there are such applications unaware of being broken,
particularly plpgsql functions.
--
Michael

Attachment

Re: remove pg_class.relhaspkey

From
Robert Haas
Date:
On Mon, Feb 26, 2018 at 12:36 AM, Michael Paquier <michael@paquier.xyz> wrote:
> I would be of the opinion to drop them.

+1.  On this point, I am in agreement with the gentleman who wrote
http://postgr.es/m/7903.1310671199@sss.pgh.pa.us

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: remove pg_class.relhaspkey

From
Tomas Vondra
Date:

On 02/26/2018 07:23 AM, Michael Paquier wrote:
> On Mon, Feb 26, 2018 at 12:45:48AM -0500, Tom Lane wrote:
>> Michael Paquier <michael@paquier.xyz> writes:
>>> On Sat, Feb 24, 2018 at 10:21:44PM -0500, Tom Lane wrote:
>>>> We've discussed that at least twice before, and not pulled the trigger
>>>> for fear of breaking client code.
>>
>>> Speaking of which, I have looked at where relhaspkey is being used.  And
>>> there are a couple of things using it:
>>> - Greenplum has a consistency checker tool using it.
>>> - https://github.com/no0p/pgsampler
>>> - https://searchcode.com/codesearch/view/54937539/
>>> - http://codegist.net/code/postgres%20drop%20tables/
>>> -
https://hackage.haskell.org/package/relational-schemas-0.1.3.4/src/src/Database/Relational/Schema/PgCatalog/PgClass.hs
>>
>> Thanks for poking around.  Did you happen to notice how many of these
>> clients are taking pains to deal with the existing inaccuracy of
>> relhaspkey (ie, that it remains set after the pkey has been dropped)?
> 
> As far as I looked at things.  Those clients rely on how optimistic
> relhaspkey is.  In short, if it is set to true, there can be primary
> key definitions.  If set to false, then it is sure that no primary key
> definition can be found.  If the flag is true, then those clients just
> do an extra lookup on pg_index with indisprimary.  I think that this
> just complicates the code involved though.  If looking for primary keys
> it is way better to just scan directly pg_index.
> 
>> I think there's possibly an argument that relhaspkey should be dropped
>> because it's an attractive nuisance, encouraging clients to assume
>> more than they should about what it means.  But we don't have a lot
>> of evidence for such an argument right now.
> 
> The only breakage I could imagine here is an application which
> thinks relhaspkey set to true implies that a primary key *has* to be
> present. I have to admit that I have not found such a case. Still I
> would not be surprised if there are such applications unaware of
> being broken, particularly plpgsql functions.

I agree with this sentiment - I don't think those flags are particularly
helpful for client applications, and would vote +1 for removal.

Even if the application handles them correctly (i.e. rechecks existence
when relhaspkey=true), the assumption is that this saves a measurable
amount of work. I'm not so sure about that, considering pretty much all
tables have both primary keys and indexes. OTOH it certainly does make
the code more complicated.

For internal usage it might have made a difference back when those flags
were introduced, but the relcache should deal with this efficiently now,
as pointed out in [1]. But as pointed out, we're not using relhaspkey
internally at all. So +1 to get rid of it.

For the other flags we would probably need to test what impact would it
have (e.g. table with no indexes, many indexes on other tables, and
something calling get_relation_info a lot). But this patch proposes to
remove only relhaspkey.


[1]
https://www.postgresql.org/message-id/CA%2BTgmoYJu24Y8uUAJ4zeQAhoYxLmFxcy%2B5Hdij9ehjoxKo3j3g%40mail.gmail.com


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: remove pg_class.relhaspkey

From
Michael Paquier
Date:
On Sat, Mar 10, 2018 at 01:52:56PM +0100, Tomas Vondra wrote:
> I agree with this sentiment - I don't think those flags are particularly
> helpful for client applications, and would vote +1 for removal.

OK, so I can see that we are moving to a consensus here.

> For the other flags we would probably need to test what impact would it
> have (e.g. table with no indexes, many indexes on other tables, and
> something calling get_relation_info a lot). But this patch proposes to
> remove only relhaspkey.

Yes, you are right here.  Peter, would you do that within this commit
fest or not?  As we are half-way through the commit fest, we could also
cut the apple in half and just remove relhaspkey for now as that's a
no-brainer.  So I would suggest to just do the latter and consider this
patch as done.

Attached is a rebased patch, there were some conflicts in pg_class.h by
the way.
--
Michael

Attachment

Re: remove pg_class.relhaspkey

From
Peter Eisentraut
Date:
On 3/14/18 01:47, Michael Paquier wrote:
>> For the other flags we would probably need to test what impact would it
>> have (e.g. table with no indexes, many indexes on other tables, and
>> something calling get_relation_info a lot). But this patch proposes to
>> remove only relhaspkey.
> 
> Yes, you are right here.  Peter, would you do that within this commit
> fest or not?

Not terribly interested in those other ones.

> As we are half-way through the commit fest, we could also
> cut the apple in half and just remove relhaspkey for now as that's a
> no-brainer.  So I would suggest to just do the latter and consider this
> patch as done.

done

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: remove pg_class.relhaspkey

From
Michael Paquier
Date:
On Wed, Mar 14, 2018 at 03:36:28PM -0400, Peter Eisentraut wrote:
> done

Thanks Peter.  One done, 150 remaining.
--
Michael

Attachment