Thread: remove pg_class.relhaspkey
pg_class.relhaspkey doesn't seem to be used or useful for anything, so can we remove it? See attached patch. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > pg_class.relhaspkey doesn't seem to be used or useful for anything, so > can we remove it? See attached patch. We've discussed that at least twice before, and not pulled the trigger for fear of breaking client code. https://www.postgresql.org/message-id/flat/CAA-aLv7sszmU%2BFz-xLo6cOiASUiX0mCRwAMF2FB%3D2j-5MKqb%2BA%40mail.gmail.com https://www.postgresql.org/message-id/flat/20140317185255.20724.49675%40wrigleys.postgresql.org Not sure that the situation has changed any since then ... it still comes down to whether we want an API break for client code. regards, tom lane
On Sat, Feb 24, 2018 at 10:21:44PM -0500, Tom Lane wrote: > Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > > pg_class.relhaspkey doesn't seem to be used or useful for anything, so > > can we remove it? See attached patch. > > We've discussed that at least twice before, and not pulled the trigger > for fear of breaking client code. > > https://www.postgresql.org/message-id/flat/CAA-aLv7sszmU%2BFz-xLo6cOiASUiX0mCRwAMF2FB%3D2j-5MKqb%2BA%40mail.gmail.com > https://www.postgresql.org/message-id/flat/20140317185255.20724.49675%40wrigleys.postgresql.org > > Not sure that the situation has changed any since then ... it still > comes down to whether we want an API break for client code. I would be of the opinion to drop them. Even if some client code rely on this information, it means that they also rely on some information which may be incorrect if those are not updated when they should be for certain DDL combinations. And in my opinion that's bad. It seems to me that if we would want to do some cleanup, this should happen for all the flags that are updated in a lazy fashion. For most of them, it would be easy enough to replace them by subqueries which scan other catalog information for correct and actually up-to-date information. So in this category enter as well relhassubclass, relhastriggers and relhasrules. Speaking of which, I have looked at where relhaspkey is being used. And there are a couple of things using it: - Greenplum has a consistency checker tool using it. - https://github.com/no0p/pgsampler - https://searchcode.com/codesearch/view/54937539/ - http://codegist.net/code/postgres%20drop%20tables/ - https://hackage.haskell.org/package/relational-schemas-0.1.3.4/src/src/Database/Relational/Schema/PgCatalog/PgClass.hs So the answer is yes, there would be code breakages if those get removed. Still I would imagine that such applications ought to use subquery patterns and not rely on the existing flags, as this causes in the worst case extra round trips between the server and the client as the client is unsure if the information is up-to-date or not. -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > On Sat, Feb 24, 2018 at 10:21:44PM -0500, Tom Lane wrote: >> We've discussed that at least twice before, and not pulled the trigger >> for fear of breaking client code. > Speaking of which, I have looked at where relhaspkey is being used. And > there are a couple of things using it: > - Greenplum has a consistency checker tool using it. > - https://github.com/no0p/pgsampler > - https://searchcode.com/codesearch/view/54937539/ > - http://codegist.net/code/postgres%20drop%20tables/ > - https://hackage.haskell.org/package/relational-schemas-0.1.3.4/src/src/Database/Relational/Schema/PgCatalog/PgClass.hs Thanks for poking around. Did you happen to notice how many of these clients are taking pains to deal with the existing inaccuracy of relhaspkey (ie, that it remains set after the pkey has been dropped)? I think there's possibly an argument that relhaspkey should be dropped because it's an attractive nuisance, encouraging clients to assume more than they should about what it means. But we don't have a lot of evidence for such an argument right now. regards, tom lane
On Mon, Feb 26, 2018 at 12:45:48AM -0500, Tom Lane wrote: > Michael Paquier <michael@paquier.xyz> writes: > > On Sat, Feb 24, 2018 at 10:21:44PM -0500, Tom Lane wrote: > >> We've discussed that at least twice before, and not pulled the trigger > >> for fear of breaking client code. > > > Speaking of which, I have looked at where relhaspkey is being used. And > > there are a couple of things using it: > > - Greenplum has a consistency checker tool using it. > > - https://github.com/no0p/pgsampler > > - https://searchcode.com/codesearch/view/54937539/ > > - http://codegist.net/code/postgres%20drop%20tables/ > > - https://hackage.haskell.org/package/relational-schemas-0.1.3.4/src/src/Database/Relational/Schema/PgCatalog/PgClass.hs > > Thanks for poking around. Did you happen to notice how many of these > clients are taking pains to deal with the existing inaccuracy of > relhaspkey (ie, that it remains set after the pkey has been dropped)? As far as I looked at things. Those clients rely on how optimistic relhaspkey is. In short, if it is set to true, there can be primary key definitions. If set to false, then it is sure that no primary key definition can be found. If the flag is true, then those clients just do an extra lookup on pg_index with indisprimary. I think that this just complicates the code involved though. If looking for primary keys it is way better to just scan directly pg_index. > I think there's possibly an argument that relhaspkey should be dropped > because it's an attractive nuisance, encouraging clients to assume > more than they should about what it means. But we don't have a lot > of evidence for such an argument right now. The only breakage I could imagine here is an application which thinks relhaspkey set to true implies that a primary key *has* to be present. I have to admit that I have not found such a case. Still I would not be surprised if there are such applications unaware of being broken, particularly plpgsql functions. -- Michael
Attachment
On Mon, Feb 26, 2018 at 12:36 AM, Michael Paquier <michael@paquier.xyz> wrote: > I would be of the opinion to drop them. +1. On this point, I am in agreement with the gentleman who wrote http://postgr.es/m/7903.1310671199@sss.pgh.pa.us -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 02/26/2018 07:23 AM, Michael Paquier wrote: > On Mon, Feb 26, 2018 at 12:45:48AM -0500, Tom Lane wrote: >> Michael Paquier <michael@paquier.xyz> writes: >>> On Sat, Feb 24, 2018 at 10:21:44PM -0500, Tom Lane wrote: >>>> We've discussed that at least twice before, and not pulled the trigger >>>> for fear of breaking client code. >> >>> Speaking of which, I have looked at where relhaspkey is being used. And >>> there are a couple of things using it: >>> - Greenplum has a consistency checker tool using it. >>> - https://github.com/no0p/pgsampler >>> - https://searchcode.com/codesearch/view/54937539/ >>> - http://codegist.net/code/postgres%20drop%20tables/ >>> - https://hackage.haskell.org/package/relational-schemas-0.1.3.4/src/src/Database/Relational/Schema/PgCatalog/PgClass.hs >> >> Thanks for poking around. Did you happen to notice how many of these >> clients are taking pains to deal with the existing inaccuracy of >> relhaspkey (ie, that it remains set after the pkey has been dropped)? > > As far as I looked at things. Those clients rely on how optimistic > relhaspkey is. In short, if it is set to true, there can be primary > key definitions. If set to false, then it is sure that no primary key > definition can be found. If the flag is true, then those clients just > do an extra lookup on pg_index with indisprimary. I think that this > just complicates the code involved though. If looking for primary keys > it is way better to just scan directly pg_index. > >> I think there's possibly an argument that relhaspkey should be dropped >> because it's an attractive nuisance, encouraging clients to assume >> more than they should about what it means. But we don't have a lot >> of evidence for such an argument right now. > > The only breakage I could imagine here is an application which > thinks relhaspkey set to true implies that a primary key *has* to be > present. I have to admit that I have not found such a case. Still I > would not be surprised if there are such applications unaware of > being broken, particularly plpgsql functions. I agree with this sentiment - I don't think those flags are particularly helpful for client applications, and would vote +1 for removal. Even if the application handles them correctly (i.e. rechecks existence when relhaspkey=true), the assumption is that this saves a measurable amount of work. I'm not so sure about that, considering pretty much all tables have both primary keys and indexes. OTOH it certainly does make the code more complicated. For internal usage it might have made a difference back when those flags were introduced, but the relcache should deal with this efficiently now, as pointed out in [1]. But as pointed out, we're not using relhaspkey internally at all. So +1 to get rid of it. For the other flags we would probably need to test what impact would it have (e.g. table with no indexes, many indexes on other tables, and something calling get_relation_info a lot). But this patch proposes to remove only relhaspkey. [1] https://www.postgresql.org/message-id/CA%2BTgmoYJu24Y8uUAJ4zeQAhoYxLmFxcy%2B5Hdij9ehjoxKo3j3g%40mail.gmail.com regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Mar 10, 2018 at 01:52:56PM +0100, Tomas Vondra wrote: > I agree with this sentiment - I don't think those flags are particularly > helpful for client applications, and would vote +1 for removal. OK, so I can see that we are moving to a consensus here. > For the other flags we would probably need to test what impact would it > have (e.g. table with no indexes, many indexes on other tables, and > something calling get_relation_info a lot). But this patch proposes to > remove only relhaspkey. Yes, you are right here. Peter, would you do that within this commit fest or not? As we are half-way through the commit fest, we could also cut the apple in half and just remove relhaspkey for now as that's a no-brainer. So I would suggest to just do the latter and consider this patch as done. Attached is a rebased patch, there were some conflicts in pg_class.h by the way. -- Michael
Attachment
On 3/14/18 01:47, Michael Paquier wrote: >> For the other flags we would probably need to test what impact would it >> have (e.g. table with no indexes, many indexes on other tables, and >> something calling get_relation_info a lot). But this patch proposes to >> remove only relhaspkey. > > Yes, you are right here. Peter, would you do that within this commit > fest or not? Not terribly interested in those other ones. > As we are half-way through the commit fest, we could also > cut the apple in half and just remove relhaspkey for now as that's a > no-brainer. So I would suggest to just do the latter and consider this > patch as done. done -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Mar 14, 2018 at 03:36:28PM -0400, Peter Eisentraut wrote: > done Thanks Peter. One done, 150 remaining. -- Michael