Thread: Why don't we have a small reserved OID range for patch revisions?
Why don't we provide a small reserved OID range that can be used by patch authors temporarily, with the expectation that they'll be replaced by "real" OIDs at the point the patch gets committed? This would be similar to the situation with catversion bumps -- we don't expect patches that will eventually need them to have them. It's considered good practice to choose an OID that's at the beginning of the range shown by the unused_oids script, so naturally there is a good chance that any patch that adds a system catalog entry will bit rot prematurely. This seems totally unnecessary to me. You could even have a replace_oids script under this system. That would replace the known-temporary OIDs with mapped contiguous real values at the time of commit (maybe it would just print out which permanent OIDs to use in place of the temporary ones, and leave the rest up to the committer). I don't do Perl, so I'm not volunteering for this. -- Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes: > Why don't we provide a small reserved OID range that can be used by > patch authors temporarily, with the expectation that they'll be > replaced by "real" OIDs at the point the patch gets committed? This > would be similar to the situation with catversion bumps -- we don't > expect patches that will eventually need them to have them. Quite a few people have used OIDs up around 8000 or 9000 for this purpose; I doubt we need a formally reserved range for it. The main problem with doing it is the hazard that the patch'll get committed just like that, suddenly breaking things for everyone else doing likewise. (I would argue, in fact, that the reason we have any preassigned OIDs above perhaps 6000 is that exactly this has happened before.) A script such as you suggest might be a good way to reduce the temptation to get lazy at the last minute. Now that the catalog data is pretty machine-readable, I suspect it wouldn't be very hard --- though I'm not volunteering either. I'm envisioning something simple like "renumber all OIDs in range mmmm-nnnn into range xxxx-yyyy", perhaps with the ability to skip any already-used OIDs in the target range. regards, tom lane
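A minimal sketch, in the same Perl the catalog scripts are written in, of the mapping step Tom describes: walk the OIDs found in the source range and assign each one the next free OID in the target range, skipping target OIDs that are already taken. The subroutine name and its inputs (@$source_oids, %$used) are hypothetical; a real tool would fill them in from the parsed catalog data.

    use strict;
    use warnings;

    # Map each OID in range mmmm-nnnn to a free OID in range xxxx-yyyy.
    sub build_oid_map
    {
        my ($source_oids, $used, $target_start, $target_end) = @_;
        my %map;
        my $next = $target_start;
        foreach my $oid (sort { $a <=> $b } @$source_oids)
        {
            # skip over target OIDs that are already assigned
            $next++ while exists $used->{$next};
            die "not enough free OIDs in target range\n"
                if $next > $target_end;
            $map{$oid} = $next++;
        }
        return \%map;
    }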
On Fri, Feb 8, 2019 at 10:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > A script such as you suggest might be a good way to reduce the temptation > to get lazy at the last minute. Now that the catalog data is pretty > machine-readable, I suspect it wouldn't be very hard --- though I'm > not volunteering either. I'm envisioning something simple like "renumber > all OIDs in range mmmm-nnnn into range xxxx-yyyy", perhaps with the > ability to skip any already-used OIDs in the target range. I imagined that the machine-readable catalog data would allow us to assign non-numeric identifiers to this OID range. Perhaps there'd be a textual symbol with a number in the range of 0-20 at the end. Those would stick out like a sore thumb, making it highly unlikely that anybody would forget about it at the last minute. -- Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes: > On Fri, Feb 8, 2019 at 10:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> A script such as you suggest might be a good way to reduce the temptation >> to get lazy at the last minute. Now that the catalog data is pretty >> machine-readable, I suspect it wouldn't be very hard --- though I'm >> not volunteering either. I'm envisioning something simple like "renumber >> all OIDs in range mmmm-nnnn into range xxxx-yyyy", perhaps with the >> ability to skip any already-used OIDs in the target range. > I imagined that the machine-readable catalog data would allow us to > assign non-numeric identifiers to this OID range. Perhaps there'd be a > textual symbol with a number in the range of 0-20 at the end. Those > would stick out like a sore thumb, making it highly unlikely that > anybody would forget about it at the last minute. Um. That would not be just an add-on script but something that genbki.pl would have to accept. I'm not excited about that; it would complicate what's already complex, and if it works enough for test purposes then it wouldn't really stop a committer who wasn't paying attention from committing the patch un-revised. To the extent that this works at all, OIDs in the 9000 range ought to be enough of a flag already, I think. regards, tom lane
On Fri, Feb 8, 2019 at 10:29 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Um. That would not be just an add-on script but something that > genbki.pl would have to accept. I'm not excited about that; it would > complicate what's already complex, and if it works enough for test > purposes then it wouldn't really stop a committer who wasn't paying > attention from committing the patch un-revised. > > To the extent that this works at all, OIDs in the 9000 range ought > to be enough of a flag already, I think. I tend to agree that this isn't enough of a problem to justify making genbki.pl significantly more complicated. -- Peter Geoghegan
On Fri, Feb 8, 2019 at 11:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > To the extent that this works at all, OIDs in the 9000 range ought > to be enough of a flag already, I think. A "flag" that isn't documented anywhere outside of a mailing list discussion and that isn't checked by any code anywhere is not much of a flag, IMHO. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2/8/19, Tom Lane <tgl@sss.pgh.pa.us> wrote: > A script such as you suggest might be a good way to reduce the temptation > to get lazy at the last minute. Now that the catalog data is pretty > machine-readable, I suspect it wouldn't be very hard --- though I'm > not volunteering either. I'm envisioning something simple like "renumber > all OIDs in range mmmm-nnnn into range xxxx-yyyy", perhaps with the > ability to skip any already-used OIDs in the target range. This might be something that can be done inside reformat_dat_file.pl. It's a little outside its scope, but better than the alternatives. And we already have a function in Catalog.pm to get the currently used oids. I'll volunteer to look into it but I don't know when that will be. -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I wrote: > On 2/8/19, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> A script such as you suggest might be a good way to reduce the temptation >> to get lazy at the last minute. Now that the catalog data is pretty >> machine-readable, I suspect it wouldn't be very hard --- though I'm >> not volunteering either. I'm envisioning something simple like "renumber >> all OIDs in range mmmm-nnnn into range xxxx-yyyy", perhaps with the >> ability to skip any already-used OIDs in the target range. > > This might be something that can be done inside reformat_dat_file.pl. > It's a little outside its scope, but better than the alternatives. Along those lines, here's a draft patch to do just that. It handles array type oids as well. Run it like this: perl reformat_dat_file.pl --map-from 9000 --map-to 2000 *.dat There is some attempt at documentation. So far it doesn't map by default, but that could be changed if we agreed on the convention of 9000 or whatever. -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
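The array-type handling John mentions matters because pg_type.dat rows carry an array_type_oid reference to the corresponding array type; if a type's OID is remapped, that cross-reference must be remapped in lockstep. A sketch of that fix-up, assuming %map is the OID mapping already computed and @type_rows holds the parsed pg_type.dat entries (both names are placeholders):

    foreach my $row (@type_rows)
    {
        # remap the row's own OID
        $row->{oid} = $map{ $row->{oid} }
            if exists $map{ $row->{oid} };
        # keep the array-type cross-reference in sync
        $row->{array_type_oid} = $map{ $row->{array_type_oid} }
            if defined $row->{array_type_oid}
                && exists $map{ $row->{array_type_oid} };
    }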
I wrote: > Along those lines, here's a draft patch to do just that. It handles > array type oids as well. Run it like this: > > perl reformat_dat_file.pl --map-from 9000 --map-to 2000 *.dat > > There is some attempt at documentation. So far it doesn't map by > default, but that could be changed if we agreed on the convention of > 9000 or whatever. In case we don't want to lose track of this, I added it to the March commitfest with a target of v13. (I didn't see a way to add it to the July commitfest) -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-02-08 19:14, Tom Lane wrote: > Quite a few people have used OIDs up around 8000 or 9000 for this purpose; > I doubt we need a formally reserved range for it. The main problem with > doing it is the hazard that the patch'll get committed just like that, > suddenly breaking things for everyone else doing likewise. For that reason, I'm not in favor of this. Forgetting to update the catversion is already common enough (for me). Adding another step between having a seemingly final patch and being able to actually commit it doesn't seem attractive. Moreover, these "final adjustments" would tend to require a full rebuild and retest, adding even more overhead. OID collision doesn't seem to be a significant problem (for me). -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > On 2019-02-08 19:14, Tom Lane wrote: >> Quite a few people have used OIDs up around 8000 or 9000 for this purpose; >> I doubt we need a formally reserved range for it. The main problem with >> doing it is the hazard that the patch'll get committed just like that, >> suddenly breaking things for everyone else doing likewise. > For that reason, I'm not in favor of this. Forgetting to update the > catversion is already common enough (for me). Adding another step > between having a seemingly final patch and being able to actually commit > it doesn't seem attractive. Moreover, these "final adjustments" would > tend to require a full rebuild and retest, adding even more overhead. > OID collision doesn't seem to be a significant problem (for me). Um, I beg to differ. It's not at all unusual for pending patches to bit-rot for no reason other than suddenly getting an OID conflict. I don't have to look far for a current example: https://travis-ci.org/postgresql-cfbot/postgresql/builds/498955351 We do need a couple of pieces of new infrastructure to make this idea conveniently workable. One is a tool to allow automatic OID renumbering instead of having to do it by hand; Naylor has a draft for that upthread. Perhaps it'd be useful for genbki.pl to spit out a warning (NOT an error) if it sees OIDs in the reserved range. I'm not sure that that'd really be worth the trouble though, since one could easily forget about it while reviewing/testing just before commit, and it'd just be useless noise up until it was time to commit. Another issue, as Robert pointed out, is that this does need to be a formal convention not something undocumented. Naylor's patch adds a mention of it in bki.sgml, but I wonder if anyplace else should talk about it. I concede your point that a prudent committer would do a rebuild and retest rather than just trusting the tool. But really, how much extra work is that? If you've spent any time optimizing your workflow, a full rebuild and check-world should be under five minutes on any hardware anyone would be using for development today. And, yeah, we probably will make mistakes like this, just like we sometimes forget the catversion bump. As long as we have a tool for OID renumbering, I don't think that's the end of the world. Fixing it after the fact isn't going to be a big deal, any more than it is for catversion. regards, tom lane
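If the warning idea were pursued, it could be a short pass over the hand-assigned OIDs after genbki.pl finishes parsing. A sketch, where %assigned_oids is an assumed name for data genbki.pl would already have on hand, and 8000-9999 stands in for whatever range gets reserved:

    foreach my $oid (sort { $a <=> $b } keys %assigned_oids)
    {
        # warn, don't die: reserved-range OIDs are expected during development
        warn "OID $oid is in the reserved development range (8000-9999); "
            . "renumber before commit\n"
            if $oid >= 8000 && $oid <= 9999;
    }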
On Wed, Feb 27, 2019 at 1:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > OID collision doesn't seem to be a significant problem (for me). > > Um, I beg to differ. It's not at all unusual for pending patches to > bit-rot for no reason other than suddenly getting an OID conflict. > I don't have to look far for a current example: > https://travis-ci.org/postgresql-cfbot/postgresql/builds/498955351 OID conflicts are not that big of a deal when building a patch locally, because almost everyone knows what the exact problem is immediately, and because you probably have more than a passing interest in the patch to even do that much. However, the continuous integration stuff has created an expectation that your patch shouldn't be left to bitrot for long. Silly mechanical bitrot now seems like a much bigger problem than it was before these developments. It unfairly puts reviewers off engaging. Patch authors shouldn't be left with any excuse for leaving their patch to bitrot for long. And, more casual patch reviewers shouldn't have any excuse for not downloading a patch and applying it locally, so that they can spend a spare 10 minutes kicking the tires. > Perhaps it'd be useful for genbki.pl to spit out a warning (NOT an > error) if it sees OIDs in the reserved range. I'm not sure that that'd > really be worth the trouble though, since one could easily forget > about it while reviewing/testing just before commit, and it'd just be > useless noise up until it was time to commit. My sense is that we should err on the side of being informative. > Another issue, as Robert pointed out, is that this does need to be > a formal convention not something undocumented. Naylor's patch adds > a mention of it in bki.sgml, but I wonder if anyplace else should > talk about it. Why not have unused_oids reference the convention as a "tip"? > I concede your point that a prudent committer would do a rebuild and > retest rather than just trusting the tool. But really, how much > extra work is that? If you've spent any time optimizing your workflow, > a full rebuild and check-world should be under five minutes on any > hardware anyone would be using for development today. If you use the "check-world parallel" recipe on the committing checklist Wiki page, and if you use ccache, ~2 minutes is attainable for optimized builds (though the recipe doesn't work on all release branches). I don't think that a committer should be a committing anything if they're not willing to do this much. It's not just prudent -- it is the *bare minimum* when committing a patch that creates system catalog entries. -- Peter Geoghegan
I wrote: > We do need a couple of pieces of new infrastructure to make this idea > conveniently workable. One is a tool to allow automatic OID renumbering > instead of having to do it by hand; Naylor has a draft for that upthread. Oh: arguably, something else we'd need to do to ensure that OID renumbering is trouble-free is to institute a strict rule that OID references in the *.dat files must be symbolic. We had not bothered to convert every single reference type before, reasoning that some of them were too little-used to be worth the trouble; but someday that'll rise up to bite us, if semi-automated renumbering becomes a thing. It looks to me like the following OID columns remain unconverted: pg_class.reltype pg_database.dattablespace pg_ts_config.cfgparser pg_ts_config_map.mapcfg, mapdict pg_ts_dict.dicttemplate pg_type.typcollation pg_type.typrelid regards, tom lane
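For concreteness, here is what the difference looks like in a .dat entry; the pg_type fragment is illustrative (other fields elided), not quoted from the tree. Only the symbolic form is safe under automated renumbering, because genbki.pl resolves the name to whatever OID the target ends up with:

    # numeric reference: silently wrong if OID 100 ever moves
    { typname => 'text', ..., typcollation => '100' },

    # symbolic reference: resolved by genbki.pl at build time
    { typname => 'text', ..., typcollation => 'default' },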
Peter Geoghegan <pg@bowt.ie> writes: > On Wed, Feb 27, 2019 at 1:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> OID collision doesn't seem to be a significant problem (for me). >> Um, I beg to differ. It's not at all unusual for pending patches to >> bit-rot for no reason other than suddenly getting an OID conflict. >> I don't have to look far for a current example: >> https://travis-ci.org/postgresql-cfbot/postgresql/builds/498955351 > Patch authors shouldn't be left with any excuse for leaving their > patch to bitrot for long. And, more casual patch reviewers shouldn't > have any excuse for not downloading a patch and applying it locally, > so that they can spend a spare 10 minutes kicking the tires. Yeah, that latter point is really the killer argument. We don't want to make people spend valuable review time on fixing uninteresting OID conflicts. It's even more annoying that several people might have to duplicate the same work, if they're testing a patch independently. Given a convention that under-development patches use OIDs in the 9K range, the only time anybody would have to resolve OID conflicts for testing would be if they were trying to test the combination of two or more patches. Even then, an OID-renumbering script would make it pretty painless: apply patch 1, renumber its OIDs to someplace else, apply patch 2, repeat as needed. > Why not have unused_oids reference the convention as a "tip"? Hmm, could be helpful. regards, tom lane
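With the draft tool's interface from upthread, that combination recipe could be as mechanical as the following; the 7000 target is an arbitrary choice made only to move patch 1's OIDs out of patch 2's way, and the second apply may still need fuzz if the two patches touch adjacent entries:

    git apply patch1.diff
    perl reformat_dat_file.pl --map-from 9000 --map-to 7000 *.dat
    git apply patch2.diff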
On 2019-02-27 22:27, Tom Lane wrote: >> OID collision doesn't seem to be a significant problem (for me). > > Um, I beg to differ. It's not at all unusual for pending patches to > bit-rot for no reason other than suddenly getting an OID conflict. > I don't have to look far for a current example: I'm not saying it doesn't happen, but that it's not a significant problem overall. The chances of a patch (a) allocating a new OID, (b) a second patch allocating a new OID, (c) both being in flight at the same time, (d) actually picking the same OID, are small. I guess the overall time lost to this issue is perhaps 2 hours per year. On the other hand, with about 2000 commits to master per year, if this renumbering business only adds 2 seconds of overhead to committing, we're coming out behind. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-02-27 22:50, Peter Geoghegan wrote: > However, the continuous > integration stuff has created an expectation that your patch shouldn't > be left to bitrot for long. Silly mechanical bitrot now seems like a > much bigger problem than it was before these developments. It unfairly > puts reviewers off engaging. If this is the problem (although I think we'd find that OID collisions are rather rare compared to other gratuitous cfbot failures), why not have the cfbot build with a flag that ignores OID collisions? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Feb 27, 2019 at 2:39 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > The chances of a patch (a) allocating a new OID, (b) a second patch > allocating a new OID, (c) both being in flight at the same time, (d) > actually picking the same OID, are small. But...they are. Most patches don't create new system catalog entries at all. Of those that do, the conventions around assigning new OIDs make it fairly likely that problems will emerge. > I guess the overall time lost > to this issue is perhaps 2 hours per year. On the other hand, with > about 2000 commits to master per year, if this renumbering business only > adds 2 seconds of overhead to committing, we're coming out behind. The time spent on the final commit is not the cost we're concerned about, though. It isn't necessary to do that more than once, whereas all but the most trivial of patches receive multiple rounds of review and revision. -- Peter Geoghegan
On Wed, Feb 27, 2019 at 2:44 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > If this is the problem (although I think we'd find that OID collisions > are rather rare compared to other gratuitous cfbot failures), why not > have the cfbot build with a flag that ignores OID collisions? How would that work? -- Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes: > On Wed, Feb 27, 2019 at 2:44 PM Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> If this is the problem (although I think we'd find that OID collisions >> are rather rare compared to other gratuitous cfbot failures), why not >> have the cfbot build with a flag that ignores OID collisions? > How would that work? It could work for conflicting OIDs in different system catalogs (so that the "conflict" is an artifact of our assignment rules rather than an intrinsic problem). But I think the majority of new hand-assigned OIDs are in pg_proc, so that this kind of hack would not help as much as one might wish. regards, tom lane
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > On 2019-02-27 22:27, Tom Lane wrote: >>> OID collision doesn't seem to be a significant problem (for me). >> Um, I beg to differ. It's not at all unusual for pending patches to >> bit-rot for no reason other than suddenly getting an OID conflict. >> I don't have to look far for a current example: > I'm not saying it doesn't happen, but that it's not a significant > problem overall. I do not think you are correct. It may not be a big problem across all our incoming patches, but that's only because most of them don't have anything to do with hand-assigned OIDs. Among those that do, I think there is a significant problem. To try to quantify this a bit, I looked through v12-cycle and pending patches that touch the catalog data. We've only committed 12 patches adding new hand-assigned OIDs since v11 was branched off. (I suspect that's lower than in a typical cycle, but have not attempted to quantify things further back.) Of those, only two seem to have needed OID adjustments after initial posting, but that's mostly because most of them were committer-originated patches that got pushed within a week or two. That's certainly not the typical wait time for a patch submitted by anybody else. Also, a lot of these patches recycled OIDs that'd recently been freed by patches such as the abstime-ectomy, which means that the amount of OID conflict created for pending patches is probably *really* low in this cycle-so-far, compared to our historical norms. Of what's in the queue to be reviewed right now, there are just 20 (out of 150-plus) patches that touch the catalog/*.dat files. I got this number by groveling through the cfbot's reports of patch applications, to see which patches touched those files. It might omit some patches that the cfbot failed to make sense of. Also, I'm pretty sure that a few of these patches don't actually assign any new OIDs but just change existing entries, or create only entries with auto-assigned OIDs. I did not try to separate those out, however, since the point here is to estimate for how many patches a committer would even need to think about this. Of those twenty patches, three have unresolved OID conflicts right now:

    multivariate MCV lists and histograms
    commontype polymorphics
    log10/hyper functions

Another one has recently had to resolve an OID conflict:

    SortSupport implementation on inet/cidr

which is notable considering that that thread is less than three weeks old. (The log10 and commontype threads aren't really ancient either.) I spent some extra effort looking at the patches that both create more than a few new OIDs and have been around for awhile:

    Generic type subscripting
    KNN for btree
    Custom compression methods
    SQL/JSON: functions
    SQL/JSON: jsonpath
    Generated columns
    BRIN bloom indexes

The first four of those have all had to reassign OIDs during their lifetime. jsonpath has avoided doing so by choosing fairly high-numbered OIDs (6K range) to begin with, which I trust you will agree is a solution that doesn't scale for long. I'm not entirely sure that the last two haven't had to renumber OIDs; I ran out of energy before poking through their history in detail. In short, this situation may look fine from the perspective of a committer with a relatively short timeline to commit, but it's pretty darn awful for everybody else. The only way to avoid a ~ 50% failure rate is to choose OIDs above 6K, and once everybody starts doing it like that, things are going to get very unpleasant very quickly.
regards, tom lane
On Wed, Feb 27, 2019 at 10:41 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > In short, this situation may look fine from the perspective of a committer > with a relatively short timeline to commit, but it's pretty darn awful for > everybody else. The only way to avoid a ~ 50% failure rate is to choose > OIDs above 6K, and once everybody starts doing it like that, things are > going to get very unpleasant very quickly. The root problem here from the perspective of a non-committer is not that they might have to renumber OIDs a few times over the year or two it takes to get their patch merged, but rather that it takes a year or two to get their patch merged. That's not to say that I have no sympathy with people in that situation or don't want to make their lives easier, but I'm not really convinced that burdening committers with additional manual steps is the right way to get patches merged faster. This seems like a big piece of new mechanism being invented to solve an occasional annoyance. Your statistics are not convincing at all: you're arguing that this is a big problem because 2-3% of pending patches currently have an issue here, and some others have in the past, but that's a really small percentage, and the time spent doing OID renumbering must be a tiny percentage of the total time anyone spends hacking on PostgreSQL. I think that the problem here is that we have a very limited range of OIDs (10k) which can be used for this purpose, and the number of OIDs that are used in that space is now a significant fraction of the total (>4.5k), and the problem is further complicated by the desire to keep the OIDs assigned near the low end of the available numbering space and/or near to other OIDs used for similar purposes. The sheer fact that the address space is nearly half-used means that conflicts are likely even if people choose OIDs at random, and when people choose OIDs non-randomly -- lowest, round numbers, near to other OIDs -- the chances of conflicts just go up. We could fix that problem by caring less about keeping all the numbers gapless and increasing the size of the reserved space to say 100k, but just as a thought, what if we stopped assigning manual OIDs for new catalog entries altogether, except for once at the end of each release cycle? Make a way that people can add an entry to pg_proc.dat or whatever without fixing an OID, and let the build scripts generate one. As many patches as happen during a release cycle will add new such entries and they'll just all get some OID or other. Then, at the end of the release cycle, we'll run a script that finds all of those catalog entries and rewrites the .dat files, adding a permanent OID assignment to each one, so that those OIDs will then be fixed for all future releases (unless we drop the entries or explicitly change something). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
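What Robert sketches would need two pieces: genbki.pl tolerating rows with no oid field (assigning one at build time, much as it already does for rows that never get a hand-assigned OID), plus a once-per-cycle write-back pass. That pass might look roughly like this; next_free_oid() and the %catalogs structure are assumptions for illustration, not existing code:

    foreach my $row (@{ $catalogs{pg_proc}{data} })
    {
        # pin down any entry that was relying on a build-time OID
        $row->{oid} = next_free_oid() unless defined $row->{oid};
    }
    # ...then rewrite pg_proc.dat so the assignments become permanent.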
Robert Haas <robertmhaas@gmail.com> writes: > This seems like a big piece of new mechanism being invented > to solve an occasional annoyance. Your statistics are not convincing > at all: you're arguing that this is a big problem because 2-3% of > pending patches current have an issue here, and some others have in > the past, but that's a really small percentage, and the time spent > doing OID renumbering must be a tiny percentage of the total time > anyone spends hacking on PostgreSQL. TBH, I find this position utterly baffling. It's true that only a small percentage of patches have an issue here, because only a small percentage of patches dabble in manually-assigned OIDs at all. But *among those that do*, there is a huge problem. I had not actually realized how bad it is until I gathered those stats, but it's bad. I don't understand the objection to inventing a mechanism that will help those patches and has no impact whatever when working on patches that don't involve manually-assigned OIDs. And, yeah, I'd like us not to have patches hanging around for years either, but that's a reality that's not going away. > We could fix that problem by caring less about keeping all the numbers > gapless and increasing the size of the reserved space to say 100k, We already had this discussion. Moving FirstNormalObjectId is infeasible without forcing a dump/reload, which I don't think anyone wants to do. > but > just as a thought, what if we stopped assigning manual OIDs for new > catalog entries altogether, except for once at the end of each release > cycle? And that's another idea without any basis in reality. What are you going to do instead? What mechanism will you use to track these OIDs so you can clean up later? Who's going to write the code that will support this? Not me. I think the proposal that is on the table is superior. regards, tom lane
On Thu, Feb 28, 2019 at 10:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > This seems like a big piece of new mechanism being invented > > to solve an occasional annoyance. Your statistics are not convincing > > at all: you're arguing that this is a big problem because 2-3% of > > pending patches currently have an issue here, and some others have in > > the past, but that's a really small percentage, and the time spent > > doing OID renumbering must be a tiny percentage of the total time > > anyone spends hacking on PostgreSQL. > > TBH, I find this position utterly baffling. It's true that only a > small percentage of patches have an issue here, because only a small > percentage of patches dabble in manually-assigned OIDs at all. But > *among those that do*, there is a huge problem. I had not actually > realized how bad it is until I gathered those stats, but it's bad. > > I don't understand the objection to inventing a mechanism that will > help those patches and has no impact whatever when working on patches > that don't involve manually-assigned OIDs. > > And, yeah, I'd like us not to have patches hanging around for years > either, but that's a reality that's not going away. I don't think this is the worst proposal ever. However, I also think that it's not unreasonable to raise the issue that writing OR reviewing OR committing a patch already involves adhering to a thicket of undocumented rules. When somebody fails to adhere to one of those rules, they get ignored or publicly shamed. Now you want to add yet another step to the process - really two. If you want to submit a patch that requires new catalog entries, you must know that you're supposed to put those OIDs in this new range that we're going to set aside for such things, and if you want to commit one, you must know that you're supposed to renumber those OID assignments into some other range. And people are going to screw it up - submitters are going to fail to know about this new policy (which will probably be documented nowhere, just like all the other ones) - and committers are going to fail to remember to renumber things. So, I suspect that for every unit of work it saves somebody, it's probably going to generate about one unit of extra work for somebody else. A lot of projects have a much less painful process for getting patches integrated than we do. I don't know how those projects maintain adequate code quality, but I do know that making it easy to get a patch accepted makes people more likely to contribute patches, and increases overall development velocity. It is not even vaguely unreasonable to worry about whether making this more complicated is going to hurt more than it helps, and I don't know why you think otherwise. > > but > > just as a thought, what if we stopped assigning manual OIDs for new > > catalog entries altogether, except for once at the end of each release > > cycle? > > And that's another idea without any basis in reality. What are you > going to do instead? What mechanism will you use to track these > OIDs so you can clean up later? Right now every entry in pg_proc.dat includes an OID assignment. What I'm proposing is that we would also allow entries that did not have one, and the build process would assign one while processing the .dat files. Then later, somebody could use a script that went through and rewrote the .dat file to add OID assignments to any entries that lacked them.
Since the topic of having tools for automated rewrite of those files has been discussed at length, and since we already have a script called reformat_dat_file.pl in the tree which contains comments indicating that it could be modified for bulk editing, said script having been committed BY YOU, I don't understand why you think that bulk editing is infeasible. > Who's going to write the code that > will support this? Not me. I think the proposal that is on the > table is superior. OK. Well, I think that doing nothing is superior to this proposal, for reasons similar to what Peter Eisentraut has already articulated. And I think rather than blasting forward with your own preferred alternative in the face of disagreement, you should be willing to discuss other possible options. But if you're not willing to do that, I can't make you. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Feb 28, 2019 at 7:59 AM Robert Haas <robertmhaas@gmail.com> wrote: > I don't think this is the worst proposal ever. However, I also think > that it's not unreasonable to raise the issue that writing OR > reviewing OR committing a patch already involves adhering to a thicket > of undocumented rules. When somebody fails to adhere to one of those > rules, they get ignored or publicly shamed. Now you want to add yet > another step to the process - really two. There does seem to be a real problem with undocumented processes. For example, I must confess that it came as news to me that we already had a reserved OID range. However, I don't think that there is that much of an issue with adding new mechanisms like this, provided it makes it easy to do the right thing and hard to do the wrong thing. What Tom has proposed so far is not something that self-evidently meets that standard, but it's also not something that self-evidently fails to meet that standard. I have attempted to institute some general guidelines for what the thicket of rules are by creating the "committing checklist" page. This is necessarily imperfect, because the rules are in many cases open to interpretation, often for good practical reasons. I don't have any sympathy for committers that find it hard to remember to do a catversion bump with any kind of regularity. That complexity seems inherent, not incidental, since it's often convenient to ignore catalog incompatibilities during development. > So, I suspect that for every > unit of work it saves somebody, it's probably going to generate about > one unit of extra work for somebody else. Maybe so. I think that you're jumping to conclusions, though. > A lot of projects have a much less painful process for getting patches > integrated than we do. I don't know how those projects maintain > adequate code quality, but I do know that making it easy to get a > patch accepted makes people more likely to contribute patches, and > increases overall development velocity. It is not even vaguely > unreasonable to worry about whether making this more complicated is > going to hurt more than it helps, and I don't know why you think > otherwise. But you seem to want to make the mechanism itself even more complicated, not less complicated (based on your remarks about making OID assignment happen during the build). In order to make the use of the mechanism easier. That seems worth considering, but ISTM that this is talking at cross purposes. There are far simpler ways of making it unlikely that a committer is going to miss this step. There is also a simple way of noticing that they do quickly (e.g. a simple buildfarm test). > Right now every entry in pg_proc.dat includes an OID assignment. What > I'm proposing is that we would also allow entries that did not have > one, and the build process would assign one while processing the .dat > files. Then later, somebody could use a script that went through and > rewrote the .dat file to add OID assignments to any entries that > lacked them. Since the topic of having tools for automated rewrite of > those files has been discussed at length, and since we already have a > script called reformat_dat_file.pl in the tree which contains > comments indicating that it could be modified for bulk editing, said > script having been committed BY YOU, I don't understand why you think > that bulk editing is infeasible. I'm also curious to hear what Tom thinks about this. > OK. 
Well, I think that doing nothing is superior to this proposal, > for reasons similar to what Peter Eisentraut has already articulated. > And I think rather than blasting forward with your own preferred > alternative in the face of disagreement, you should be willing to > discuss other possible options. But if you're not willing to do that, > I can't make you. Peter seemed to not want to do this on the grounds that it isn't necessary at all, whereas you think that it doesn't go far enough. If there is a consensus against what Tom has said, it's a cacophonous one that cannot really be said to be in favor of anything. -- Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes: > On Thu, Feb 28, 2019 at 7:59 AM Robert Haas <robertmhaas@gmail.com> wrote: >> OK. Well, I think that doing nothing is superior to this proposal, >> for reasons similar to what Peter Eisentraut has already articulated. >> And I think rather than blasting forward with your own preferred >> alternative in the face of disagreement, you should be willing to >> discuss other possible options. But if you're not willing to do that, >> I can't make you. > Peter seemed to not want to do this on the grounds that it isn't > necessary at all, whereas you think that it doesn't go far enough. If > there is a consensus against what Tom has said, it's a cacophonous one > that cannot really be said to be in favor of anything. The only thing that's really clear is that some senior committers don't want to be bothered because they don't think there's a problem here that justifies any additional expenditure of their time. Perhaps they are right, because I'd expected some comments from non-committer developers confirming that they see a problem, and the silence is deafening. I'm inclined to commit some form of Naylor's tool improvement anyway, because I have use for it; I remember times when I've renumbered OIDs manually in patches, and it wasn't much fun. But I can't force a process change if there's not consensus for it among the committers. regards, tom lane
Robert Haas <robertmhaas@gmail.com> writes: >>> just as a thought, what if we stopped assigning manual OIDs for new >>> catalog entries altogether, except for once at the end of each release >>> cycle? Actually ... that leads to an idea that wouldn't add any per-commit overhead, or really much change at all to existing processes. Given the existence of a reliable OID-renumbering tool, we could: 1. Encourage people to develop new patches using chosen-at-random high OIDs, in the 7K-9K range. They do this already, it'd just be encouraged instead of discouraged. 2. Commit patches as received. 3. Once each devel cycle, after feature freeze, somebody uses the renumbering tool to shove all the new OIDs down to lower numbers, freeing the high-OID range for the next devel cycle. We'd have to remember to do that, but it could be added to the RELEASE_CHANGES checklist. In this scheme, OID collisions are a problem for in-progress patches only if two patches are unlucky enough to choose the same random high OIDs during the same devel cycle. That's unlikely, or at least a good bit less likely than collisions are today. If/when it does happen we'd have a couple of alternatives for ameliorating the problem --- either the not-yet-committed patch could use the renumbering tool on their own OIDs, or we could do an off-schedule run of step 3 to get the already-committed OIDs out of their way. regards, tom lane
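In this scheme, step 3 reduces to a single run of the renumbering tool over the development range. With the draft interface posted upthread, and picking 4500 purely as an example of the next free low OID, the RELEASE_CHANGES step might read:

    perl reformat_dat_file.pl --map-from 8000 --map-to 4500 *.dat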
On Thu, Feb 28, 2019 at 3:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > The only thing that's really clear is that some senior committers don't > want to be bothered because they don't think there's a problem here that > justifies any additional expenditure of their time. Perhaps they are > right, because I'd expected some comments from non-committer developers > confirming that they see a problem, and the silence is deafening. I don't think that you can take that as too strong a signal. The incentives are different for non-committers. > I'm inclined to commit some form of Naylor's tool improvement anyway, > because I have use for it; I remember times when I've renumbered OIDs > manually in patches, and it wasn't much fun. But I can't force a > process change if there's not consensus for it among the committers. I think that that's a reasonable thing to do, provided there is obvious feedback that makes it highly unlikely that the committer will make an error at the last moment. I have a hard time coming up with a suggestion that won't be considered annoying by at least one person, though. Would it be awful if there was a #warning directive that kicked in when the temporary OID range is in use? It should be possible to do that without breaking -Werror builds, which I believe Robert uses (I am reminded of the Flex bug that we used to have to work around). It's not like there are that many patches that need to assign OIDs to new catalog entries. I would suggest that we put the warning in the regression tests if I didn't know that that could be missed by the use of parallel variants, where the output flies by. There is no precedent for using #warning for something like that, but offhand it seems like the only thing that would work consistently. I don't really mind having to do slightly more work when the issue crops up, especially if that means less work for everyone involved in aggregate, which is the cost that I'm concerned about the most. However, an undocumented or under-documented process that requires a fixed amount of extra mental effort when committing *anything* is another matter. -- Peter Geoghegan
On 3/1/19 12:41 AM, Peter Geoghegan wrote: > On Thu, Feb 28, 2019 at 3:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> The only thing that's really clear is that some senior committers don't >> want to be bothered because they don't think there's a problem here that >> justifies any additional expenditure of their time. Perhaps they are >> right, because I'd expected some comments from non-committer developers >> confirming that they see a problem, and the silence is deafening. > > I don't think that you can take that as too strong a signal. The > incentives are different for non-committers. > >> I'm inclined to commit some form of Naylor's tool improvement anyway, >> because I have use for it; I remember times when I've renumbered OIDs >> manually in patches, and it wasn't much fun. But I can't force a >> process change if there's not consensus for it among the committers. > > I think that that's a reasonable thing to do, provided there is > obvious feedback that makes it highly unlikely that the committer will > make an error at the last moment. I have a hard time coming up with a > suggestion that won't be considered annoying by at least one person, > though. > FWIW I personally would not mind if such a tool / process were added. But I have a related question - do we have some sort of list of such processes that I could check? That is, a list of stuff that is expected to be done by a committer before a commit? I do recall we have [1], but perhaps we have something else. [1] https://wiki.postgresql.org/wiki/Committing_checklist regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Feb 28, 2019 at 3:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: > >>> just as a thought, what if we stopped assigning manual OIDs for new > >>> catalog entries altogether, except for once at the end of each release > >>> cycle? > > Actually ... that leads to an idea that wouldn't add any per-commit > overhead, or really much change at all to existing processes. Given > the existence of a reliable OID-renumbering tool, we could: > In this scheme, OID collisions are a problem for in-progress patches > only if two patches are unlucky enough to choose the same random > high OIDs during the same devel cycle. That's unlikely, or at least > a good bit less likely than collisions are today. That sounds like a reasonable compromise. Perhaps the unused_oids script could give specific guidance on using a randomly determined small range of contiguous OIDs that fall within the current range for that devel cycle. That would prevent collisions caused by the natural human tendency to prefer a round number. Having contiguous OIDs for the same patch seems worth preserving. -- Peter Geoghegan
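A sketch of that guidance in unused_oids style: pick a random starting point in the 8000-9999 development range, then report the first block of consecutive free OIDs at or after it. The Catalog::FindAllOidsFromHeaders call matches what the existing scripts use to collect assigned OIDs; the block-search logic itself is hypothetical:

    use strict;
    use warnings;
    use FindBin;
    use lib "$FindBin::RealBin/../../backend/catalog/";
    use Catalog;

    # assumes the script lives in src/include/catalog, like unused_oids
    chdir $FindBin::RealBin or die "could not cd to $FindBin::RealBin: $!\n";

    my $oids = Catalog::FindAllOidsFromHeaders(glob("pg_*.h"),
        qw(indexing.h toasting.h));
    my %used;
    @used{@$oids} = undef;

    my $want  = 10;    # how many consecutive OIDs the patch needs
    my $start = 8000 + int(rand(2000 - $want));
    while (grep { exists $used{$_} } $start .. $start + $want - 1)
    {
        die "no free block of $want OIDs below 10000\n"
            if ++$start + $want - 1 > 9999;
    }
    printf "suggestion: use OIDs %d-%d\n", $start, $start + $want - 1;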
On Thu, Feb 28, 2019 at 6:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > 1. Encourage people to develop new patches using chosen-at-random > high OIDs, in the 7K-9K range. They do this already, it'd just > be encouraged instead of discouraged. > > 2. Commit patches as received. > > 3. Once each devel cycle, after feature freeze, somebody uses the > renumbering tool to shove all the new OIDs down to lower numbers, > freeing the high-OID range for the next devel cycle. We'd have > to remember to do that, but it could be added to the RELEASE_CHANGES > checklist. Sure, that sounds nice. It seems like it might be slightly less convenient for non-committers than what I was proposing, but still more convenient than what they're doing right now. And it's also more convenient for committers, because they're not being asked to manually fiddle patches at the last moment, something that I at least find rather error-prone. It also, and I think this is really good, moves in the direction of fewer things for both patch authors and patch committers to worry about doing wrong. Instead of throwing rocks at people whose OID assignments are "wrong," we just accept what people do and adjust it later if it makes sense to do so. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Feb 28, 2019 at 5:36 PM Peter Geoghegan <pg@bowt.ie> wrote: > I have attempted to institute some general guidelines for what the > thicket of rules are by creating the "committing checklist" page. This > is necessarily imperfect, because the rules are in many cases open to > interpretation, often for good practical reasons. I don't have any > sympathy for committers that find it hard to remember to do a > catversion bump with any kind of regularity. That complexity seems > inherent, not incidental, since it's often convenient to ignore > catalog incompatibilities during development. Bumping catversion is something of a special case, because one does it so often that one gets used to remembering it. The rules are usually not too hard to remember, although they are trickier when you don't directly change anything in src/include/catalog but just change the definition of some node that may be serialized in a catalog someplace. It would be neat if there were a tool you could run to somehow tell you whether catversion needs to be changed for a given patch. But you know there are a lot of other version numbers floating around, like XLOG_PAGE_MAGIC or the pg_dump archive version, and it is not really that easy to know -- as a new contributor or sometimes even as an experienced one -- whether your work requires any changes to that stuff, or even that that stuff *exists*. Indeed, XLOG_PAGE_MAGIC is a particularly annoying case, both because the constant name doesn't contain VERSION and because the comment just says /* can be used as WAL version indicator */ which does not exactly make it clear that if you fail to bump it when you touch the WAL format you will Make People Unhappy. Indeed, Simon got complaints a number of years ago (2010, it looks like) when he had the temerity to change the magic number to some other unrelated value instead of just incrementing it by one. Although I think that the criticism was to a certain extent well-founded -- why deviate from previous practice? -- there is at the same time something a little crazy about somebody getting excited about the particular value that has been chosen for a number that is described in the very name of the constant as a MAGIC number. And especially because there is absolutely zip in the way of code comments or a README that explain to you how to do it "right."
If there are fewer kinds of things that qualify as mistakes, as in Tom's latest proposal, then we are moving in the right direction IMO regardless of anything else. > > OK. Well, I think that doing nothing is superior to this proposal, > > for reasons similar to what Peter Eisentraut has already articulated. > > And I think rather than blasting forward with your own preferred > > alternative in the face of disagreement, you should be willing to > > discuss other possible options. But if you're not willing to do that, > > I can't make you. > > Peter seemed to not want to do this on the grounds that it isn't > necessary at all, whereas you think that it doesn't go far enough. If > there is a consensus against what Tom has said, it's a cacophonous one > that cannot really be said to be in favor of anything. I think Peter and I are more agreeing than we are at the opposite ends of a spectrum, but more importantly, I think it is worth having a discussion first about what people like and dislike, and what goals they have, and then only if necessary, counting the votes afterwards. I don't like having the feeling that because I have a different view of something and want to write an email about that, I am somehow an impediment to progress. I think if we reduce discussions to you're-for-it-or-you're-against-it, that's not that helpful. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Mar 1, 2019 at 11:56 AM Robert Haas <robertmhaas@gmail.com> wrote: > It would be neat if there were a tool you could run to somehow tell > you whether catversion needs to be changed for a given patch. That seems infeasible because of stored rules. A lot of things bleed into that. We could certainly do better at documenting this on the "committing checklist" page, though. > Indeed, Simon got complaints a number of years ago (2010, it looks > like) when he had the temerity to change the magic number to some > other unrelated value instead of just incrementing it by one. Off with his head! > Although I think that the criticism was to a certain extent > well-founded -- why deviate from previous practice? -- there is at the > same time something a little crazy about somebody getting excited > about the particular value that has been chosen for a number that is > described in the very name of the constant as a MAGIC number. And > especially because there is absolutely zip in the way of code comments > or a README that explain to you how to do it "right." I have learned to avoid ambiguity more than anything else, because ambiguity causes patches to flounder indefinitely, whereas it's usually not that hard to fix something that's broken. I agree - anything that adds ambiguity rather than taking it away is a big problem. > Well, perhaps I'm proposing some additional code, but I don't think of > that as making the mechanism more complicated. I want to make it > simpler for patch submitters and reviewers and committers to not make > mistakes that they have to run around and fix. Right. So do I. I just don't think that it's that bad to ask the final committer to do something once, rather than getting everyone else (including committers) to do it multiple times. If we can avoid even this burden, and totally centralize the management of the OID space, then so much the better. > If there are fewer > kinds of things that qualify as mistakes, as in Tom's latest proposal, > then we are moving in the right direction IMO regardless of anything > else. I'm glad that we now have a plan that is a clear step forward. > I think Peter and I are more agreeing than we are at the opposite ends > of a spectrum, but more importantly, I think it is worth having a > discussion first about what people like and dislike, and what goals > they have, and then only if necessary, counting the votes afterwards. I agree that that's totally worthwhile. > I don't like having the feeling that because I have a different view > of something and want to write an email about that, I am somehow an > impediment to progress. I think if we reduce discussions to > you're-for-it-or-your-against-it, that's not that helpful. That was not my intention. The way that you brought the issue of the difficulty of being a contributor in general into it was unhelpful, though. It didn't seem useful or fair to link Tom's position to a big, well known controversy. We now have a solution that everyone is happy with, or can at least live with, which suggests to me that Tom wasn't being intransigent or insensitive to the concerns of contributors. -- Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes: > On Fri, Mar 1, 2019 at 11:56 AM Robert Haas <robertmhaas@gmail.com> wrote: >> It would be neat if there were a tool you could run to somehow tell >> you whether catversion needs to be changed for a given patch. > That seems infeasible because of stored rules. A lot of things bleed > into that. We could certainly do better at documenting this on the > "committing checklist" page, though. A first approximation to that is "did you touch readfuncs.c", though that rule will give a false positive if you only changed Plan nodes. regards, tom lane
John Naylor <john.naylor@2ndquadrant.com> writes: >> On 2/8/19, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> A script such as you suggest might be a good way to reduce the temptation >>> to get lazy at the last minute. Now that the catalog data is pretty >>> machine-readable, I suspect it wouldn't be very hard --- though I'm >>> not volunteering either. I'm envisioning something simple like "renumber >>> all OIDs in range mmmm-nnnn into range xxxx-yyyy", perhaps with the >>> ability to skip any already-used OIDs in the target range. >> This might be something that can be done inside reformat_dat_file.pl. >> It's a little outside its scope, but better than the alternatives. > Along those lines, here's a draft patch to do just that. It handles > array type oids as well. Run it like this: > perl reformat_dat_file.pl --map-from 9000 --map-to 2000 *.dat I took a quick look at this. I went ahead and pushed the parts that were just code cleanup in reformat_dat_file.pl, since that seemed pretty uncontroversial. As far as the rest of it goes: * I'm really not terribly happy with sticking this functionality into reformat_dat_file.pl. First, there's an issue of discoverability: it's not obvious that a script named that way would have such an ability. Second, it clutters the script in a way that seems to me to hinder its usefulness as a basis for one-off hacks. So I'd really rather have a separate script named something like "renumber_oids.pl", even if there's a good deal of code duplication between it and reformat_dat_file.pl. * In my vision of what this might be good for, I think it's important that it be possible to specify a range of input OIDs to renumber, not just "everything above N". I agree the output range only needs a starting OID. BTW, I changed the CF entry's target back to v12; I don't see a reason not to get this done this month, and indeed kind of wish it was available right now ;-) regards, tom lane
On Sat, Mar 9, 2019 at 1:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I took a quick look at this. I went ahead and pushed the parts that > were just code cleanup in reformat_dat_file.pl, since that seemed > pretty uncontroversial. As far as the rest of it goes: Okay, thanks. > * I'm really not terribly happy with sticking this functionality into > reformat_dat_file.pl. First, there's an issue of discoverability: > it's not obvious that a script named that way would have such an > ability. Second, it clutters the script in a way that seems to me > to hinder its usefulness as a basis for one-off hacks. So I'd really > rather have a separate script named something like "renumber_oids.pl", > even if there's a good deal of code duplication between it and > reformat_dat_file.pl. > * In my vision of what this might be good for, I think it's important > that it be possible to specify a range of input OIDs to renumber, not > just "everything above N". I agree the output range only needs a > starting OID. Now it looks like: perl renumber_oids.pl --first-mapped-oid 8000 --last-mapped-oid 8999 --first-target-oid 2000 *.dat To prevent a maintenance headache, I didn't copy any of the formatting logic over. You'll also have to run reformat_dat_file.pl afterwards to restore that. It seems to work, but I haven't tested thoroughly. -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
John Naylor <john.naylor@2ndquadrant.com> writes: > Now it looks like: > perl renumber_oids.pl --first-mapped-oid 8000 --last-mapped-oid 8999 > --first-target-oid 2000 *.dat > To prevent a maintenance headache, I didn't copy any of the formatting > logic over. You'll also have to run reformat_dat_files.pl afterwards > to restore that. It seems to work, but I haven't tested thoroughly. I didn't like the use of Data::Dumper, because it made it quite impossible to check what the script had done by eyeball. After some thought I concluded that we could probably just apply the changes via search-and-replace, which is pretty ugly and low-tech but it leads to easily diffable results, whether or not the initial state is exactly what reformat_dat_files would produce. I also changed things so that the OID mapping is computed before we start changing any files, because as it stood the objects would get renumbered in a pretty random order; and I renamed one of the switches so they all have unique one-letter abbreviations. Experimenting with this, I realized that it couldn't renumber OIDs that are defined in .h files rather than .dat files, which is a serious deficiency, but given the search-and-replace implementation it's not too hard to fix up the .h files as well. So I did that, and removed the expectation that the target files would be listed on the command line; that seems more likely to be a foot-gun than to do anything useful. I've successfully done check-world after renumbering every OID above 4000 to somewhere else. I also tried renumbering everything below 4000, which unsurprisingly blew up because there are various catalog columns we haven't fixed to use symbolic OIDs. (The one that initdb immediately trips over is pg_database.dattablespace.) I'm not sure if it's worth the trouble to make that totally clean, but I suspect we ought to at least mop up text-search references to be symbolic. That's material for a separate patch though. This seems committable from my end --- any further comments? regards, tom lane diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml index 3c5830b..b6038b9 100644 --- a/doc/src/sgml/bki.sgml +++ b/doc/src/sgml/bki.sgml @@ -388,8 +388,25 @@ to see which ones do not appear. You can also use the <filename>duplicate_oids</filename> script to check for mistakes. (<filename>genbki.pl</filename> will assign OIDs for any rows that - didn't get one hand-assigned to them and also detect duplicate OIDs - at compile time.) + didn't get one hand-assigned to them, and it will also detect duplicate + OIDs at compile time.) + </para> + + <para> + When choosing OIDs for a patch that is not expected to be committed + immediately, best practice is to use a group of more-or-less + consecutive OIDs starting with some random choice in the range + 8000—9999. This minimizes the risk of OID collisions with other + patches being developed concurrently. To keep the 8000—9999 + range free for development purposes, after a patch has been committed + to the master git repository its OIDs should be renumbered into + available space below that range. Typically, this will be done + near the end of each development cycle, moving all OIDs consumed by + patches committed in that cycle at the same time. The script + <filename>renumber_oids.pl</filename> can be used for this purpose. + If an uncommitted patch is found to have OID conflicts with some + recently-committed patch, <filename>renumber_oids.pl</filename> may + also be useful for recovering from that situation. 
</para> <para> diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm index 3bf308f..368b1de 100644 --- a/src/backend/catalog/Catalog.pm +++ b/src/backend/catalog/Catalog.pm @@ -87,6 +87,8 @@ sub ParseHeader } # Push the data into the appropriate data structure. + # Caution: when adding new recognized OID-defining macros, + # also update src/include/catalog/renumber_oids.pl. if (/^DECLARE_TOAST\(\s*(\w+),\s*(\d+),\s*(\d+)\)/) { push @{ $catalog{toasting} }, diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h index 833fad1..f253613 100644 --- a/src/include/catalog/indexing.h +++ b/src/include/catalog/indexing.h @@ -4,6 +4,9 @@ * This file provides some definitions to support indexing * on system catalogs * + * Caution: all #define's with numeric values in this file had better be + * object OIDs, else renumber_oids.pl might change them inappropriately. + * * * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California diff --git a/src/include/catalog/renumber_oids.pl b/src/include/catalog/renumber_oids.pl new file mode 100755 index 0000000..8a07340 --- /dev/null +++ b/src/include/catalog/renumber_oids.pl @@ -0,0 +1,291 @@ +#!/usr/bin/perl +#---------------------------------------------------------------------- +# +# renumber_oids.pl +# Perl script that shifts a range of OIDs in the Postgres catalog data +# to a different range, skipping any OIDs that are already in use. +# +# Note: This does not reformat the .dat files, so you may want +# to run reformat_dat_file.pl afterwards. +# +# Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group +# Portions Copyright (c) 1994, Regents of the University of California +# +# src/include/catalog/renumber_oids.pl +# +#---------------------------------------------------------------------- + +use strict; +use warnings; + +use FindBin; +use Getopt::Long; + +# Must run in src/include/catalog +chdir $FindBin::RealBin or die "could not cd to $FindBin::RealBin: $!\n"; + +use lib "$FindBin::RealBin/../../backend/catalog/"; +use Catalog; + +# We'll need this number. +my $FirstGenbkiObjectId = + Catalog::FindDefinedSymbol('access/transam.h', '..', 'FirstGenbkiObjectId'); + +# Process command line switches. +my $output_path = ''; +my $first_mapped_oid = 0; +my $last_mapped_oid = $FirstGenbkiObjectId - 1; +my $target_oid = 0; + +GetOptions( + 'output=s' => \$output_path, + 'first-mapped-oid=i' => \$first_mapped_oid, + 'last-mapped-oid=i' => \$last_mapped_oid, + 'target-oid=i' => \$target_oid) || usage(); + +# Sanity check arguments. +die "Unexpected non-switch arguments.\n" if @ARGV; +die "--first-mapped-oid must be specified.\n" + if $first_mapped_oid <= 0; +die "Empty mapped OID range.\n" + if $last_mapped_oid < $first_mapped_oid; +die "--target-oid must be specified.\n" + if $target_oid <= 0; +die "--target-oid must not be within mapped OID range.\n" + if $target_oid >= $first_mapped_oid && $target_oid <= $last_mapped_oid; + +# Make sure output_path ends in a slash. +if ($output_path ne '' && substr($output_path, -1) ne '/') +{ + $output_path .= '/'; +} + +# Collect all the existing assigned OIDs (including those to be remapped). +my @header_files = (glob("pg_*.h"), qw(indexing.h toasting.h)); +my $oids = Catalog::FindAllOidsFromHeaders(@header_files); + +# Hash-ify the existing OIDs for convenient lookup. +my %oidhash; +@oidhash{@$oids} = undef; + +# Select new OIDs for existing OIDs in the mapped range. 
+# We do this first so that we preserve the ordering of the mapped OIDs +# (for reproducibility's sake), and so that if we fail due to running out +# of OID room, that happens before we've overwritten any files. +my %maphash; +my $next_oid = $target_oid; + +for ( + my $mapped_oid = $first_mapped_oid; + $mapped_oid <= $last_mapped_oid; + $mapped_oid++) +{ + next if !exists $oidhash{$mapped_oid}; + $next_oid++ + while ( + exists $oidhash{$next_oid} + || ( $next_oid >= $first_mapped_oid + && $next_oid <= $last_mapped_oid)); + die "Reached FirstGenbkiObjectId before assigning all OIDs.\n" + if $next_oid >= $FirstGenbkiObjectId; + $maphash{$mapped_oid} = $next_oid; + $next_oid++; +} + +die "There are no OIDs in the mapped range.\n" if $next_oid == $target_oid; + +# Read each .h file and write out modified data. +foreach my $input_file (@header_files) +{ + $input_file =~ /(\w+)\.h$/ + or die "Input file $input_file needs to be a .h file.\n"; + my $catname = $1; + + # Ignore generated *_d.h files. + next if $catname =~ /_d$/; + + open(my $ifd, '<', $input_file) || die "$input_file: $!"; + + # Write output files to specified directory. + # Use a .tmp suffix, then rename into place, in case we're overwriting. + my $output_file = "$output_path$catname.h"; + my $tmp_output_file = "$output_file.tmp"; + open my $ofd, '>', $tmp_output_file + or die "can't open $tmp_output_file: $!"; + my $changed = 0; + + # Scan the input file. + while (<$ifd>) + { + my $line = $_; + + # Check for OID-defining macros that Catalog::ParseHeader knows about, + # and update OIDs as needed. + if ($line =~ m/^(DECLARE_TOAST\(\s*\w+,\s*)(\d+)(,\s*)(\d+)\)/) + { + my $oid2 = $2; + my $oid4 = $4; + if (exists $maphash{$oid2}) + { + $oid2 = $maphash{$oid2}; + my $repl = $1 . $oid2 . $3 . $oid4 . ")"; + $line =~ s/^DECLARE_TOAST\(\s*\w+,\s*\d+,\s*\d+\)/$repl/; + $changed = 1; + } + if (exists $maphash{$oid4}) + { + $oid4 = $maphash{$oid4}; + my $repl = $1 . $oid2 . $3 . $oid4 . ")"; + $line =~ s/^DECLARE_TOAST\(\s*\w+,\s*\d+,\s*\d+\)/$repl/; + $changed = 1; + } + } + elsif ( + $line =~ m/^(DECLARE_(UNIQUE_)?INDEX\(\s*\w+,\s*)(\d+)(,\s*.+)\)/) + { + if (exists $maphash{$3}) + { + my $repl = $1 . $maphash{$3} . $4 . ")"; + $line =~ + s/^DECLARE_(UNIQUE_)?INDEX\(\s*\w+,\s*\d+,\s*.+\)/$repl/; + $changed = 1; + } + } + elsif ($line =~ m/^CATALOG\((\w+),(\d+),(\w+)\)/) + { + if (exists $maphash{$2}) + { + my $repl = + "CATALOG(" . $1 . "," . $maphash{$2} . "," . $3 . ")"; + $line =~ s/^CATALOG\(\w+,\d+,\w+\)/$repl/; + $changed = 1; + } + + if ($line =~ m/BKI_ROWTYPE_OID\((\d+),(\w+)\)/) + { + if (exists $maphash{$1}) + { + my $repl = + "BKI_ROWTYPE_OID(" . $maphash{$1} . "," . $2 . ")"; + $line =~ s/BKI_ROWTYPE_OID\(\d+,\w+\)/$repl/; + $changed = 1; + } + } + } + + # In indexing.h and toasting.h only, check for #define SYM nnnn, + # and replace if within mapped range. + elsif ($line =~ m/^(\s*#\s*define\s+\w+\s+)(\d+)\b/) + { + if (($catname eq 'indexing' || $catname eq 'toasting') + && exists $maphash{$2}) + { + my $repl = $1 . $maphash{$2}; + $line =~ s/^\s*#\s*define\s+\w+\s+\d+\b/$repl/; + $changed = 1; + } + } + + print $ofd $line; + } + + close $ifd; + close $ofd; + + # Avoid updating files if we didn't change them. + if ($changed || $output_path ne '') + { + rename $tmp_output_file, $output_file + or die "can't rename $tmp_output_file to $output_file: $!"; + } + else + { + unlink $tmp_output_file + or die "can't unlink $tmp_output_file: $!"; + } +} + +# Likewise, read each .dat file and write out modified data. 
+foreach my $input_file (glob("pg_*.dat")) +{ + $input_file =~ /(\w+)\.dat$/ + or die "Input file $input_file needs to be a .dat file.\n"; + my $catname = $1; + + open(my $ifd, '<', $input_file) || die "$input_file: $!"; + + # Write output files to specified directory. + # Use a .tmp suffix, then rename into place, in case we're overwriting. + my $output_file = "$output_path$catname.dat"; + my $tmp_output_file = "$output_file.tmp"; + open my $ofd, '>', $tmp_output_file + or die "can't open $tmp_output_file: $!"; + my $changed = 0; + + # Scan the input file. + while (<$ifd>) + { + my $line = $_; + + # Check for oid => 'nnnn', and replace if within mapped range. + if ($line =~ m/\b(oid\s*=>\s*)'(\d+)'/) + { + if (exists $maphash{$2}) + { + my $repl = $1 . "'" . $maphash{$2} . "'"; + $line =~ s/\boid\s*=>\s*'\d+'/$repl/; + $changed = 1; + } + } + + # Likewise for array_type_oid. + if ($line =~ m/\b(array_type_oid\s*=>\s*)'(\d+)'/) + { + if (exists $maphash{$2}) + { + my $repl = $1 . "'" . $maphash{$2} . "'"; + $line =~ s/\barray_type_oid\s*=>\s*'\d+'/$repl/; + $changed = 1; + } + } + + print $ofd $line; + } + + close $ifd; + close $ofd; + + # Avoid updating files if we didn't change them. + if ($changed || $output_path ne '') + { + rename $tmp_output_file, $output_file + or die "can't rename $tmp_output_file to $output_file: $!"; + } + else + { + unlink $tmp_output_file + or die "can't unlink $tmp_output_file: $!"; + } +} + +sub usage +{ + my $last = $FirstGenbkiObjectId - 1; + die <<EOM; +Usage: renumber_oids.pl [--output PATH] --first-mapped-oid X [--last-mapped-oid Y] --target-oid Z + +Options: + --output PATH output directory (default '.') + --first-mapped-oid X first OID to be moved + --last-mapped-oid Y last OID to be moved (default $last) + --target-oid Z first OID to move to + +Catalog *.h and *.dat files are updated and written to the +output directory; by default, this overwrites the input files. + +Caution: the output PATH will be interpreted relative to +src/include/catalog, even if you start the script +in some other directory. + +EOM +} diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h index 3796971..5ee628c 100644 --- a/src/include/catalog/toasting.h +++ b/src/include/catalog/toasting.h @@ -3,6 +3,9 @@ * toasting.h * This file provides some definitions to support creation of toast tables * + * Caution: all #define's with numeric values in this file had better be + * object OIDs, else renumber_oids.pl might change them inappropriately. + * * * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California diff --git a/src/tools/RELEASE_CHANGES b/src/tools/RELEASE_CHANGES index edfd56e..920c6e9 100644 --- a/src/tools/RELEASE_CHANGES +++ b/src/tools/RELEASE_CHANGES @@ -58,11 +58,23 @@ in both HEAD and the branch. o doc/src/sgml/ref manual pages * Ports - o update config.guess and config.sub at the start of beta - (from http://savannah.gnu.org/projects/config) o update ports list in doc/src/sgml/installation.sgml o update platform-specific FAQ's, if needed + +Pre-Beta Tasks +============== + +These things should be done at least once per development cycle. +Typically we do them between feature freeze and start of beta test, +but there may be reasons to do them at other times as well. 
+ +* Renumber any manually-assigned OIDs between 8000 and 9999 + to lower numbers, using renumber_oids.pl (see notes in bki.sgml) + +* Update config.guess and config.sub + (from http://savannah.gnu.org/projects/config) + * Update inet/cidr data types with newest Bind patches
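Putting the pieces together: a pre-beta renumbering pass with the script as committed would look something like the following (the OID ranges are illustrative; renumber_oids.pl locates src/include/catalog on its own via FindBin, and reformat_dat_file.pl is run afterwards to restore canonical formatting, as the script's header comment notes):

    cd src/include/catalog
    perl renumber_oids.pl --first-mapped-oid 8000 --last-mapped-oid 9999 --target-oid 2000
    perl reformat_dat_file.pl *.dat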
I wrote: > I've successfully done check-world after renumbering every OID above > 4000 to somewhere else. I also tried renumbering everything below > 4000, which unsurprisingly blew up because there are various catalog > columns we haven't fixed to use symbolic OIDs. (The one that initdb > immediately trips over is pg_database.dattablespace.) I'm not sure > if it's worth the trouble to make that totally clean, but I suspect > we ought to at least mop up text-search references to be symbolic. > That's material for a separate patch though. So I couldn't resist poking at that, and after a couple hours' work I have the attached patch, which removes all remaining hard-wired OID references in the .dat files. Using this, I renumbered all the OIDs in include/catalog, and behold things pretty much worked. I got through check-world after hacking up these points: * Unsurprisingly, there are lots of regression tests that have object OIDs hard-wired in queries and/or expected output. * initdb.c has a couple of places that know that template1 has OID 1. * information_schema.sql has several SQL-language functions that hard-wire the OIDs of assorted built-in types. I'm not particularly fussed about the first two points, but the last is a bit worrisome. It's not too hard to imagine somebody adding knowledge of their new type to those functions, and the code getting broken by a renumbering pass, and us not noticing if the point isn't stressed by a regression test (which mostly those functions aren't). We could imagine fixing those functions along the lines of

    CASE WHEN $2 = -1 /* default typmod */
         THEN null
-        WHEN $1 IN (1042, 1043) /* char, varchar */
+        WHEN $1 IN ('pg_catalog.bpchar'::pg_catalog.regtype,
+                    'pg_catalog.varchar'::pg_catalog.regtype)
         THEN $2 - 4

which would add some parsing overhead, but I'm not sure if anyone would notice that. regards, tom lane

diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 10c2b24..ce159ae 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -174,6 +174,20 @@ foreach my $row (@{ $catalog_data{pg_am} })
 $amoids{ $row->{amname} } = $row->{oid};
 }

+# class (relation) OID lookup (note this only covers bootstrap catalogs!)
+my %classoids; +foreach my $row (@{ $catalog_data{pg_class} }) +{ + $classoids{ $row->{relname} } = $row->{oid}; +} + +# collation OID lookup +my %collationoids; +foreach my $row (@{ $catalog_data{pg_collation} }) +{ + $collationoids{ $row->{collname} } = $row->{oid}; +} + # language OID lookup my %langoids; foreach my $row (@{ $catalog_data{pg_language} }) @@ -243,6 +257,41 @@ foreach my $row (@{ $catalog_data{pg_proc} }) } } +# tablespace OID lookup +my %tablespaceoids; +foreach my $row (@{ $catalog_data{pg_tablespace} }) +{ + $tablespaceoids{ $row->{spcname} } = $row->{oid}; +} + +# text search configuration OID lookup +my %tsconfigoids; +foreach my $row (@{ $catalog_data{pg_ts_config} }) +{ + $tsconfigoids{ $row->{cfgname} } = $row->{oid}; +} + +# text search dictionary OID lookup +my %tsdictoids; +foreach my $row (@{ $catalog_data{pg_ts_dict} }) +{ + $tsdictoids{ $row->{dictname} } = $row->{oid}; +} + +# text search parser OID lookup +my %tsparseroids; +foreach my $row (@{ $catalog_data{pg_ts_parser} }) +{ + $tsparseroids{ $row->{prsname} } = $row->{oid}; +} + +# text search template OID lookup +my %tstemplateoids; +foreach my $row (@{ $catalog_data{pg_ts_template} }) +{ + $tstemplateoids{ $row->{tmplname} } = $row->{oid}; +} + # type lookups my %typeoids; my %types; @@ -287,14 +336,21 @@ close $ef; # Map lookup name to the corresponding hash table. my %lookup_kind = ( - pg_am => \%amoids, - pg_language => \%langoids, - pg_opclass => \%opcoids, - pg_operator => \%operoids, - pg_opfamily => \%opfoids, - pg_proc => \%procoids, - pg_type => \%typeoids, - encoding => \%encids); + pg_am => \%amoids, + pg_class => \%classoids, + pg_collation => \%collationoids, + pg_language => \%langoids, + pg_opclass => \%opcoids, + pg_operator => \%operoids, + pg_opfamily => \%opfoids, + pg_proc => \%procoids, + pg_tablespace => \%tablespaceoids, + pg_ts_config => \%tsconfigoids, + pg_ts_dict => \%tsdictoids, + pg_ts_parser => \%tsparseroids, + pg_ts_template => \%tstemplateoids, + pg_type => \%typeoids, + encoding => \%encids); # Open temp files @@ -730,8 +786,8 @@ sub morph_row_for_pgattr $row->{attndims} = $type->{typcategory} eq 'A' ? '1' : '0'; # collation-aware catalog columns must use C collation - $row->{attcollation} = $type->{typcollation} != 0 ? - $C_COLLATION_OID : 0; + $row->{attcollation} = + $type->{typcollation} ne '0' ? 
$C_COLLATION_OID : 0; if (defined $attr->{forcenotnull}) { diff --git a/src/include/catalog/pg_class.dat b/src/include/catalog/pg_class.dat index 5a1f3aa..511d876 100644 --- a/src/include/catalog/pg_class.dat +++ b/src/include/catalog/pg_class.dat @@ -21,48 +21,44 @@ # similarly, "1" in relminmxid stands for FirstMultiXactId { oid => '1247', - relname => 'pg_type', relnamespace => 'PGNSP', reltype => '71', - reloftype => '0', relowner => 'PGUID', relam => 'PGHEAPAM', - relfilenode => '0', reltablespace => '0', relpages => '0', reltuples => '0', - relallvisible => '0', reltoastrelid => '0', relhasindex => 'f', - relisshared => 'f', relpersistence => 'p', relkind => 'r', relnatts => '31', - relchecks => '0', relhasrules => 'f', relhastriggers => 'f', - relhassubclass => 'f', relrowsecurity => 'f', relforcerowsecurity => 'f', - relispopulated => 't', relreplident => 'n', relispartition => 'f', - relrewrite => '0', relfrozenxid => '3', relminmxid => '1', relacl => '_null_', + relname => 'pg_type', reltype => 'pg_type', relam => 'PGHEAPAM', + relfilenode => '0', relpages => '0', reltuples => '0', relallvisible => '0', + reltoastrelid => '0', relhasindex => 'f', relisshared => 'f', + relpersistence => 'p', relkind => 'r', relnatts => '31', relchecks => '0', + relhasrules => 'f', relhastriggers => 'f', relhassubclass => 'f', + relrowsecurity => 'f', relforcerowsecurity => 'f', relispopulated => 't', + relreplident => 'n', relispartition => 'f', relrewrite => '0', + relfrozenxid => '3', relminmxid => '1', relacl => '_null_', reloptions => '_null_', relpartbound => '_null_' }, { oid => '1249', - relname => 'pg_attribute', relnamespace => 'PGNSP', reltype => '75', - reloftype => '0', relowner => 'PGUID', relam => 'PGHEAPAM', - relfilenode => '0', reltablespace => '0', relpages => '0', reltuples => '0', - relallvisible => '0', reltoastrelid => '0', relhasindex => 'f', - relisshared => 'f', relpersistence => 'p', relkind => 'r', relnatts => '24', - relchecks => '0', relhasrules => 'f', relhastriggers => 'f', - relhassubclass => 'f', relrowsecurity => 'f', relforcerowsecurity => 'f', - relispopulated => 't', relreplident => 'n', relispartition => 'f', - relrewrite => '0', relfrozenxid => '3', relminmxid => '1', relacl => '_null_', + relname => 'pg_attribute', reltype => 'pg_attribute', relam => 'PGHEAPAM', + relfilenode => '0', relpages => '0', reltuples => '0', relallvisible => '0', + reltoastrelid => '0', relhasindex => 'f', relisshared => 'f', + relpersistence => 'p', relkind => 'r', relnatts => '24', relchecks => '0', + relhasrules => 'f', relhastriggers => 'f', relhassubclass => 'f', + relrowsecurity => 'f', relforcerowsecurity => 'f', relispopulated => 't', + relreplident => 'n', relispartition => 'f', relrewrite => '0', + relfrozenxid => '3', relminmxid => '1', relacl => '_null_', reloptions => '_null_', relpartbound => '_null_' }, { oid => '1255', - relname => 'pg_proc', relnamespace => 'PGNSP', reltype => '81', - reloftype => '0', relowner => 'PGUID', relam => 'PGHEAPAM', - relfilenode => '0', reltablespace => '0', relpages => '0', reltuples => '0', - relallvisible => '0', reltoastrelid => '0', relhasindex => 'f', - relisshared => 'f', relpersistence => 'p', relkind => 'r', relnatts => '29', - relchecks => '0', relhasrules => 'f', relhastriggers => 'f', - relhassubclass => 'f', relrowsecurity => 'f', relforcerowsecurity => 'f', - relispopulated => 't', relreplident => 'n', relispartition => 'f', - relrewrite => '0', relfrozenxid => '3', relminmxid => '1', relacl => '_null_', + relname => 
'pg_proc', reltype => 'pg_proc', relam => 'PGHEAPAM', + relfilenode => '0', relpages => '0', reltuples => '0', relallvisible => '0', + reltoastrelid => '0', relhasindex => 'f', relisshared => 'f', + relpersistence => 'p', relkind => 'r', relnatts => '29', relchecks => '0', + relhasrules => 'f', relhastriggers => 'f', relhassubclass => 'f', + relrowsecurity => 'f', relforcerowsecurity => 'f', relispopulated => 't', + relreplident => 'n', relispartition => 'f', relrewrite => '0', + relfrozenxid => '3', relminmxid => '1', relacl => '_null_', reloptions => '_null_', relpartbound => '_null_' }, { oid => '1259', - relname => 'pg_class', relnamespace => 'PGNSP', reltype => '83', - reloftype => '0', relowner => 'PGUID', relam => 'PGHEAPAM', - relfilenode => '0', reltablespace => '0', relpages => '0', reltuples => '0', - relallvisible => '0', reltoastrelid => '0', relhasindex => 'f', - relisshared => 'f', relpersistence => 'p', relkind => 'r', relnatts => '33', - relchecks => '0', relhasrules => 'f', relhastriggers => 'f', - relhassubclass => 'f', relrowsecurity => 'f', relforcerowsecurity => 'f', - relispopulated => 't', relreplident => 'n', relispartition => 'f', - relrewrite => '0', relfrozenxid => '3', relminmxid => '1', relacl => '_null_', + relname => 'pg_class', reltype => 'pg_class', relam => 'PGHEAPAM', + relfilenode => '0', relpages => '0', reltuples => '0', relallvisible => '0', + reltoastrelid => '0', relhasindex => 'f', relisshared => 'f', + relpersistence => 'p', relkind => 'r', relnatts => '33', relchecks => '0', + relhasrules => 'f', relhastriggers => 'f', relhassubclass => 'f', + relrowsecurity => 'f', relforcerowsecurity => 'f', relispopulated => 't', + relreplident => 'n', relispartition => 'f', relrewrite => '0', + relfrozenxid => '3', relminmxid => '1', relacl => '_null_', reloptions => '_null_', relpartbound => '_null_' }, ] diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h index ad698c9..d88703d 100644 --- a/src/include/catalog/pg_class.h +++ b/src/include/catalog/pg_class.h @@ -28,56 +28,113 @@ */ CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,RelationRelation_Rowtype_Id) BKI_SCHEMA_MACRO { - Oid oid; /* oid */ - NameData relname; /* class name */ - Oid relnamespace; /* OID of namespace containing this class */ - Oid reltype; /* OID of entry in pg_type for table's - * implicit row type */ - Oid reloftype; /* OID of entry in pg_type for underlying - * composite type */ - Oid relowner; /* class owner */ - Oid relam; /* access method; 0 if not a table / index */ - Oid relfilenode; /* identifier of physical storage file */ + /* oid */ + Oid oid; + /* class name */ + NameData relname; + + /* OID of namespace containing this class */ + Oid relnamespace BKI_DEFAULT(PGNSP); + + /* OID of entry in pg_type for table's implicit row type */ + Oid reltype BKI_LOOKUP(pg_type); + + /* OID of entry in pg_type for underlying composite type */ + Oid reloftype BKI_DEFAULT(0) BKI_LOOKUP(pg_type); + + /* class owner */ + Oid relowner BKI_DEFAULT(PGUID); + + /* access method; 0 if not a table / index */ + Oid relam; + + /* identifier of physical storage file */ /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */ - Oid reltablespace; /* identifier of table space for relation */ - int32 relpages; /* # of blocks (not always up-to-date) */ - float4 reltuples; /* # of tuples (not always up-to-date) */ - int32 relallvisible; /* # of all-visible blocks (not always - * up-to-date) */ - Oid reltoastrelid; /* OID of toast table; 0 
if none */ - bool relhasindex; /* T if has (or has had) any indexes */ - bool relisshared; /* T if shared across databases */ - char relpersistence; /* see RELPERSISTENCE_xxx constants below */ - char relkind; /* see RELKIND_xxx constants below */ - int16 relnatts; /* number of user attributes */ + Oid relfilenode; + + /* identifier of table space for relation (0 means default for database) */ + Oid reltablespace BKI_DEFAULT(0) BKI_LOOKUP(pg_tablespace); + + /* # of blocks (not always up-to-date) */ + int32 relpages; + + /* # of tuples (not always up-to-date) */ + float4 reltuples; + + /* # of all-visible blocks (not always up-to-date) */ + int32 relallvisible; + + /* OID of toast table; 0 if none */ + Oid reltoastrelid; + + /* T if has (or has had) any indexes */ + bool relhasindex; + + /* T if shared across databases */ + bool relisshared; + + /* see RELPERSISTENCE_xxx constants below */ + char relpersistence; + + /* see RELKIND_xxx constants below */ + char relkind; + + /* number of user attributes */ + int16 relnatts; /* * Class pg_attribute must contain exactly "relnatts" user attributes * (with attnums ranging from 1 to relnatts) for this class. It may also * contain entries with negative attnums for system attributes. */ - int16 relchecks; /* # of CHECK constraints for class */ - bool relhasrules; /* has (or has had) any rules */ - bool relhastriggers; /* has (or has had) any TRIGGERs */ - bool relhassubclass; /* has (or has had) child tables or indexes */ - bool relrowsecurity; /* row security is enabled or not */ - bool relforcerowsecurity; /* row security forced for owners or - * not */ - bool relispopulated; /* matview currently holds query results */ - char relreplident; /* see REPLICA_IDENTITY_xxx constants */ - bool relispartition; /* is relation a partition? */ - Oid relrewrite; /* heap for rewrite during DDL, link to - * original rel */ - TransactionId relfrozenxid; /* all Xids < this are frozen in this rel */ - TransactionId relminmxid; /* all multixacts in this rel are >= this. - * this is really a MultiXactId */ + + /* # of CHECK constraints for class */ + int16 relchecks; + + /* has (or has had) any rules */ + bool relhasrules; + + /* has (or has had) any TRIGGERs */ + bool relhastriggers; + + /* has (or has had) child tables or indexes */ + bool relhassubclass; + + /* row security is enabled or not */ + bool relrowsecurity; + + /* row security forced for owners or not */ + bool relforcerowsecurity; + + /* matview currently holds query results */ + bool relispopulated; + + /* see REPLICA_IDENTITY_xxx constants */ + char relreplident; + + /* is relation a partition? */ + bool relispartition; + + /* heap for rewrite during DDL, link to original rel */ + Oid relrewrite; + + /* all Xids < this are frozen in this rel */ + TransactionId relfrozenxid; + + /* all multixacts in this rel are >= this; it is really a MultiXactId */ + TransactionId relminmxid; #ifdef CATALOG_VARLEN /* variable-length fields start here */ /* NOTE: These fields are not present in a relcache entry's rd_rel field. 
*/ - aclitem relacl[1]; /* access permissions */ - text reloptions[1]; /* access-method-specific options */ - pg_node_tree relpartbound; /* partition bound node tree */ + /* access permissions */ + aclitem relacl[1]; + + /* access-method-specific options */ + text reloptions[1]; + + /* partition bound node tree */ + pg_node_tree relpartbound; #endif } FormData_pg_class; diff --git a/src/include/catalog/pg_database.dat b/src/include/catalog/pg_database.dat index cbd91bc..89bd75d 100644 --- a/src/include/catalog/pg_database.dat +++ b/src/include/catalog/pg_database.dat @@ -14,10 +14,9 @@ { oid => '1', oid_symbol => 'TemplateDbOid', descr => 'default template for new databases', - datname => 'template1', datdba => 'PGUID', encoding => 'ENCODING', - datcollate => 'LC_COLLATE', datctype => 'LC_CTYPE', datistemplate => 't', - datallowconn => 't', datconnlimit => '-1', datlastsysoid => '0', - datfrozenxid => '0', datminmxid => '1', dattablespace => '1663', - datacl => '_null_' }, + datname => 'template1', encoding => 'ENCODING', datcollate => 'LC_COLLATE', + datctype => 'LC_CTYPE', datistemplate => 't', datallowconn => 't', + datconnlimit => '-1', datlastsysoid => '0', datfrozenxid => '0', + datminmxid => '1', dattablespace => 'pg_default', datacl => '_null_' }, ] diff --git a/src/include/catalog/pg_database.h b/src/include/catalog/pg_database.h index 63e8efa..06fea45 100644 --- a/src/include/catalog/pg_database.h +++ b/src/include/catalog/pg_database.h @@ -28,22 +28,48 @@ */ CATALOG(pg_database,1262,DatabaseRelationId) BKI_SHARED_RELATION BKI_ROWTYPE_OID(1248,DatabaseRelation_Rowtype_Id) BKI_SCHEMA_MACRO { - Oid oid; /* oid */ - NameData datname; /* database name */ - Oid datdba; /* owner of database */ - int32 encoding; /* character encoding */ - NameData datcollate; /* LC_COLLATE setting */ - NameData datctype; /* LC_CTYPE setting */ - bool datistemplate; /* allowed as CREATE DATABASE template? */ - bool datallowconn; /* new connections allowed? */ - int32 datconnlimit; /* max connections allowed (-1=no limit) */ - Oid datlastsysoid; /* highest OID to consider a system OID */ - TransactionId datfrozenxid; /* all Xids < this are frozen in this DB */ - TransactionId datminmxid; /* all multixacts in the DB are >= this */ - Oid dattablespace; /* default table space for this DB */ + /* oid */ + Oid oid; + + /* database name */ + NameData datname; + + /* owner of database */ + Oid datdba BKI_DEFAULT(PGUID); + + /* character encoding */ + int32 encoding; + + /* LC_COLLATE setting */ + NameData datcollate; + + /* LC_CTYPE setting */ + NameData datctype; + + /* allowed as CREATE DATABASE template? */ + bool datistemplate; + + /* new connections allowed? 
*/ + bool datallowconn; + + /* max connections allowed (-1=no limit) */ + int32 datconnlimit; + + /* highest OID to consider a system OID */ + Oid datlastsysoid; + + /* all Xids < this are frozen in this DB */ + TransactionId datfrozenxid; + + /* all multixacts in the DB are >= this */ + TransactionId datminmxid; + + /* default table space for this DB */ + Oid dattablespace BKI_LOOKUP(pg_tablespace); #ifdef CATALOG_VARLEN /* variable-length fields start here */ - aclitem datacl[1]; /* access permissions */ + /* access permissions */ + aclitem datacl[1]; #endif } FormData_pg_database; diff --git a/src/include/catalog/pg_ts_config.dat b/src/include/catalog/pg_ts_config.dat index 47d2342..bddaa81 100644 --- a/src/include/catalog/pg_ts_config.dat +++ b/src/include/catalog/pg_ts_config.dat @@ -13,7 +13,6 @@ [ { oid => '3748', descr => 'simple configuration', - cfgname => 'simple', cfgnamespace => 'PGNSP', cfgowner => 'PGUID', - cfgparser => '3722' }, + cfgname => 'simple', cfgparser => 'default' }, ] diff --git a/src/include/catalog/pg_ts_config.h b/src/include/catalog/pg_ts_config.h index 4531c3e..7ab97a8 100644 --- a/src/include/catalog/pg_ts_config.h +++ b/src/include/catalog/pg_ts_config.h @@ -29,11 +29,20 @@ */ CATALOG(pg_ts_config,3602,TSConfigRelationId) { - Oid oid; /* oid */ - NameData cfgname; /* name of configuration */ - Oid cfgnamespace; /* name space */ - Oid cfgowner; /* owner */ - Oid cfgparser; /* OID of parser (in pg_ts_parser) */ + /* oid */ + Oid oid; + + /* name of configuration */ + NameData cfgname; + + /* name space */ + Oid cfgnamespace BKI_DEFAULT(PGNSP); + + /* owner */ + Oid cfgowner BKI_DEFAULT(PGUID); + + /* OID of parser */ + Oid cfgparser BKI_LOOKUP(pg_ts_parser); } FormData_pg_ts_config; typedef FormData_pg_ts_config *Form_pg_ts_config; diff --git a/src/include/catalog/pg_ts_config_map.dat b/src/include/catalog/pg_ts_config_map.dat index 3ce4e16..43a8bd4 100644 --- a/src/include/catalog/pg_ts_config_map.dat +++ b/src/include/catalog/pg_ts_config_map.dat @@ -12,24 +12,43 @@ [ -{ mapcfg => '3748', maptokentype => '1', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '2', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '3', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '4', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '5', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '6', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '7', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '8', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '9', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '10', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '11', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '15', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '16', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '17', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '18', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '19', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '20', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '21', mapseqno => '1', mapdict => '3765' }, -{ mapcfg => '3748', maptokentype => '22', mapseqno => '1', mapdict 
=> '3765' }, +{ mapcfg => 'simple', maptokentype => '1', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '2', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '3', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '4', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '5', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '6', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '7', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '8', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '9', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '10', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '11', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '15', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '16', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '17', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '18', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '19', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '20', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '21', mapseqno => '1', + mapdict => 'simple' }, +{ mapcfg => 'simple', maptokentype => '22', mapseqno => '1', + mapdict => 'simple' }, ] diff --git a/src/include/catalog/pg_ts_config_map.h b/src/include/catalog/pg_ts_config_map.h index dd2de48..7892e17 100644 --- a/src/include/catalog/pg_ts_config_map.h +++ b/src/include/catalog/pg_ts_config_map.h @@ -29,10 +29,17 @@ */ CATALOG(pg_ts_config_map,3603,TSConfigMapRelationId) { - Oid mapcfg; /* OID of configuration owning this entry */ - int32 maptokentype; /* token type from parser */ - int32 mapseqno; /* order in which to consult dictionaries */ - Oid mapdict; /* dictionary to consult */ + /* OID of configuration owning this entry */ + Oid mapcfg BKI_LOOKUP(pg_ts_config); + + /* token type from parser */ + int32 maptokentype; + + /* order in which to consult dictionaries */ + int32 mapseqno; + + /* dictionary to consult */ + Oid mapdict BKI_LOOKUP(pg_ts_dict); } FormData_pg_ts_config_map; typedef FormData_pg_ts_config_map *Form_pg_ts_config_map; diff --git a/src/include/catalog/pg_ts_dict.dat b/src/include/catalog/pg_ts_dict.dat index cda413a..f9d50da 100644 --- a/src/include/catalog/pg_ts_dict.dat +++ b/src/include/catalog/pg_ts_dict.dat @@ -14,7 +14,6 @@ { oid => '3765', descr => 'simple dictionary: just lower case and check for stopword', - dictname => 'simple', dictnamespace => 'PGNSP', dictowner => 'PGUID', - dicttemplate => '3727', dictinitoption => '_null_' }, + dictname => 'simple', dicttemplate => 'simple', dictinitoption => '_null_' }, ] diff --git a/src/include/catalog/pg_ts_dict.h b/src/include/catalog/pg_ts_dict.h index b77c422..be7f016 100644 --- a/src/include/catalog/pg_ts_dict.h +++ b/src/include/catalog/pg_ts_dict.h @@ -28,14 +28,24 @@ */ CATALOG(pg_ts_dict,3600,TSDictionaryRelationId) { - Oid oid; /* oid */ - NameData dictname; /* dictionary name */ - Oid dictnamespace; /* name space */ - Oid dictowner; /* owner */ - Oid dicttemplate; /* dictionary's template */ + /* oid */ + Oid oid; + + /* dictionary name */ + NameData dictname; 
+ + /* name space */ + Oid dictnamespace BKI_DEFAULT(PGNSP); + + /* owner */ + Oid dictowner BKI_DEFAULT(PGUID); + + /* dictionary's template */ + Oid dicttemplate BKI_LOOKUP(pg_ts_template); #ifdef CATALOG_VARLEN /* variable-length fields start here */ - text dictinitoption; /* options passed to dict_init() */ + /* options passed to dict_init() */ + text dictinitoption; #endif } FormData_pg_ts_dict; diff --git a/src/include/catalog/pg_type.dat b/src/include/catalog/pg_type.dat index d129583..2495ed6 100644 --- a/src/include/catalog/pg_type.dat +++ b/src/include/catalog/pg_type.dat @@ -54,7 +54,7 @@ typname => 'name', typlen => 'NAMEDATALEN', typbyval => 'f', typcategory => 'S', typelem => 'char', typinput => 'namein', typoutput => 'nameout', typreceive => 'namerecv', typsend => 'namesend', - typalign => 'c', typcollation => '950' }, + typalign => 'c', typcollation => 'C' }, { oid => '20', array_type_oid => '1016', descr => '~18 digit integer, 8-byte storage', typname => 'int8', typlen => '8', typbyval => 'FLOAT8PASSBYVAL', @@ -85,7 +85,7 @@ typname => 'text', typlen => '-1', typbyval => 'f', typcategory => 'S', typispreferred => 't', typinput => 'textin', typoutput => 'textout', typreceive => 'textrecv', typsend => 'textsend', typalign => 'i', - typstorage => 'x', typcollation => '100' }, + typstorage => 'x', typcollation => 'default' }, { oid => '26', array_type_oid => '1028', descr => 'object identifier(oid), maximum 4 billion', typname => 'oid', typlen => '4', typbyval => 't', typcategory => 'N', @@ -115,22 +115,22 @@ # NB: OIDs assigned here must match the BKI_ROWTYPE_OID declarations { oid => '71', typname => 'pg_type', typlen => '-1', typbyval => 'f', typtype => 'c', - typcategory => 'C', typrelid => '1247', typinput => 'record_in', + typcategory => 'C', typrelid => 'pg_type', typinput => 'record_in', typoutput => 'record_out', typreceive => 'record_recv', typsend => 'record_send', typalign => 'd', typstorage => 'x' }, { oid => '75', typname => 'pg_attribute', typlen => '-1', typbyval => 'f', typtype => 'c', - typcategory => 'C', typrelid => '1249', typinput => 'record_in', + typcategory => 'C', typrelid => 'pg_attribute', typinput => 'record_in', typoutput => 'record_out', typreceive => 'record_recv', typsend => 'record_send', typalign => 'd', typstorage => 'x' }, { oid => '81', typname => 'pg_proc', typlen => '-1', typbyval => 'f', typtype => 'c', - typcategory => 'C', typrelid => '1255', typinput => 'record_in', + typcategory => 'C', typrelid => 'pg_proc', typinput => 'record_in', typoutput => 'record_out', typreceive => 'record_recv', typsend => 'record_send', typalign => 'd', typstorage => 'x' }, { oid => '83', typname => 'pg_class', typlen => '-1', typbyval => 'f', typtype => 'c', - typcategory => 'C', typrelid => '1259', typinput => 'record_in', + typcategory => 'C', typrelid => 'pg_class', typinput => 'record_in', typoutput => 'record_out', typreceive => 'record_recv', typsend => 'record_send', typalign => 'd', typstorage => 'x' }, @@ -150,21 +150,21 @@ typcategory => 'S', typinput => 'pg_node_tree_in', typoutput => 'pg_node_tree_out', typreceive => 'pg_node_tree_recv', typsend => 'pg_node_tree_send', typalign => 'i', typstorage => 'x', - typcollation => '100' }, + typcollation => 'default' }, { oid => '3361', oid_symbol => 'PGNDISTINCTOID', descr => 'multivariate ndistinct coefficients', typname => 'pg_ndistinct', typlen => '-1', typbyval => 'f', typcategory => 'S', typinput => 'pg_ndistinct_in', typoutput => 'pg_ndistinct_out', typreceive => 'pg_ndistinct_recv', typsend => 
'pg_ndistinct_send', typalign => 'i', typstorage => 'x', - typcollation => '100' }, + typcollation => 'default' }, { oid => '3402', oid_symbol => 'PGDEPENDENCIESOID', descr => 'multivariate dependencies', typname => 'pg_dependencies', typlen => '-1', typbyval => 'f', typcategory => 'S', typinput => 'pg_dependencies_in', typoutput => 'pg_dependencies_out', typreceive => 'pg_dependencies_recv', typsend => 'pg_dependencies_send', typalign => 'i', typstorage => 'x', - typcollation => '100' }, + typcollation => 'default' }, { oid => '32', oid_symbol => 'PGDDLCOMMANDOID', descr => 'internal type for passing CollectedCommand', typname => 'pg_ddl_command', typlen => 'SIZEOF_POINTER', typbyval => 't', @@ -269,14 +269,14 @@ typinput => 'bpcharin', typoutput => 'bpcharout', typreceive => 'bpcharrecv', typsend => 'bpcharsend', typmodin => 'bpchartypmodin', typmodout => 'bpchartypmodout', typalign => 'i', typstorage => 'x', - typcollation => '100' }, + typcollation => 'default' }, { oid => '1043', array_type_oid => '1015', descr => 'varchar(length), non-blank-padded string, variable storage length', typname => 'varchar', typlen => '-1', typbyval => 'f', typcategory => 'S', typinput => 'varcharin', typoutput => 'varcharout', typreceive => 'varcharrecv', typsend => 'varcharsend', typmodin => 'varchartypmodin', typmodout => 'varchartypmodout', - typalign => 'i', typstorage => 'x', typcollation => '100' }, + typalign => 'i', typstorage => 'x', typcollation => 'default' }, { oid => '1082', array_type_oid => '1182', descr => 'date', typname => 'date', typlen => '4', typbyval => 't', typcategory => 'D', typinput => 'date_in', typoutput => 'date_out', typreceive => 'date_recv', diff --git a/src/include/catalog/pg_type.h b/src/include/catalog/pg_type.h index 24d114b..4a24739 100644 --- a/src/include/catalog/pg_type.h +++ b/src/include/catalog/pg_type.h @@ -99,7 +99,7 @@ CATALOG(pg_type,1247,TypeRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(71,TypeRelati char typdelim BKI_DEFAULT(','); /* associated pg_class OID if a composite type, else 0 */ - Oid typrelid BKI_DEFAULT(0) BKI_ARRAY_DEFAULT(0); + Oid typrelid BKI_DEFAULT(0) BKI_ARRAY_DEFAULT(0) BKI_LOOKUP(pg_class); /* * If typelem is not 0 then it identifies another row in pg_type. The @@ -215,7 +215,7 @@ CATALOG(pg_type,1247,TypeRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(71,TypeRelati * DEFAULT_COLLATION_OID) for collatable base types, possibly some other * OID for domains over collatable types */ - Oid typcollation BKI_DEFAULT(0); + Oid typcollation BKI_DEFAULT(0) BKI_LOOKUP(pg_collation); #ifdef CATALOG_VARLEN /* variable-length fields start here */
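To make the BKI_LOOKUP flow in this patch concrete: at genbki time, each lookup column's value is resolved through one of those per-catalog name-to-OID hashes built from the .dat files, conceptually like this simplified sketch (hypothetical code, not the actual genbki.pl logic; the two tablespace OIDs shown are the real pg_default and pg_global values):

    #!/usr/bin/perl
    # Simplified sketch of BKI_LOOKUP resolution: replace a symbolic
    # value such as 'pg_default' with its OID, passing values that are
    # already numeric through unchanged.
    use strict;
    use warnings;

    my %tablespaceoids = ('pg_default' => 1663, 'pg_global' => 1664);

    sub lookup_oid
    {
        my ($lookup, $value) = @_;
        return $value if $value =~ /^\d+$/;    # already numeric
        die "unrecognized name \"$value\"\n" unless exists $lookup->{$value};
        return $lookup->{$value};
    }

    print lookup_oid(\%tablespaceoids, 'pg_default'), "\n";    # 1663
    print lookup_oid(\%tablespaceoids, '1664'), "\n";          # 1664

This is why pg_database.dat can now say dattablespace => 'pg_default' while hand-written numeric OIDs elsewhere keep working.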
On Tue, Mar 12, 2019 at 5:36 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > This seems committable from my end --- any further comments? I gave it a read and it looks good to me, but I haven't tried to run it. -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
John Naylor <john.naylor@2ndquadrant.com> writes: > On Tue, Mar 12, 2019 at 5:36 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> This seems committable from my end --- any further comments? > I gave it a read and it looks good to me, but I haven't tried to run it. Thanks for checking. I've pushed both patches now. I noticed while looking at the pg_class data that someone had stuck in a hack to make genbki.pl substitute for "PGHEAPAM", which AFAICS is just following the bad old precedent of PGNSP and PGUID. I got rid of that in favor of using the already-existing BKI_LOOKUP(pg_am) mechanism. Maybe someday we should try to get rid of PGNSP and PGUID too, although there are stumbling blocks in the way of both: * PGNSP is also substituted for in the bodies of some SQL procedures. * Replacing PGUID with the actual name of the bootstrap superuser is a bit problematic because that name isn't necessarily "postgres". We could probably make it work, but I'm not convinced it'd be any less confusing than the existing special-case behavior is. Anyway I think we're basically done here. There's some additional cleanup that could possibly be done, like removing the hard-wired references to OID 1 in initdb.c. But I'm having a hard time convincing myself that it's worth the trouble, except maybe for the question of information_schema.sql's hard-wired type OIDs. Even there, it's certainly possible for a patch to use a regtype constant even if the existing code doesn't. regards, tom lane
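To spell out why PGNSP and PGUID are a different animal from BKI_LOOKUP: they are fixed token substitutions rather than name lookups against catalog data. A simplified, hypothetical sketch (the real mechanism pulls these values from the catalog headers rather than hard-coding them):

    #!/usr/bin/perl
    # PGNSP and PGUID resolve to fixed OIDs regardless of catalog
    # contents; conceptually just a token-to-value table.
    use strict;
    use warnings;

    my %special = (
        PGNSP => 11,    # OID of the pg_catalog namespace
        PGUID => 10,    # OID of the bootstrap superuser
    );

    foreach my $value ('PGNSP', 'PGUID', '1663')
    {
        my $out = exists $special{$value} ? $special{$value} : $value;
        print "$value => $out\n";
    }
    # prints PGNSP => 11, PGUID => 10, 1663 => 1663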