Thread: Why don't we have a small reserved OID range for patch revisions?

Why don't we have a small reserved OID range for patch revisions?

From
Peter Geoghegan
Date:
Why don't we provide a small reserved OID range that can be used by
patch authors temporarily, with the expectation that they'll be
replaced by "real" OIDs at the point the patch gets committed? This
would be similar to the situation with catversion bumps -- we don't
expect patches that will eventually need them to have them.

It's considered good practice to choose an OID that's at the beginning
of the range shown by the unused_oids script, so naturally there is a
good chance that any patch that adds a system catalog entry will bit
rot prematurely. This seems totally unnecessary to me. You could even
have a replace_oids script under this system. That would replace the
known-temporary OIDs with mapped contiguous real values at the time of
commit (maybe it would just print out which permanent OIDs to use in
place of the temporary ones, and leave the rest up to the committer).
I don't do Perl, so I'm not volunteering for this.

-- 
Peter Geoghegan


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
Peter Geoghegan <pg@bowt.ie> writes:
> Why don't we provide a small reserved OID range that can be used by
> patch authors temporarily, with the expectation that they'll be
> replaced by "real" OIDs at the point the patch gets committed? This
> would be similar to the situation with catversion bumps -- we don't
> expect patches that will eventually need them to have them.

Quite a few people have used OIDs up around 8000 or 9000 for this purpose;
I doubt we need a formally reserved range for it.  The main problem with
doing it is the hazard that the patch'll get committed just like that,
suddenly breaking things for everyone else doing likewise.

(I would argue, in fact, that the reason we have any preassigned OIDs
above perhaps 6000 is that exactly this has happened before.)

A script such as you suggest might be a good way to reduce the temptation
to get lazy at the last minute.  Now that the catalog data is pretty
machine-readable, I suspect it wouldn't be very hard --- though I'm
not volunteering either.  I'm envisioning something simple like "renumber
all OIDs in range mmmm-nnnn into range xxxx-yyyy", perhaps with the
ability to skip any already-used OIDs in the target range.
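
A minimal sketch of that kind of pass, assuming the .dat files are edited
as plain text and every OID to move appears as a literal oid => 'NNNN'
assignment (a real tool would also have to deal with array type OIDs and
symbolic cross-references), might look like:

# Hypothetical sketch, not a finished tool: move OIDs in a source range
# to free values in a target range across the given .dat files.
use strict;
use warnings;

my ($src_lo, $src_hi, $next) = (9000, 9999, 2000);
my (%in_use, %map);
my @files = @ARGV;

# Pass 1: note every OID already present, so already-used values in the
# target range can be skipped over.
for my $file (@files)
{
    open my $fh, '<', $file or die "$file: $!";
    while (<$fh>) { $in_use{$1} = 1 while /\boid => '(\d+)'/g; }
    close $fh;
}

# Pass 2: rewrite OIDs in the source range to the next free target OID.
for my $file (@files)
{
    open my $fh, '<', $file or die "$file: $!";
    my @lines = <$fh>;
    close $fh;
    s/\boid => '(\d+)'/"oid => '" . remap($1) . "'"/ge for @lines;
    open my $out, '>', $file or die "$file: $!";
    print $out @lines;
    close $out;
}

sub remap
{
    my $oid = shift;
    return $oid if $oid < $src_lo || $oid > $src_hi;
    if (!exists $map{$oid})
    {
        $next++ while $in_use{$next};
        $map{$oid} = $next++;
    }
    return $map{$oid};
}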

            regards, tom lane


Re: Why don't we have a small reserved OID range for patch revisions?

From
Peter Geoghegan
Date:
On Fri, Feb 8, 2019 at 10:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> A script such as you suggest might be a good way to reduce the temptation
> to get lazy at the last minute.  Now that the catalog data is pretty
> machine-readable, I suspect it wouldn't be very hard --- though I'm
> not volunteering either.  I'm envisioning something simple like "renumber
> all OIDs in range mmmm-nnnn into range xxxx-yyyy", perhaps with the
> ability to skip any already-used OIDs in the target range.

I imagined that the machine-readable catalog data would allow us to
assign non-numeric identifiers to this OID range. Perhaps there'd be a
textual symbol with a number in the range of 0-20 at the end. Those
would stick out like a sore thumb, making it highly unlikely that
anybody would forget about it at the last minute.

-- 
Peter Geoghegan


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
Peter Geoghegan <pg@bowt.ie> writes:
> On Fri, Feb 8, 2019 at 10:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> A script such as you suggest might be a good way to reduce the temptation
>> to get lazy at the last minute.  Now that the catalog data is pretty
>> machine-readable, I suspect it wouldn't be very hard --- though I'm
>> not volunteering either.  I'm envisioning something simple like "renumber
>> all OIDs in range mmmm-nnnn into range xxxx-yyyy", perhaps with the
>> ability to skip any already-used OIDs in the target range.

> I imagined that the machine-readable catalog data would allow us to
> assign non-numeric identifiers to this OID range. Perhaps there'd be a
> textual symbol with a number in the range of 0-20 at the end. Those
> would stick out like a sore thumb, making it highly unlikely that
> anybody would forget about it at the last minute.

Um.  That would not be just an add-on script but something that
genbki.pl would have to accept.  I'm not excited about that; it would
complicate what's already complex, and if it works enough for test
purposes then it wouldn't really stop a committer who wasn't paying
attention from committing the patch un-revised.

To the extent that this works at all, OIDs in the 9000 range ought
to be enough of a flag already, I think.

            regards, tom lane


Re: Why don't we have a small reserved OID range for patch revisions?

From
Peter Geoghegan
Date:
On Fri, Feb 8, 2019 at 10:29 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Um.  That would not be just an add-on script but something that
> genbki.pl would have to accept.  I'm not excited about that; it would
> complicate what's already complex, and if it works enough for test
> purposes then it wouldn't really stop a committer who wasn't paying
> attention from committing the patch un-revised.
>
> To the extent that this works at all, OIDs in the 9000 range ought
> to be enough of a flag already, I think.

I tend to agree that this isn't enough of a problem to justify making
genbki.pl significantly more complicated.

-- 
Peter Geoghegan


Re: Why don't we have a small reserved OID range for patch revisions?

From
Robert Haas
Date:
On Fri, Feb 8, 2019 at 11:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> To the extent that this works at all, OIDs in the 9000 range ought
> to be enough of a flag already, I think.

A "flag" that isn't documented anywhere outside of a mailing list
discussion and that isn't checked by any code anywhere is not much of
a flag, IMHO.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Why don't we have a small reserved OID range for patch revisions?

From
John Naylor
Date:
On 2/8/19, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> A script such as you suggest might be a good way to reduce the temptation
> to get lazy at the last minute.  Now that the catalog data is pretty
> machine-readable, I suspect it wouldn't be very hard --- though I'm
> not volunteering either.  I'm envisioning something simple like "renumber
> all OIDs in range mmmm-nnnn into range xxxx-yyyy", perhaps with the
> ability to skip any already-used OIDs in the target range.

This might be something that can be done inside reformat_dat_file.pl.
It's a little outside its scope, but better than the alternatives.
And we already have a function in Catalog.pm to get the currently used
oids. I'll volunteer to look into it but I don't know when that will
be.

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Why don't we have a small reserved OID range for patch revisions?

From
John Naylor
Date:
I wrote:

> On 2/8/19, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> A script such as you suggest might be a good way to reduce the temptation
>> to get lazy at the last minute.  Now that the catalog data is pretty
>> machine-readable, I suspect it wouldn't be very hard --- though I'm
>> not volunteering either.  I'm envisioning something simple like "renumber
>> all OIDs in range mmmm-nnnn into range xxxx-yyyy", perhaps with the
>> ability to skip any already-used OIDs in the target range.
>
> This might be something that can be done inside reformat_dat_file.pl.
> It's a little outside its scope, but better than the alternatives.

Along those lines, here's a draft patch to do just that. It handles
array type oids as well. Run it like this:

perl  reformat_dat_file.pl  --map-from 9000  --map-to 2000  *.dat

There is some attempt at documentation. So far it doesn't map by
default, but that could be changed if we agreed on the convention of
9000 or whatever.
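
For illustration only, this is roughly what the mapping does to a made-up
pg_proc.dat entry (the entry itself is invented, and the exact target OID
depends on what happens to be free):

# before: development OID picked from the high range
{ oid => '9001', descr => 'example function',
  proname => 'my_new_func', prorettype => 'bool',
  proargtypes => 'int4', prosrc => 'my_new_func' },

# after the command above, with --map-from 9000 --map-to 2000
{ oid => '2000', descr => 'example function',
  proname => 'my_new_func', prorettype => 'bool',
  proargtypes => 'int4', prosrc => 'my_new_func' },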

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: Why don't we have a small reserved OID range for patch revisions?

From
John Naylor
Date:
I wrote:

> Along those lines, here's a draft patch to do just that. It handles
> array type oids as well. Run it like this:
>
> perl  reformat_dat_file.pl  --map-from 9000  --map-to 2000  *.dat
>
> There is some attempt at documentation. So far it doesn't map by
> default, but that could be changed if we agreed on the convention of
> 9000 or whatever.

In case we don't want to lose track of this, I added it to the March
commitfest with a target of v13. (I didn't see a way to add it to the
July commitfest)

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Why don't we have a small reserved OID range for patch revisions?

From
Peter Eisentraut
Date:
On 2019-02-08 19:14, Tom Lane wrote:
> Quite a few people have used OIDs up around 8000 or 9000 for this purpose;
> I doubt we need a formally reserved range for it.  The main problem with
> doing it is the hazard that the patch'll get committed just like that,
> suddenly breaking things for everyone else doing likewise.

For that reason, I'm not in favor of this.  Forgetting to update the
catversion is already common enough (for me).  Adding another step
between having a seemingly final patch and being able to actually commit
it doesn't seem attractive.  Moreover, these "final adjustments" would
tend to require a full rebuild and retest, adding even more overhead.

OID collision doesn't seem to be a significant problem (for me).

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 2019-02-08 19:14, Tom Lane wrote:
>> Quite a few people have used OIDs up around 8000 or 9000 for this purpose;
>> I doubt we need a formally reserved range for it.  The main problem with
>> doing it is the hazard that the patch'll get committed just like that,
>> suddenly breaking things for everyone else doing likewise.

> For that reason, I'm not in favor of this.  Forgetting to update the
> catversion is already common enough (for me).  Adding another step
> between having a seemingly final patch and being able to actually commit
> it doesn't seem attractive.  Moreover, these "final adjustments" would
> tend to require a full rebuild and retest, adding even more overhead.

> OID collision doesn't seem to be a significant problem (for me).

Um, I beg to differ.  It's not at all unusual for pending patches to
bit-rot for no reason other than suddenly getting an OID conflict.
I don't have to look far for a current example:
https://travis-ci.org/postgresql-cfbot/postgresql/builds/498955351

We do need a couple of pieces of new infrastructure to make this idea
conveniently workable.  One is a tool to allow automatic OID renumbering
instead of having to do it by hand; Naylor has a draft for that upthread.

Perhaps it'd be useful for genbki.pl to spit out a warning (NOT an
error) if it sees OIDs in the reserved range.  I'm not sure that that'd
really be worth the trouble though, since one could easily forget
about it while reviewing/testing just before commit, and it'd just be
useless noise up until it was time to commit.
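
For what it's worth, the check itself would be tiny.  A sketch, assuming
the parsed rows are available as hashrefs and 8000-9999 were the agreed
development range (the names here are made up, not genbki.pl's actual
variables):

use strict;
use warnings;

# Hypothetical: @catalog_rows stands in for whatever structure genbki.pl
# keeps the parsed entries in; a dummy entry is shown to make it runnable.
my @catalog_rows = ({ oid => 9123, proname => 'example' });
my ($dev_lo, $dev_hi) = (8000, 9999);

my @stragglers =
    grep { defined $_->{oid} && $_->{oid} >= $dev_lo && $_->{oid} <= $dev_hi }
    @catalog_rows;
warn sprintf("genbki.pl: %d OID(s) still in development range %d-%d: %s\n",
             scalar(@stragglers), $dev_lo, $dev_hi,
             join(', ', map { $_->{oid} } @stragglers))
    if @stragglers;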

Another issue, as Robert pointed out, is that this does need to be
a formal convention not something undocumented.  Naylor's patch adds
a mention of it in bki.sgml, but I wonder if anyplace else should
talk about it.

I concede your point that a prudent committer would do a rebuild and
retest rather than just trusting the tool.  But really, how much
extra work is that?  If you've spent any time optimizing your workflow,
a full rebuild and check-world should be under five minutes on any
hardware anyone would be using for development today.

And, yeah, we probably will make mistakes like this, just like we
sometimes forget the catversion bump.  As long as we have a tool
for OID renumbering, I don't think that's the end of the world.
Fixing it after the fact isn't going to be a big deal, any more
than it is for catversion.

            regards, tom lane


Re: Why don't we have a small reserved OID range for patch revisions?

From
Peter Geoghegan
Date:
On Wed, Feb 27, 2019 at 1:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > OID collision doesn't seem to be a significant problem (for me).
>
> Um, I beg to differ.  It's not at all unusual for pending patches to
> bit-rot for no reason other than suddenly getting an OID conflict.
> I don't have to look far for a current example:
> https://travis-ci.org/postgresql-cfbot/postgresql/builds/498955351

OID conflicts are not that big of a deal when building a patch
locally, because almost everyone knows what the exact problem is
immediately, and because you probably have more than a passing
interest in the patch to even do that much. However, the continuous
integration stuff has created an expectation that your patch shouldn't
be left to bitrot for long. Silly mechanical bitrot now seems like a
much bigger problem than it was before these developments. It unfairly
puts reviewers off engaging.

Patch authors shouldn't be left with any excuse for leaving their
patch to bitrot for long. And, more casual patch reviewers shouldn't
have any excuse for not downloading a patch and applying it locally,
so that they can spend a spare 10 minutes kicking the tires.

> Perhaps it'd be useful for genbki.pl to spit out a warning (NOT an
> error) if it sees OIDs in the reserved range.  I'm not sure that that'd
> really be worth the trouble though, since one could easily forget
> about it while reviewing/testing just before commit, and it'd just be
> useless noise up until it was time to commit.

My sense is that we should err on the side of being informative.

> Another issue, as Robert pointed out, is that this does need to be
> a formal convention not something undocumented.  Naylor's patch adds
> a mention of it in bki.sgml, but I wonder if anyplace else should
> talk about it.

Why not have unused_oids reference the convention as a "tip"?

> I concede your point that a prudent committer would do a rebuild and
> retest rather than just trusting the tool.  But really, how much
> extra work is that?  If you've spent any time optimizing your workflow,
> a full rebuild and check-world should be under five minutes on any
> hardware anyone would be using for development today.

If you use the "check-world parallel" recipe on the committing
checklist Wiki page, and if you use ccache, ~2 minutes is attainable
for optimized builds (though the recipe doesn't work on all release
branches). I don't think that a committer should be committing
anything if they're not willing to do this much. It's not just prudent
-- it is the *bare minimum* when committing a patch that creates
system catalog entries.

-- 
Peter Geoghegan


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
I wrote:
> We do need a couple of pieces of new infrastructure to make this idea
> conveniently workable.  One is a tool to allow automatic OID renumbering
> instead of having to do it by hand; Naylor has a draft for that upthread.

Oh: arguably, something else we'd need to do to ensure that OID
renumbering is trouble-free is to institute a strict rule that OID
references in the *.dat files must be symbolic.  We had not bothered
to convert every single reference type before, reasoning that some
of them were too little-used to be worth the trouble; but someday
that'll rise up to bite us, if semi-automated renumbering becomes
a thing.

It looks to me like the following OID columns remain unconverted:

pg_class.reltype
pg_database.dattablespace
pg_ts_config.cfgparser
pg_ts_config_map.mapcfg, mapdict
pg_ts_dict.dicttemplate
pg_type.typcollation
pg_type.typrelid

            regards, tom lane


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
Peter Geoghegan <pg@bowt.ie> writes:
> On Wed, Feb 27, 2019 at 1:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> OID collision doesn't seem to be a significant problem (for me).

>> Um, I beg to differ.  It's not at all unusual for pending patches to
>> bit-rot for no reason other than suddenly getting an OID conflict.
>> I don't have to look far for a current example:
>> https://travis-ci.org/postgresql-cfbot/postgresql/builds/498955351

> Patch authors shouldn't be left with any excuse for leaving their
> patch to bitrot for long. And, more casual patch reviewers shouldn't
> have any excuse for not downloading a patch and applying it locally,
> so that they can spend a spare 10 minutes kicking the tires.

Yeah, that latter point is really the killer argument.  We don't want
to make people spend valuable review time on fixing uninteresting OID
conflicts.  It's even more annoying that several people might have to
duplicate the same work, if they're testing a patch independently.

Given a convention that under-development patches use OIDs in the 9K
range, the only time anybody would have to resolve OID conflicts for
testing would be if they were trying to test the combination of two
or more patches.  Even then, an OID-renumbering script would make it
pretty painless: apply patch 1, renumber its OIDs to someplace else,
apply patch 2, repeat as needed.
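
In terms of the options in Naylor's draft, that workflow might look
roughly like this (the file names, and 7000 as the out-of-the-way target
range, are just for illustration):

git apply patch-1.patch
perl reformat_dat_file.pl --map-from 9000 --map-to 7000 *.dat
git apply patch-2.patch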

> Why not have unused_oids reference the convention as a "tip"?

Hmm, could be helpful.

            regards, tom lane


Re: Why don't we have a small reserved OID range for patch revisions?

From
Peter Eisentraut
Date:
On 2019-02-27 22:27, Tom Lane wrote:
>> OID collision doesn't seem to be a significant problem (for me).
> 
> Um, I beg to differ.  It's not at all unusual for pending patches to
> bit-rot for no reason other than suddenly getting an OID conflict.
> I don't have to look far for a current example:

I'm not saying it doesn't happen, but that it's not a significant
problem overall.

The chances of a patch (a) allocating a new OID, (b) a second patch
allocating a new OID, (c) both being in flight at the same time, (d)
actually picking the same OID, are small.  I guess the overall time lost
to this issue is perhaps 2 hours per year.  On the other hand, with
about 2000 commits to master per year, if this renumbering business only
adds 2 seconds of overhead to committing, we're coming out behind.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Why don't we have a small reserved OID range for patch revisions?

From
Peter Eisentraut
Date:
On 2019-02-27 22:50, Peter Geoghegan wrote:
> However, the continuous
> integration stuff has created an expectation that your patch shouldn't
> be left to bitrot for long. Silly mechanical bitrot now seems like a
> much bigger problem than it was before these developments. It unfairly
> puts reviewers off engaging.

If this is the problem (although I think we'd find that OID collisions
are rather rare compared to other gratuitous cfbot failures), why not
have the cfbot build with a flag that ignores OID collisions?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Why don't we have a small reserved OID range for patch revisions?

From
Peter Geoghegan
Date:
On Wed, Feb 27, 2019 at 2:39 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> The chances of a patch (a) allocating a new OID, (b) a second patch
> allocating a new OID, (c) both being in flight at the same time, (d)
> actually picking the same OID, are small.

But...they are. Most patches don't create new system catalog entries
at all. Of those that do, the conventions around assigning new OIDs
make it fairly likely that problems will emerge.

> I guess the overall time lost
> to this issue is perhaps 2 hours per year.  On the other hand, with
> about 2000 commits to master per year, if this renumbering business only
> adds 2 seconds of overhead to committing, we're coming out behind.

The time spent on the final commit is not the cost we're concerned
about, though. It isn't necessary to do that more than once, whereas
all but the most trivial of patches receive multiple rounds of review
and revision.

-- 
Peter Geoghegan


Re: Why don't we have a small reserved OID range for patch revisions?

From
Peter Geoghegan
Date:
On Wed, Feb 27, 2019 at 2:44 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> If this is the problem (although I think we'd find that OID collisions
> are rather rare compared to other gratuitous cfbot failures), why not
> have the cfbot build with a flag that ignores OID collisions?

How would that work?

-- 
Peter Geoghegan


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
Peter Geoghegan <pg@bowt.ie> writes:
> On Wed, Feb 27, 2019 at 2:44 PM Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> If this is the problem (although I think we'd find that OID collisions
>> are rather rare compared to other gratuitous cfbot failures), why not
>> have the cfbot build with a flag that ignores OID collisions?

> How would that work?

It could work for conflicting OIDs in different system catalogs (so that
the "conflict" is an artifact of our assignment rules rather than an
intrinsic problem).  But I think the majority of new hand-assigned OIDs
are in pg_proc, so that this kind of hack would not help as much as one
might wish.

            regards, tom lane


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 2019-02-27 22:27, Tom Lane wrote:
>>> OID collision doesn't seem to be a significant problem (for me).

>> Um, I beg to differ.  It's not at all unusual for pending patches to
>> bit-rot for no reason other than suddenly getting an OID conflict.
>> I don't have to look far for a current example:

> I'm not saying it doesn't happen, but that it's not a significant
> problem overall.

I do not think you are correct.  It may not be a big problem across
all our incoming patches, but that's only because most of them don't
have anything to do with hand-assigned OIDs.  Among those that do,
I think there is a significant problem.

To try to quantify this a bit, I looked through v12-cycle and pending
patches that touch the catalog data.

We've only committed 12 patches adding new hand-assigned OIDs since v11
was branched off.  (I suspect that's lower than in a typical cycle,
but have not attempted to quantify things further back.)  Of those,
only two seem to have needed OID adjustments after initial posting,
but that's mostly because most of them were committer-originated patches
that got pushed within a week or two.  That's certainly not the typical
wait time for a patch submitted by anybody else.  Also, a lot of these
patches recycled OIDs that'd recently been freed by patches such as the
abstime-ectomy, which means that the amount of OID conflict created for
pending patches is probably *really* low in this cycle-so-far, compared
to our historical norms.

Of what's in the queue to be reviewed right now, there are just
20 (out of 150-plus) patches that touch the catalog/*.dat files.
I got this number by groveling through the cfbot's reports of
patch applications, to see which patches touched those files.
It might omit some patches that the cfbot failed to make sense of.
Also, I'm pretty sure that a few of these patches don't actually
assign any new OIDs but just change existing entries, or create
only entries with auto-assigned OIDs.  I did not try to separate
those out, however, since the point here is to estimate for how
many patches a committer would even need to think about this.

Of those twenty patches, three have unresolved OID conflicts
right now:

multivariate MCV lists and histograms
commontype polymorphics
log10/hyper functions

Another one has recently had to resolve an OID conflict:

SortSupport implementation on inet/cdir    

which is notable considering that that thread is less than three weeks
old.  (The log10 and commontype threads aren't really ancient either.)

I spent some extra effort looking at the patches that both create more
than a few new OIDs and have been around for awhile:

Generic type subscripting
KNN for btree
Custom compression methods
SQL/JSON: functions
SQL/JSON: jsonpath
Generated columns
BRIN bloom indexes

The first four of those have all had to reassign OIDs during their
lifetime.  jsonpath has avoided doing so by choosing fairly high-numbered
OIDs (6K range) to begin with; which I trust you will agree is a solution
that doesn't scale for long.  I'm not entirely sure that the last two
haven't had to renumber OIDs; I ran out of energy before poking through
their history in detail.

In short, this situation may look fine from the perspective of a committer
with a relatively short timeline to commit, but it's pretty darn awful for
everybody else.  The only way to avoid a ~ 50% failure rate is to choose
OIDs above 6K, and once everybody starts doing it like that, things are
going to get very unpleasant very quickly.

            regards, tom lane


Re: Why don't we have a small reserved OID range for patch revisions?

From
Robert Haas
Date:
On Wed, Feb 27, 2019 at 10:41 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> In short, this situation may look fine from the perspective of a committer
> with a relatively short timeline to commit, but it's pretty darn awful for
> everybody else.  The only way to avoid a ~ 50% failure rate is to choose
> OIDs above 6K, and once everybody starts doing it like that, things are
> going to get very unpleasant very quickly.

The root problem here from the perspective of a non-committer is not
that they might have to renumber OIDs a few times over the year or two
it takes to get their patch merged, but rather that it takes a year or
two to get their patch merged.  That's not to say that I have no
sympathy with people in that situation or don't want to make their
lives easier, but I'm not really convinced that burdening committers
with additional manual steps is the right way to get patches merged
faster.  This seems like a big piece of new mechanism being invented
to solve an occasional annoyance. Your statistics are not convincing
at all: you're arguing that this is a big problem because 2-3% of
pending patches currently have an issue here, and some others have in
the past, but that's a really small percentage, and the time spent
doing OID renumbering must be a tiny percentage of the total time
anyone spends hacking on PostgreSQL.

I think that the problem here is that we have a very limited range of OIDs
(10k) which can be used for this purpose, and the number of OIDs that
are used in that space is now a significant fraction of the total
(>4.5k), and the problem is further complicated by the desire to keep
the OIDs assigned near the low end of the available numbering space
and/or near to other OIDs used for similar purposes.  The sheer fact
that the address space is nearly half-used means that conflicts are
likely even if people choose OIDs at random, and when people choose
OIDs non-randomly -- lowest, round numbers, near to other OIDs -- the
chances of conflicts just go up.

We could fix that problem by caring less about keeping all the numbers
gapless and increasing the size of the reserved space to say 100k, but
just as a thought, what if we stopped assigning manual OIDs for new
catalog entries altogether, except for once at the end of each release
cycle?  Make a way that people can add an entry to pg_proc.dat or
whatever without fixing an OID, and let the build scripts generate
one.  As many patches as happen during a release cycle will add new
such entries and they'll just all get some OID or other.  Then, at the
end of the release cycle, we'll run a script that finds all of those
catalog entries and rewrites the .dat files, adding a permanent OID
assignment to each one, so that those OIDs will then be fixed for all
future releases (unless we drop the entries or explicitly change
something).
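
A rough sketch of the build-time half of that, assuming the parsed
entries are available as hashrefs (everything here is illustrative,
not existing genbki.pl code):

use strict;
use warnings;

# Dummy input standing in for parsed .dat entries; the second entry has
# no hand-assigned OID and would get one during the build.
my @entries = ({ oid => 2147, proname => 'count' },
               { proname => 'my_new_func' });

my %used = map { $_->{oid} => 1 } grep { defined $_->{oid} } @entries;
my $candidate = 1;
foreach my $entry (@entries)
{
    next if defined $entry->{oid};
    $candidate++ while $used{$candidate};    # take the lowest free OID
    $entry->{oid} = $candidate;
    $used{$candidate} = 1;
}

The end-of-cycle script would then just write the assigned values back
into the .dat files as permanent OIDs.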

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> This seems like a big piece of new mechanism being invented
> to solve an occasional annoyance. Your statistics are not convincing
> at all: you're arguing that this is a big problem because 2-3% of
> pending patches currently have an issue here, and some others have in
> the past, but that's a really small percentage, and the time spent
> doing OID renumbering must be a tiny percentage of the total time
> anyone spends hacking on PostgreSQL.

TBH, I find this position utterly baffling.  It's true that only a
small percentage of patches have an issue here, because only a small
percentage of patches dabble in manually-assigned OIDs at all.  But
*among those that do*, there is a huge problem.  I had not actually
realized how bad it is until I gathered those stats, but it's bad.

I don't understand the objection to inventing a mechanism that will
help those patches and has no impact whatever when working on patches
that don't involve manually-assigned OIDs.

And, yeah, I'd like us not to have patches hanging around for years
either, but that's a reality that's not going away.

> We could fix that problem by caring less about keeping all the numbers
> gapless and increasing the size of the reserved space to say 100k,

We already had this discussion.  Moving FirstNormalObjectId is infeasible
without forcing a dump/reload, which I don't think anyone wants to do.

> but
> just as a thought, what if we stopped assigning manual OIDs for new
> catalog entries altogether, except for once at the end of each release
> cycle?

And that's another idea without any basis in reality.  What are you
going to do instead?  What mechanism will you use to track these
OIDs so you can clean up later?  Who's going to write the code that
will support this?  Not me.  I think the proposal that is on the
table is superior.

            regards, tom lane


Re: Why don't we have a small reserved OID range for patch revisions?

From
Robert Haas
Date:
On Thu, Feb 28, 2019 at 10:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > This seems like a big piece of new mechanism being invented
> > to solve an occasional annoyance. Your statistics are not convincing
> > at all: you're arguing that this is a big problem because 2-3% of
> > pending patches currently have an issue here, and some others have in
> > the past, but that's a really small percentage, and the time spent
> > doing OID renumbering must be a tiny percentage of the total time
> > anyone spends hacking on PostgreSQL.
>
> TBH, I find this position utterly baffling.  It's true that only a
> small percentage of patches have an issue here, because only a small
> percentage of patches dabble in manually-assigned OIDs at all.  But
> *among those that do*, there is a huge problem.  I had not actually
> realized how bad it is until I gathered those stats, but it's bad.
>
> I don't understand the objection to inventing a mechanism that will
> help those patches and has no impact whatever when working on patches
> that don't involve manually-assigned OIDs.
>
> And, yeah, I'd like us not to have patches hanging around for years
> either, but that's a reality that's not going away.

I don't think this is the worst proposal ever.  However, I also think
that it's not unreasonable to raise the issue that writing OR
reviewing OR committing a patch already involves adhering to a thicket
of undocumented rules.  When somebody fails to adhere to one of those
rules, they get ignored or publicly shamed.  Now you want to add yet
another step to the process - really two.  If you want to submit a
patch that requires new catalog entries, you must know that you're
supposed to put those OIDs in this new range that we're going to set
aside for such things, and if you want to commit one, you must know
that you're supposed to renumber those OID assignments into some other
range.  And people are going to screw it up - submitters are going to
fail to know about this new policy (which will probably be documented
nowhere, just like all the other ones) - and committers are going to
fail to remember to renumber things.  So, I suspect that for every
unit of work it saves somebody, it's probably going to generate about
one unit of extra work for somebody else.

A lot of projects have a much less painful process for getting patches
integrated than we do.  I don't know how those projects maintain
adequate code quality, but I do know that making it easy to get a
patch accepted makes people more likely to contribute patches, and
increases overall development velocity.  It is not even vaguely
unreasonable to worry about whether making this more complicated is
going to hurt more than it helps, and I don't know why you think
otherwise.

> > but
> > just as a thought, what if we stopped assigning manual OIDs for new
> > catalog entries altogether, except for once at the end of each release
> > cycle?
>
> And that's another idea without any basis in reality.  What are you
> going to do instead?  What mechanism will you use to track these
> OIDs so you can clean up later?

Right now every entry in pg_proc.dat includes an OID assignment.  What
I'm proposing is that we would also allow entries that did not have
one, and the build process would assign one while processing the .dat
files.  Then later, somebody could use a script that went through and
rewrote the .dat file to add OID assignments to any entries that
lacked them.  Since the topic of having tools for automated rewrite of
those files has been discussed at length, and since we already have a
script called reformat_dat_file.pl in the tree which contains
comments indicating that it could be modified for bulk editing, said
script having been committed BY YOU, I don't understand why you think
that bulk editing is infeasible.

> Who's going to write the code that
> will support this?  Not me.  I think the proposal that is on the
> table is superior.

OK.  Well, I think that doing nothing is superior to this proposal,
for reasons similar to what Peter Eisentraut has already articulated.
And I think rather than blasting forward with your own preferred
alternative in the face of disagreement, you should be willing to
discuss other possible options.  But if you're not willing to do that,
I can't make you.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Why don't we have a small reserved OID range for patch revisions?

From
Peter Geoghegan
Date:
On Thu, Feb 28, 2019 at 7:59 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I don't think this is the worst proposal ever.  However, I also think
> that it's not unreasonable to raise the issue that writing OR
> reviewing OR committing a patch already involves adhering to a thicket
> of undocumented rules.  When somebody fails to adhere to one of those
> rules, they get ignored or publicly shamed.  Now you want to add yet
> another step to the process - really two.

There does seem to be a real problem with undocumented processes. For
example, I must confess that it came as news to me that we already had
a reserved OID range. However, I don't think that there is that much
of an issue with adding new mechanisms like this, provided it makes it
easy to do the right thing and hard to do the wrong thing. What Tom
has proposed so far is not something that self-evidently meets that
standard, but it's also not something that self-evidently fails to
meet that standard.

I have attempted to institute some general guidelines for what the
thicket of rules are by creating the "committing checklist" page. This
is necessarily imperfect, because the rules are in many cases open to
interpretation, often for good practical reasons. I don't have any
sympathy for committers that find it hard to remember to do a
catversion bump with any kind of regularity. That complexity seems
inherent, not incidental, since it's often convenient to ignore
catalog incompatibilities during development.

> So, I suspect that for every
> unit of work it saves somebody, it's probably going to generate about
> one unit of extra work for somebody else.

Maybe so. I think that you're jumping to conclusions, though.

> A lot of projects have a much less painful process for getting patches
> integrated than we do.  I don't know how those projects maintain
> adequate code quality, but I do know that making it easy to get a
> patch accepted makes people more likely to contribute patches, and
> increases overall development velocity.  It is not even vaguely
> unreasonable to worry about whether making this more complicated is
> going to hurt more than it helps, and I don't know why you think
> otherwise.

But you seem to want to make the mechanism itself even more
complicated, not less complicated (based on your remarks about making
OID assignment happen during the build). In order to make the use of
the mechanism easier. That seems worth considering, but ISTM that this
is talking at cross purposes. There are far simpler ways of making it
unlikely that a committer is going to miss this step. There is also a
simple way of noticing that they do quickly (e.g. a simple buildfarm
test).

> Right now every entry in pg_proc.dat includes an OID assignment.  What
> I'm proposing is that we would also allow entries that did not have
> one, and the build process would assign one while processing the .dat
> files.  Then later, somebody could use a script that went through and
> rewrote the .dat file to add OID assignments to any entries that
> lacked them.  Since the topic of having tools for automated rewrite of
> those files has been discussed at length, and since we already have a
> script called reformat_dat_file.pl in the tree which contains
> comments indicating that it could be modified for bulk editing, said
> script having been committed BY YOU, I don't understand why you think
> that bulk editing is infeasible.

I'm also curious to hear what Tom thinks about this.

> OK.  Well, I think that doing nothing is superior to this proposal,
> for reasons similar to what Peter Eisentraut has already articulated.
> And I think rather than blasting forward with your own preferred
> alternative in the face of disagreement, you should be willing to
> discuss other possible options.  But if you're not willing to do that,
> I can't make you.

Peter seemed to not want to do this on the grounds that it isn't
necessary at all, whereas you think that it doesn't go far enough. If
there is a consensus against what Tom has said, it's a cacophonous one
that cannot really be said to be in favor of anything.

-- 
Peter Geoghegan


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
Peter Geoghegan <pg@bowt.ie> writes:
> On Thu, Feb 28, 2019 at 7:59 AM Robert Haas <robertmhaas@gmail.com> wrote:
>> OK.  Well, I think that doing nothing is superior to this proposal,
>> for reasons similar to what Peter Eisentraut has already articulated.
>> And I think rather than blasting forward with your own preferred
>> alternative in the face of disagreement, you should be willing to
>> discuss other possible options.  But if you're not willing to do that,
>> I can't make you.

> Peter seemed to not want to do this on the grounds that it isn't
> necessary at all, whereas you think that it doesn't go far enough. If
> there is a consensus against what Tom has said, it's a cacophonous one
> that cannot really be said to be in favor of anything.

The only thing that's really clear is that some senior committers don't
want to be bothered because they don't think there's a problem here that
justifies any additional expenditure of their time.  Perhaps they are
right, because I'd expected some comments from non-committer developers
confirming that they see a problem, and the silence is deafening.

I'm inclined to commit some form of Naylor's tool improvement anyway,
because I have use for it; I remember times when I've renumbered OIDs
manually in patches, and it wasn't much fun.  But I can't force a
process change if there's not consensus for it among the committers.

            regards, tom lane


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
>>> just as a thought, what if we stopped assigning manual OIDs for new
>>> catalog entries altogether, except for once at the end of each release
>>> cycle?

Actually ... that leads to an idea that wouldn't add any per-commit
overhead, or really much change at all to existing processes.  Given
the existence of a reliable OID-renumbering tool, we could:

1. Encourage people to develop new patches using chosen-at-random
high OIDs, in the 7K-9K range.  They do this already, it'd just
be encouraged instead of discouraged.

2. Commit patches as received.

3. Once each devel cycle, after feature freeze, somebody uses the
renumbering tool to shove all the new OIDs down to lower numbers,
freeing the high-OID range for the next devel cycle.  We'd have
to remember to do that, but it could be added to the RELEASE_CHANGES
checklist.

In this scheme, OID collisions are a problem for in-progress patches
only if two patches are unlucky enough to choose the same random
high OIDs during the same devel cycle.  That's unlikely, or at least
a good bit less likely than collisions are today.  If/when it does
happen we'd have a couple of alternatives for ameliorating the problem
--- either the not-yet-committed patch could use the renumbering tool
on their own OIDs, or we could do an off-schedule run of step 3 to get
the already-committed OIDs out of their way.

            regards, tom lane


Re: Why don't we have a small reserved OID range for patch revisions?

From
Peter Geoghegan
Date:
On Thu, Feb 28, 2019 at 3:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The only thing that's really clear is that some senior committers don't
> want to be bothered because they don't think there's a problem here that
> justifies any additional expenditure of their time.  Perhaps they are
> right, because I'd expected some comments from non-committer developers
> confirming that they see a problem, and the silence is deafening.

I don't think that you can take that as too strong a signal. The
incentives are different for non-committers.

> I'm inclined to commit some form of Naylor's tool improvement anyway,
> because I have use for it; I remember times when I've renumbered OIDs
> manually in patches, and it wasn't much fun.  But I can't force a
> process change if there's not consensus for it among the committers.

I think that that's a reasonable thing to do, provided there is
obvious feedback that makes it highly unlikely that the committer will
make an error at the last moment. I have a hard time coming up with a
suggestion that won't be considered annoying by at least one person,
though.

Would it be awful if there was a #warning directive that kicked in
when the temporary OID range is in use? It should be possible to do
that without breaking -Werror builds, which I believe Robert uses (I
am reminded of the Flex bug that we used to have to work around). It's
not like there are that many patches that need to assign OIDs to new
catalog entries. I would suggest that we put the warning in the
regression tests if I didn't know that that could be missed by the use
of parallel variants, where the output flies by. There is no precedent
for using #warning for something like that, but offhand it seems like
the only thing that would work consistently.

I don't really mind having to do slightly more work when the issue
crops up, especially if that means less work for everyone involved in
aggregate, which is the cost that I'm concerned about the most.
However, an undocumented or under-documented process that requires a
fixed amount of extra mental effort when committing *anything* is
another matter.

-- 
Peter Geoghegan


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tomas Vondra
Date:

On 3/1/19 12:41 AM, Peter Geoghegan wrote:
> On Thu, Feb 28, 2019 at 3:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The only thing that's really clear is that some senior committers don't
>> want to be bothered because they don't think there's a problem here that
>> justifies any additional expenditure of their time.  Perhaps they are
>> right, because I'd expected some comments from non-committer developers
>> confirming that they see a problem, and the silence is deafening.
> 
> I don't think that you can take that as too strong a signal. The
> incentives are different for non-committers.
> 
>> I'm inclined to commit some form of Naylor's tool improvement anyway,
>> because I have use for it; I remember times when I've renumbered OIDs
>> manually in patches, and it wasn't much fun.  But I can't force a
>> process change if there's not consensus for it among the committers.
> 
> I think that that's a reasonable thing to do, provided there is
> obvious feedback that makes it highly unlikely that the committer will
> make an error at the last moment. I have a hard time coming up with a
> suggestion that won't be considered annoying by at least one person,
> though.
> 

FWIW I personally would not mind if such a tool / process were added. But I
have a related question - do we have some sort of list of such processes
that I could check? That is, a list of stuff that is expected to be done
by a committer before a commit?

I do recall we have [1], but perhaps we have something else.

[1] https://wiki.postgresql.org/wiki/Committing_checklist


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Why don't we have a small reserved OID range for patch revisions?

From
Peter Geoghegan
Date:
On Thu, Feb 28, 2019 at 3:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> >>> just as a thought, what if we stopped assigning manual OIDs for new
> >>> catalog entries altogether, except for once at the end of each release
> >>> cycle?
>
> Actually ... that leads to an idea that wouldn't add any per-commit
> overhead, or really much change at all to existing processes.  Given
> the existence of a reliable OID-renumbering tool, we could:

> In this scheme, OID collisions are a problem for in-progress patches
> only if two patches are unlucky enough to choose the same random
> high OIDs during the same devel cycle.  That's unlikely, or at least
> a good bit less likely than collisions are today.

That sounds like a reasonable compromise. Perhaps the unused_oids
script could give specific guidance on using a randomly determined
small range of contiguous OIDs that fall within the current range for
that devel cycle. That would prevent collisions caused by the natural
human tendency to prefer a round number. Having contiguous OIDs for
the same patch seems worth preserving.
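
Something along these lines could do it, assuming unused_oids already
has the list of free OIDs in hand (the variable names, range, and block
size are made up):

use strict;
use warnings;

# Placeholder for the list the script actually computes.
my @unused_oids = (8000 .. 9999);

# Hypothetical: pick a random run of 16 contiguous free OIDs in the
# 8000-9999 development range and suggest it to the patch author.
my ($lo, $hi, $want) = (8000, 9999, 16);
my %free = map { $_ => 1 } grep { $_ >= $lo && $_ <= $hi } @unused_oids;
my @starts = grep {
    my $s = $_;
    !grep { !$free{ $s + $_ } } 0 .. $want - 1;
} $lo .. $hi - $want + 1;
if (@starts)
{
    my $start = $starts[ int(rand(@starts)) ];
    printf "Suggestion: use OIDs %d-%d for your patch\n",
        $start, $start + $want - 1;
}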

-- 
Peter Geoghegan


Re: Why don't we have a small reserved OID range for patch revisions?

From
Robert Haas
Date:
On Thu, Feb 28, 2019 at 6:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 1. Encourage people to develop new patches using chosen-at-random
> high OIDs, in the 7K-9K range.  They do this already, it'd just
> be encouraged instead of discouraged.
>
> 2. Commit patches as received.
>
> 3. Once each devel cycle, after feature freeze, somebody uses the
> renumbering tool to shove all the new OIDs down to lower numbers,
> freeing the high-OID range for the next devel cycle.  We'd have
> to remember to do that, but it could be added to the RELEASE_CHANGES
> checklist.

Sure, that sounds nice.  It seems like it might be slightly less
convenient for non-committers than what I was proposing, but still
more convenient than what they're doing right now.  And it's also more
convenient for committers, because they're not being asked to manually
fiddle patches at the last moment, something that I at least find
rather error-prone.  It also, and I think this is really good, moves
in the direction of fewer things for both patch authors and patch
committers to worry about doing wrong.  Instead of throwing rocks at
people whose OID assignments are "wrong," we just accept what people
do and adjust it later if it makes sense to do so.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Why don't we have a small reserved OID range for patch revisions?

From
Robert Haas
Date:
On Thu, Feb 28, 2019 at 5:36 PM Peter Geoghegan <pg@bowt.ie> wrote:
> I have attempted to institute some general guidelines for what the
> thicket of rules are by creating the "committing checklist" page. This
> is necessarily imperfect, because the rules are in many cases open to
> interpretation, often for good practical reasons. I don't have any
> sympathy for committers that find it hard to remember to do a
> catversion bump with any kind of regularity. That complexity seems
> inherent, not incidental, since it's often convenient to ignore
> catalog incompatibilities during development.

Bumping catversion is something of a special case, because one does it
so often that one gets used to remembering it.  The rules are usually
not too hard to remember, although they are trickier when you don't
directly change anything in src/include/catalog but just change the
definition of some node that may be serialized in a catalog someplace.
It would be neat if there were a tool you could run to somehow tell
you whether catversion needs to be changed for a given patch.

But you know there are a lot of other version numbers floating around,
like XLOG_PAGE_MAGIC or the pg_dump archive version, and it is not
really that easy to know -- as a new contributor or sometimes even as
an experienced one -- whether your work requires any changes to that
stuff, or even that that stuff *exists*.  Indeed, XLOG_PAGE_MAGIC is a
particularly annoying case, both because the constant name doesn't
contain VERSION and because the comment just says /* can be used as
WAL version indicator */ which does not exactly make it clear that if
you fail to bump it when you touch the WAL format you will Make People
Unhappy.

Indeed, Simon got complaints a number of years ago (2010, it looks
like) when he had the temerity to change the magic number to some
other unrelated value instead of just incrementing it by one.
Although I think that the criticism was to a certain extent
well-founded -- why deviate from previous practice? -- there is at the
same time something a little crazy about somebody getting excited
about the particular value that has been chosen for a number that is
described in the very name of the constant as a MAGIC number.  And
especially because there is absolutely zip in the way of code comments
or a README that explain to you how to do it "right."

> > So, I suspect that for every
> > unit of work it saves somebody, it's probably going to generate about
> > one unit of extra work for somebody else.
>
> Maybe so. I think that you're jumping to conclusions, though.

I did say "I suspect..." which was intended as a concession that I
don't know for sure.

> But you seem to want to make the mechanism itself even more
> complicated, not less complicated (based on your remarks about making
> OID assignment happen during the build). In order to make the use of
> the mechanism easier. That seems worth considering, but ISTM that this
> is talking at cross purposes. There are far simpler ways of making it
> unlikely that a committer is going to miss this step. There is also a
> simple way of noticing that they do quickly (e.g. a simple buildfarm
> test).

Well, perhaps I'm proposing some additional code, but I don't think of
that as making the mechanism more complicated.  I want to make it
simpler for patch submitters and reviewers and committers to not make
mistakes that they have to run around and fix.  If there are fewer
kinds of things that qualify as mistakes, as in Tom's latest proposal,
then we are moving in the right direction IMO regardless of anything
else.

> > OK.  Well, I think that doing nothing is superior to this proposal,
> > for reasons similar to what Peter Eisentraut has already articulated.
> > And I think rather than blasting forward with your own preferred
> > alternative in the face of disagreement, you should be willing to
> > discuss other possible options.  But if you're not willing to do that,
> > I can't make you.
>
> Peter seemed to not want to do this on the grounds that it isn't
> necessary at all, whereas you think that it doesn't go far enough. If
> there is a consensus against what Tom has said, it's a cacophonous one
> that cannot really be said to be in favor of anything.

I think Peter and I are more agreeing than we are at the opposite ends
of a spectrum, but more importantly, I think it is worth having a
discussion first about what people like and dislike, and what goals
they have, and then only if necessary, counting the votes afterwards.
I don't like having the feeling that because I have a different view
of something and want to write an email about that, I am somehow an
impediment to progress.  I think if we reduce discussions to
you're-for-it-or-you're-against-it, that's not that helpful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Why don't we have a small reserved OID range for patch revisions?

From
Peter Geoghegan
Date:
On Fri, Mar 1, 2019 at 11:56 AM Robert Haas <robertmhaas@gmail.com> wrote:
> It would be neat if there were a tool you could run to somehow tell
> you whether catversion needs to be changed for a given patch.

That seems infeasible because of stored rules. A lot of things bleed
into that. We could certainly do better at documenting this on the
"committing checklist" page, though.

> Indeed, Simon got complaints a number of years ago (2010, it looks
> like) when he had the temerity to change the magic number to some
> other unrelated value instead of just incrementing it by one.

Off with his head!

> Although I think that the criticism was to a certain extent
> well-founded -- why deviate from previous practice? -- there is at the
> same time something a little crazy about somebody getting excited
> about the particular value that has been chosen for a number that is
> described in the very name of the constant as a MAGIC number.  And
> especially because there is absolutely zip in the way of code comments
> or a README that explain to you how to do it "right."

I have learned to avoid ambiguity more than anything else, because
ambiguity causes patches to flounder indefinitely, whereas it's
usually not that hard to fix something that's broken. I agree -
anything that adds ambiguity rather than taking it away is a big
problem.

> Well, perhaps I'm proposing some additional code, but I don't think of
> that as making the mechanism more complicated.  I want to make it
> simpler for patch submitters and reviewers and committers to not make
> mistakes that they have to run around and fix.

Right. So do I. I just don't think that it's that bad to ask the final
committer to do something once, rather than getting everyone else
(including committers) to do it multiple times. If we can avoid even
this burden, and totally centralize the management of the OID space,
then so much the better.

> If there are fewer
> kinds of things that qualify as mistakes, as in Tom's latest proposal,
> then we are moving in the right direction IMO regardless of anything
> else.

I'm glad that we now have a plan that is a clear step forward.

> I think Peter and I are more agreeing than we are at the opposite ends
> of a spectrum, but more importantly, I think it is worth having a
> discussion first about what people like and dislike, and what goals
> they have, and then only if necessary, counting the votes afterwards.

I agree that that's totally worthwhile.

> I don't like having the feeling that because I have a different view
> of something and want to write an email about that, I am somehow an
> impediment to progress.  I think if we reduce discussions to
> you're-for-it-or-you're-against-it, that's not that helpful.

That was not my intention. The way that you brought the issue of the
difficulty of being a contributor in general into it was unhelpful,
though. It didn't seem useful or fair to link Tom's position to a big,
well known controversy.

We now have a solution that everyone is happy with, or can at least
live with, which suggests to me that Tom wasn't being intransigent or
insensitive to the concerns of contributors.

-- 
Peter Geoghegan


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
Peter Geoghegan <pg@bowt.ie> writes:
> On Fri, Mar 1, 2019 at 11:56 AM Robert Haas <robertmhaas@gmail.com> wrote:
>> It would be neat if there were a tool you could run to somehow tell
>> you whether catversion needs to be changed for a given patch.

> That seems infeasible because of stored rules. A lot of things bleed
> into that. We could certainly do better at documenting this on the
> "committing checklist" page, though.

A first approximation to that is "did you touch readfuncs.c", though
that rule will give a false positive if you only changed Plan nodes.
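
Just to illustrate -- this is only a sketch, nothing that exists in the
tree; the script name and the idea of scanning a unified diff are
assumptions, and readfuncs.c is of course only one of several things
that can force a catversion bump:

use strict;
use warnings;

# Crude hint only: does the patch being read from stdin touch readfuncs.c?
my $hit = 0;
while (my $line = <STDIN>)
{
    $hit = 1 if $line =~ m{^(?:---|\+\+\+) .*\breadfuncs\.c\b};
}
print $hit
  ? "patch touches readfuncs.c -- check whether catversion needs a bump\n"
  : "patch leaves readfuncs.c alone -- a bump may still be needed\n";

Something like "git diff master... | perl catversion_hint.pl" (hypothetical
file name) would then give a rough first answer.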

            regards, tom lane


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
John Naylor <john.naylor@2ndquadrant.com> writes:
>> On 2/8/19, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> A script such as you suggest might be a good way to reduce the temptation
>>> to get lazy at the last minute.  Now that the catalog data is pretty
>>> machine-readable, I suspect it wouldn't be very hard --- though I'm
>>> not volunteering either.  I'm envisioning something simple like "renumber
>>> all OIDs in range mmmm-nnnn into range xxxx-yyyy", perhaps with the
>>> ability to skip any already-used OIDs in the target range.

>> This might be something that can be done inside reformat_dat_file.pl.
>> It's a little outside its scope, but better than the alternatives.

> Along those lines, here's a draft patch to do just that. It handles
> array type oids as well. Run it like this:

> perl  reformat_dat_file.pl  --map-from 9000  --map-to 2000  *.dat

I took a quick look at this.  I went ahead and pushed the parts that
were just code cleanup in reformat_dat_file.pl, since that seemed
pretty uncontroversial.  As far as the rest of it goes:

* I'm really not terribly happy with sticking this functionality into
reformat_dat_file.pl.  First, there's an issue of discoverability:
it's not obvious that a script named that way would have such an
ability.  Second, it clutters the script in a way that seems to me
to hinder its usefulness as a basis for one-off hacks.  So I'd really
rather have a separate script named something like "renumber_oids.pl",
even if there's a good deal of code duplication between it and
reformat_dat_file.pl.

* In my vision of what this might be good for, I think it's important
that it be possible to specify a range of input OIDs to renumber, not
just "everything above N".  I agree the output range only needs a
starting OID.

BTW, I changed the CF entry's target back to v12; I don't see a
reason not to get this done this month, and indeed kind of wish
it was available right now ;-)

            regards, tom lane


Re: Why don't we have a small reserved OID range for patch revisions?

From
John Naylor
Date:
On Sat, Mar 9, 2019 at 1:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I took a quick look at this.  I went ahead and pushed the parts that
> were just code cleanup in reformat_dat_file.pl, since that seemed
> pretty uncontroversial.  As far as the rest of it goes:

Okay, thanks.

> * I'm really not terribly happy with sticking this functionality into
> reformat_dat_file.pl.  First, there's an issue of discoverability:
> it's not obvious that a script named that way would have such an
> ability.  Second, it clutters the script in a way that seems to me
> to hinder its usefulness as a basis for one-off hacks.  So I'd really
> rather have a separate script named something like "renumber_oids.pl",
> even if there's a good deal of code duplication between it and
> reformat_dat_file.pl.

> * In my vision of what this might be good for, I think it's important
> that it be possible to specify a range of input OIDs to renumber, not
> just "everything above N".  I agree the output range only needs a
> starting OID.

Now it looks like:

perl  renumber_oids.pl  --first-mapped-oid 8000 --last-mapped-oid 8999
 --first-target-oid 2000  *.dat

To prevent a maintenance headache, I didn't copy any of the formatting
logic over. You'll also have to run reformat_dat_file.pl afterwards
to restore that. It seems to work, but I haven't tested thoroughly.

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
John Naylor <john.naylor@2ndquadrant.com> writes:
> Now it looks like:
> perl  renumber_oids.pl  --first-mapped-oid 8000 --last-mapped-oid 8999
>  --first-target-oid 2000  *.dat
> To prevent a maintenance headache, I didn't copy any of the formatting
> logic over. You'll also have to run reformat_dat_file.pl afterwards
> to restore that. It seems to work, but I haven't tested thoroughly.

I didn't like the use of Data::Dumper, because it made it quite impossible
to check what the script had done by eyeball.  After some thought
I concluded that we could probably just apply the changes via
search-and-replace, which is pretty ugly and low-tech but it leads to
easily diffable results, whether or not the initial state is exactly
what reformat_dat_file.pl would produce.

I also changed things so that the OID mapping is computed before we start
changing any files, because as it stood the objects would get renumbered
in a pretty random order; and I renamed one of the switches so they all
have unique one-letter abbreviations.
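
(For illustration only, with made-up OID values: after that rename,
Getopt::Long's abbreviation matching should accept something as short as

perl  renumber_oids.pl  -f 8000  -l 8999  -t 4500

since each switch now starts with a distinct letter.)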

Experimenting with this, I realized that it couldn't renumber OIDs that
are defined in .h files rather than .dat files, which is a serious
deficiency, but given the search-and-replace implementation it's not too
hard to fix up the .h files as well.  So I did that, and removed the
expectation that the target files would be listed on the command line;
that seems more likely to be a foot-gun than to do anything useful.

I've successfully done check-world after renumbering every OID above
4000 to somewhere else.  I also tried renumbering everything below
4000, which unsurprisingly blew up because there are various catalog
columns we haven't fixed to use symbolic OIDs.  (The one that initdb
immediately trips over is pg_database.dattablespace.)  I'm not sure
if it's worth the trouble to make that totally clean, but I suspect
we ought to at least mop up text-search references to be symbolic.
That's material for a separate patch though.

This seems committable from my end --- any further comments?

            regards, tom lane

diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml
index 3c5830b..b6038b9 100644
--- a/doc/src/sgml/bki.sgml
+++ b/doc/src/sgml/bki.sgml
@@ -388,8 +388,25 @@
     to see which ones do not appear.  You can also use
     the <filename>duplicate_oids</filename> script to check for mistakes.
     (<filename>genbki.pl</filename> will assign OIDs for any rows that
-    didn't get one hand-assigned to them and also detect duplicate OIDs
-    at compile time.)
+    didn't get one hand-assigned to them, and it will also detect duplicate
+    OIDs at compile time.)
+   </para>
+
+   <para>
+    When choosing OIDs for a patch that is not expected to be committed
+    immediately, best practice is to use a group of more-or-less
+    consecutive OIDs starting with some random choice in the range
+    8000—9999.  This minimizes the risk of OID collisions with other
+    patches being developed concurrently.  To keep the 8000—9999
+    range free for development purposes, after a patch has been committed
+    to the master git repository its OIDs should be renumbered into
+    available space below that range.  Typically, this will be done
+    near the end of each development cycle, moving all OIDs consumed by
+    patches committed in that cycle at the same time.  The script
+    <filename>renumber_oids.pl</filename> can be used for this purpose.
+    If an uncommitted patch is found to have OID conflicts with some
+    recently-committed patch, <filename>renumber_oids.pl</filename> may
+    also be useful for recovering from that situation.
    </para>

    <para>
diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index 3bf308f..368b1de 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -87,6 +87,8 @@ sub ParseHeader
         }

         # Push the data into the appropriate data structure.
+        # Caution: when adding new recognized OID-defining macros,
+        # also update src/include/catalog/renumber_oids.pl.
         if (/^DECLARE_TOAST\(\s*(\w+),\s*(\d+),\s*(\d+)\)/)
         {
             push @{ $catalog{toasting} },
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index 833fad1..f253613 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -4,6 +4,9 @@
  *      This file provides some definitions to support indexing
  *      on system catalogs
  *
+ * Caution: all #define's with numeric values in this file had better be
+ * object OIDs, else renumber_oids.pl might change them inappropriately.
+ *
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
diff --git a/src/include/catalog/renumber_oids.pl b/src/include/catalog/renumber_oids.pl
new file mode 100755
index 0000000..8a07340
--- /dev/null
+++ b/src/include/catalog/renumber_oids.pl
@@ -0,0 +1,291 @@
+#!/usr/bin/perl
+#----------------------------------------------------------------------
+#
+# renumber_oids.pl
+#    Perl script that shifts a range of OIDs in the Postgres catalog data
+#    to a different range, skipping any OIDs that are already in use.
+#
+#    Note: This does not reformat the .dat files, so you may want
+#    to run reformat_dat_file.pl afterwards.
+#
+# Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/include/catalog/renumber_oids.pl
+#
+#----------------------------------------------------------------------
+
+use strict;
+use warnings;
+
+use FindBin;
+use Getopt::Long;
+
+# Must run in src/include/catalog
+chdir $FindBin::RealBin or die "could not cd to $FindBin::RealBin: $!\n";
+
+use lib "$FindBin::RealBin/../../backend/catalog/";
+use Catalog;
+
+# We'll need this number.
+my $FirstGenbkiObjectId =
+  Catalog::FindDefinedSymbol('access/transam.h', '..', 'FirstGenbkiObjectId');
+
+# Process command line switches.
+my $output_path      = '';
+my $first_mapped_oid = 0;
+my $last_mapped_oid  = $FirstGenbkiObjectId - 1;
+my $target_oid       = 0;
+
+GetOptions(
+    'output=s'           => \$output_path,
+    'first-mapped-oid=i' => \$first_mapped_oid,
+    'last-mapped-oid=i'  => \$last_mapped_oid,
+    'target-oid=i'       => \$target_oid) || usage();
+
+# Sanity check arguments.
+die "Unexpected non-switch arguments.\n" if @ARGV;
+die "--first-mapped-oid must be specified.\n"
+  if $first_mapped_oid <= 0;
+die "Empty mapped OID range.\n"
+  if $last_mapped_oid < $first_mapped_oid;
+die "--target-oid must be specified.\n"
+  if $target_oid <= 0;
+die "--target-oid must not be within mapped OID range.\n"
+  if $target_oid >= $first_mapped_oid && $target_oid <= $last_mapped_oid;
+
+# Make sure output_path ends in a slash.
+if ($output_path ne '' && substr($output_path, -1) ne '/')
+{
+    $output_path .= '/';
+}
+
+# Collect all the existing assigned OIDs (including those to be remapped).
+my @header_files = (glob("pg_*.h"), qw(indexing.h toasting.h));
+my $oids = Catalog::FindAllOidsFromHeaders(@header_files);
+
+# Hash-ify the existing OIDs for convenient lookup.
+my %oidhash;
+@oidhash{@$oids} = undef;
+
+# Select new OIDs for existing OIDs in the mapped range.
+# We do this first so that we preserve the ordering of the mapped OIDs
+# (for reproducibility's sake), and so that if we fail due to running out
+# of OID room, that happens before we've overwritten any files.
+my %maphash;
+my $next_oid = $target_oid;
+
+for (
+    my $mapped_oid = $first_mapped_oid;
+    $mapped_oid <= $last_mapped_oid;
+    $mapped_oid++)
+{
+    next if !exists $oidhash{$mapped_oid};
+    $next_oid++
+      while (
+        exists $oidhash{$next_oid}
+        || (   $next_oid >= $first_mapped_oid
+            && $next_oid <= $last_mapped_oid));
+    die "Reached FirstGenbkiObjectId before assigning all OIDs.\n"
+      if $next_oid >= $FirstGenbkiObjectId;
+    $maphash{$mapped_oid} = $next_oid;
+    $next_oid++;
+}
+
+die "There are no OIDs in the mapped range.\n" if $next_oid == $target_oid;
+
+# Read each .h file and write out modified data.
+foreach my $input_file (@header_files)
+{
+    $input_file =~ /(\w+)\.h$/
+      or die "Input file $input_file needs to be a .h file.\n";
+    my $catname = $1;
+
+    # Ignore generated *_d.h files.
+    next if $catname =~ /_d$/;
+
+    open(my $ifd, '<', $input_file) || die "$input_file: $!";
+
+    # Write output files to specified directory.
+    # Use a .tmp suffix, then rename into place, in case we're overwriting.
+    my $output_file     = "$output_path$catname.h";
+    my $tmp_output_file = "$output_file.tmp";
+    open my $ofd, '>', $tmp_output_file
+      or die "can't open $tmp_output_file: $!";
+    my $changed = 0;
+
+    # Scan the input file.
+    while (<$ifd>)
+    {
+        my $line = $_;
+
+        # Check for OID-defining macros that Catalog::ParseHeader knows about,
+        # and update OIDs as needed.
+        if ($line =~ m/^(DECLARE_TOAST\(\s*\w+,\s*)(\d+)(,\s*)(\d+)\)/)
+        {
+            my $oid2 = $2;
+            my $oid4 = $4;
+            if (exists $maphash{$oid2})
+            {
+                $oid2 = $maphash{$oid2};
+                my $repl = $1 . $oid2 . $3 . $oid4 . ")";
+                $line =~ s/^DECLARE_TOAST\(\s*\w+,\s*\d+,\s*\d+\)/$repl/;
+                $changed = 1;
+            }
+            if (exists $maphash{$oid4})
+            {
+                $oid4 = $maphash{$oid4};
+                my $repl = $1 . $oid2 . $3 . $oid4 . ")";
+                $line =~ s/^DECLARE_TOAST\(\s*\w+,\s*\d+,\s*\d+\)/$repl/;
+                $changed = 1;
+            }
+        }
+        elsif (
+            $line =~ m/^(DECLARE_(UNIQUE_)?INDEX\(\s*\w+,\s*)(\d+)(,\s*.+)\)/)
+        {
+            if (exists $maphash{$3})
+            {
+                my $repl = $1 . $maphash{$3} . $4 . ")";
+                $line =~
+                  s/^DECLARE_(UNIQUE_)?INDEX\(\s*\w+,\s*\d+,\s*.+\)/$repl/;
+                $changed = 1;
+            }
+        }
+        elsif ($line =~ m/^CATALOG\((\w+),(\d+),(\w+)\)/)
+        {
+            if (exists $maphash{$2})
+            {
+                my $repl =
+                  "CATALOG(" . $1 . "," . $maphash{$2} . "," . $3 . ")";
+                $line =~ s/^CATALOG\(\w+,\d+,\w+\)/$repl/;
+                $changed = 1;
+            }
+
+            if ($line =~ m/BKI_ROWTYPE_OID\((\d+),(\w+)\)/)
+            {
+                if (exists $maphash{$1})
+                {
+                    my $repl =
+                      "BKI_ROWTYPE_OID(" . $maphash{$1} . "," . $2 . ")";
+                    $line =~ s/BKI_ROWTYPE_OID\(\d+,\w+\)/$repl/;
+                    $changed = 1;
+                }
+            }
+        }
+
+        # In indexing.h and toasting.h only, check for #define SYM nnnn,
+        # and replace if within mapped range.
+        elsif ($line =~ m/^(\s*#\s*define\s+\w+\s+)(\d+)\b/)
+        {
+            if (($catname eq 'indexing' || $catname eq 'toasting')
+                && exists $maphash{$2})
+            {
+                my $repl = $1 . $maphash{$2};
+                $line =~ s/^\s*#\s*define\s+\w+\s+\d+\b/$repl/;
+                $changed = 1;
+            }
+        }
+
+        print $ofd $line;
+    }
+
+    close $ifd;
+    close $ofd;
+
+    # Avoid updating files if we didn't change them.
+    if ($changed || $output_path ne '')
+    {
+        rename $tmp_output_file, $output_file
+          or die "can't rename $tmp_output_file to $output_file: $!";
+    }
+    else
+    {
+        unlink $tmp_output_file
+          or die "can't unlink $tmp_output_file: $!";
+    }
+}
+
+# Likewise, read each .dat file and write out modified data.
+foreach my $input_file (glob("pg_*.dat"))
+{
+    $input_file =~ /(\w+)\.dat$/
+      or die "Input file $input_file needs to be a .dat file.\n";
+    my $catname = $1;
+
+    open(my $ifd, '<', $input_file) || die "$input_file: $!";
+
+    # Write output files to specified directory.
+    # Use a .tmp suffix, then rename into place, in case we're overwriting.
+    my $output_file     = "$output_path$catname.dat";
+    my $tmp_output_file = "$output_file.tmp";
+    open my $ofd, '>', $tmp_output_file
+      or die "can't open $tmp_output_file: $!";
+    my $changed = 0;
+
+    # Scan the input file.
+    while (<$ifd>)
+    {
+        my $line = $_;
+
+        # Check for oid => 'nnnn', and replace if within mapped range.
+        if ($line =~ m/\b(oid\s*=>\s*)'(\d+)'/)
+        {
+            if (exists $maphash{$2})
+            {
+                my $repl = $1 . "'" . $maphash{$2} . "'";
+                $line =~ s/\boid\s*=>\s*'\d+'/$repl/;
+                $changed = 1;
+            }
+        }
+
+        # Likewise for array_type_oid.
+        if ($line =~ m/\b(array_type_oid\s*=>\s*)'(\d+)'/)
+        {
+            if (exists $maphash{$2})
+            {
+                my $repl = $1 . "'" . $maphash{$2} . "'";
+                $line =~ s/\barray_type_oid\s*=>\s*'\d+'/$repl/;
+                $changed = 1;
+            }
+        }
+
+        print $ofd $line;
+    }
+
+    close $ifd;
+    close $ofd;
+
+    # Avoid updating files if we didn't change them.
+    if ($changed || $output_path ne '')
+    {
+        rename $tmp_output_file, $output_file
+          or die "can't rename $tmp_output_file to $output_file: $!";
+    }
+    else
+    {
+        unlink $tmp_output_file
+          or die "can't unlink $tmp_output_file: $!";
+    }
+}
+
+sub usage
+{
+    my $last = $FirstGenbkiObjectId - 1;
+    die <<EOM;
+Usage: renumber_oids.pl [--output PATH] --first-mapped-oid X [--last-mapped-oid Y] --target-oid Z
+
+Options:
+    --output PATH           output directory (default '.')
+    --first-mapped-oid X    first OID to be moved
+    --last-mapped-oid Y     last OID to be moved (default $last)
+    --target-oid Z          first OID to move to
+
+Catalog *.h and *.dat files are updated and written to the
+output directory; by default, this overwrites the input files.
+
+Caution: the output PATH will be interpreted relative to
+src/include/catalog, even if you start the script
+in some other directory.
+
+EOM
+}
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index 3796971..5ee628c 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -3,6 +3,9 @@
  * toasting.h
  *      This file provides some definitions to support creation of toast tables
  *
+ * Caution: all #define's with numeric values in this file had better be
+ * object OIDs, else renumber_oids.pl might change them inappropriately.
+ *
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
diff --git a/src/tools/RELEASE_CHANGES b/src/tools/RELEASE_CHANGES
index edfd56e..920c6e9 100644
--- a/src/tools/RELEASE_CHANGES
+++ b/src/tools/RELEASE_CHANGES
@@ -58,11 +58,23 @@ in both HEAD and the branch.
     o doc/src/sgml/ref manual pages

 * Ports
-    o update config.guess and config.sub at the start of beta
-      (from http://savannah.gnu.org/projects/config)
     o update ports list in doc/src/sgml/installation.sgml
     o update platform-specific FAQ's, if needed

+
+Pre-Beta Tasks
+==============
+
+These things should be done at least once per development cycle.
+Typically we do them between feature freeze and start of beta test,
+but there may be reasons to do them at other times as well.
+
+* Renumber any manually-assigned OIDs between 8000 and 9999
+  to lower numbers, using renumber_oids.pl (see notes in bki.sgml)
+
+* Update config.guess and config.sub
+  (from http://savannah.gnu.org/projects/config)
+
 * Update inet/cidr data types with newest Bind patches



Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
I wrote:
> I've successfully done check-world after renumbering every OID above
> 4000 to somewhere else.  I also tried renumbering everything below
> 4000, which unsurprisingly blew up because there are various catalog
> columns we haven't fixed to use symbolic OIDs.  (The one that initdb
> immediately trips over is pg_database.dattablespace.)  I'm not sure
> if it's worth the trouble to make that totally clean, but I suspect
> we ought to at least mop up text-search references to be symbolic.
> That's material for a separate patch though.

So I couldn't resist poking at that, and after a couple hours' work
I have the attached patch, which removes all remaining hard-wired
OID references in the .dat files.

Using this, I renumbered all the OIDs in include/catalog, and behold
things pretty much worked.  I got through check-world after hacking
up these points:

* Unsurprisingly, there are lots of regression tests that have object
OIDs hard-wired in queries and/or expected output.

* initdb.c has a couple of places that know that template1 has OID 1.

* information_schema.sql has several SQL-language functions that
hard-wire the OIDs of assorted built-in types.

I'm not particularly fussed about the first two points, but the
last is a bit worrisome.  It's not too hard to imagine somebody
adding knowledge of their new type to those functions, and the
code getting broken by a renumbering pass, and us not noticing
if the point isn't stressed by a regression test (which mostly
those functions aren't).

We could imagine fixing those functions along the lines of

   CASE WHEN $2 = -1 /* default typmod */
        THEN null
-       WHEN $1 IN (1042, 1043) /* char, varchar */
+       WHEN $1 IN ('pg_catalog.bpchar'::pg_catalog.regtype,
+                   'pg_catalog.varchar'::pg_catalog.regtype)
        THEN $2 - 4

which would add some parsing overhead, but I'm not sure if anyone
would notice that.

            regards, tom lane

diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 10c2b24..ce159ae 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -174,6 +174,20 @@ foreach my $row (@{ $catalog_data{pg_am} })
     $amoids{ $row->{amname} } = $row->{oid};
 }

+# class (relation) OID lookup (note this only covers bootstrap catalogs!)
+my %classoids;
+foreach my $row (@{ $catalog_data{pg_class} })
+{
+    $classoids{ $row->{relname} } = $row->{oid};
+}
+
+# collation OID lookup
+my %collationoids;
+foreach my $row (@{ $catalog_data{pg_collation} })
+{
+    $collationoids{ $row->{collname} } = $row->{oid};
+}
+
 # language OID lookup
 my %langoids;
 foreach my $row (@{ $catalog_data{pg_language} })
@@ -243,6 +257,41 @@ foreach my $row (@{ $catalog_data{pg_proc} })
     }
 }

+# tablespace OID lookup
+my %tablespaceoids;
+foreach my $row (@{ $catalog_data{pg_tablespace} })
+{
+    $tablespaceoids{ $row->{spcname} } = $row->{oid};
+}
+
+# text search configuration OID lookup
+my %tsconfigoids;
+foreach my $row (@{ $catalog_data{pg_ts_config} })
+{
+    $tsconfigoids{ $row->{cfgname} } = $row->{oid};
+}
+
+# text search dictionary OID lookup
+my %tsdictoids;
+foreach my $row (@{ $catalog_data{pg_ts_dict} })
+{
+    $tsdictoids{ $row->{dictname} } = $row->{oid};
+}
+
+# text search parser OID lookup
+my %tsparseroids;
+foreach my $row (@{ $catalog_data{pg_ts_parser} })
+{
+    $tsparseroids{ $row->{prsname} } = $row->{oid};
+}
+
+# text search template OID lookup
+my %tstemplateoids;
+foreach my $row (@{ $catalog_data{pg_ts_template} })
+{
+    $tstemplateoids{ $row->{tmplname} } = $row->{oid};
+}
+
 # type lookups
 my %typeoids;
 my %types;
@@ -287,14 +336,21 @@ close $ef;

 # Map lookup name to the corresponding hash table.
 my %lookup_kind = (
-    pg_am       => \%amoids,
-    pg_language => \%langoids,
-    pg_opclass  => \%opcoids,
-    pg_operator => \%operoids,
-    pg_opfamily => \%opfoids,
-    pg_proc     => \%procoids,
-    pg_type     => \%typeoids,
-    encoding    => \%encids);
+    pg_am          => \%amoids,
+    pg_class       => \%classoids,
+    pg_collation   => \%collationoids,
+    pg_language    => \%langoids,
+    pg_opclass     => \%opcoids,
+    pg_operator    => \%operoids,
+    pg_opfamily    => \%opfoids,
+    pg_proc        => \%procoids,
+    pg_tablespace  => \%tablespaceoids,
+    pg_ts_config   => \%tsconfigoids,
+    pg_ts_dict     => \%tsdictoids,
+    pg_ts_parser   => \%tsparseroids,
+    pg_ts_template => \%tstemplateoids,
+    pg_type        => \%typeoids,
+    encoding       => \%encids);


 # Open temp files
@@ -730,8 +786,8 @@ sub morph_row_for_pgattr
     $row->{attndims} = $type->{typcategory} eq 'A' ? '1' : '0';

     # collation-aware catalog columns must use C collation
-    $row->{attcollation} = $type->{typcollation} != 0 ?
-        $C_COLLATION_OID : 0;
+    $row->{attcollation} =
+      $type->{typcollation} ne '0' ? $C_COLLATION_OID : 0;

     if (defined $attr->{forcenotnull})
     {
diff --git a/src/include/catalog/pg_class.dat b/src/include/catalog/pg_class.dat
index 5a1f3aa..511d876 100644
--- a/src/include/catalog/pg_class.dat
+++ b/src/include/catalog/pg_class.dat
@@ -21,48 +21,44 @@
 # similarly, "1" in relminmxid stands for FirstMultiXactId

 { oid => '1247',
-  relname => 'pg_type', relnamespace => 'PGNSP', reltype => '71',
-  reloftype => '0', relowner => 'PGUID', relam => 'PGHEAPAM',
-  relfilenode => '0', reltablespace => '0', relpages => '0', reltuples => '0',
-  relallvisible => '0', reltoastrelid => '0', relhasindex => 'f',
-  relisshared => 'f', relpersistence => 'p', relkind => 'r', relnatts => '31',
-  relchecks => '0', relhasrules => 'f', relhastriggers => 'f',
-  relhassubclass => 'f', relrowsecurity => 'f', relforcerowsecurity => 'f',
-  relispopulated => 't', relreplident => 'n', relispartition => 'f',
-  relrewrite => '0', relfrozenxid => '3', relminmxid => '1', relacl => '_null_',
+  relname => 'pg_type', reltype => 'pg_type', relam => 'PGHEAPAM',
+  relfilenode => '0', relpages => '0', reltuples => '0', relallvisible => '0',
+  reltoastrelid => '0', relhasindex => 'f', relisshared => 'f',
+  relpersistence => 'p', relkind => 'r', relnatts => '31', relchecks => '0',
+  relhasrules => 'f', relhastriggers => 'f', relhassubclass => 'f',
+  relrowsecurity => 'f', relforcerowsecurity => 'f', relispopulated => 't',
+  relreplident => 'n', relispartition => 'f', relrewrite => '0',
+  relfrozenxid => '3', relminmxid => '1', relacl => '_null_',
   reloptions => '_null_', relpartbound => '_null_' },
 { oid => '1249',
-  relname => 'pg_attribute', relnamespace => 'PGNSP', reltype => '75',
-  reloftype => '0', relowner => 'PGUID', relam => 'PGHEAPAM',
-  relfilenode => '0', reltablespace => '0', relpages => '0', reltuples => '0',
-  relallvisible => '0', reltoastrelid => '0', relhasindex => 'f',
-  relisshared => 'f', relpersistence => 'p', relkind => 'r', relnatts => '24',
-  relchecks => '0', relhasrules => 'f', relhastriggers => 'f',
-  relhassubclass => 'f', relrowsecurity => 'f', relforcerowsecurity => 'f',
-  relispopulated => 't', relreplident => 'n', relispartition => 'f',
-  relrewrite => '0', relfrozenxid => '3', relminmxid => '1', relacl => '_null_',
+  relname => 'pg_attribute', reltype => 'pg_attribute', relam => 'PGHEAPAM',
+  relfilenode => '0', relpages => '0', reltuples => '0', relallvisible => '0',
+  reltoastrelid => '0', relhasindex => 'f', relisshared => 'f',
+  relpersistence => 'p', relkind => 'r', relnatts => '24', relchecks => '0',
+  relhasrules => 'f', relhastriggers => 'f', relhassubclass => 'f',
+  relrowsecurity => 'f', relforcerowsecurity => 'f', relispopulated => 't',
+  relreplident => 'n', relispartition => 'f', relrewrite => '0',
+  relfrozenxid => '3', relminmxid => '1', relacl => '_null_',
   reloptions => '_null_', relpartbound => '_null_' },
 { oid => '1255',
-  relname => 'pg_proc', relnamespace => 'PGNSP', reltype => '81',
-  reloftype => '0', relowner => 'PGUID', relam => 'PGHEAPAM',
-  relfilenode => '0', reltablespace => '0', relpages => '0', reltuples => '0',
-  relallvisible => '0', reltoastrelid => '0', relhasindex => 'f',
-  relisshared => 'f', relpersistence => 'p', relkind => 'r', relnatts => '29',
-  relchecks => '0', relhasrules => 'f', relhastriggers => 'f',
-  relhassubclass => 'f', relrowsecurity => 'f', relforcerowsecurity => 'f',
-  relispopulated => 't', relreplident => 'n', relispartition => 'f',
-  relrewrite => '0', relfrozenxid => '3', relminmxid => '1', relacl => '_null_',
+  relname => 'pg_proc', reltype => 'pg_proc', relam => 'PGHEAPAM',
+  relfilenode => '0', relpages => '0', reltuples => '0', relallvisible => '0',
+  reltoastrelid => '0', relhasindex => 'f', relisshared => 'f',
+  relpersistence => 'p', relkind => 'r', relnatts => '29', relchecks => '0',
+  relhasrules => 'f', relhastriggers => 'f', relhassubclass => 'f',
+  relrowsecurity => 'f', relforcerowsecurity => 'f', relispopulated => 't',
+  relreplident => 'n', relispartition => 'f', relrewrite => '0',
+  relfrozenxid => '3', relminmxid => '1', relacl => '_null_',
   reloptions => '_null_', relpartbound => '_null_' },
 { oid => '1259',
-  relname => 'pg_class', relnamespace => 'PGNSP', reltype => '83',
-  reloftype => '0', relowner => 'PGUID', relam => 'PGHEAPAM',
-  relfilenode => '0', reltablespace => '0', relpages => '0', reltuples => '0',
-  relallvisible => '0', reltoastrelid => '0', relhasindex => 'f',
-  relisshared => 'f', relpersistence => 'p', relkind => 'r', relnatts => '33',
-  relchecks => '0', relhasrules => 'f', relhastriggers => 'f',
-  relhassubclass => 'f', relrowsecurity => 'f', relforcerowsecurity => 'f',
-  relispopulated => 't', relreplident => 'n', relispartition => 'f',
-  relrewrite => '0', relfrozenxid => '3', relminmxid => '1', relacl => '_null_',
+  relname => 'pg_class', reltype => 'pg_class', relam => 'PGHEAPAM',
+  relfilenode => '0', relpages => '0', reltuples => '0', relallvisible => '0',
+  reltoastrelid => '0', relhasindex => 'f', relisshared => 'f',
+  relpersistence => 'p', relkind => 'r', relnatts => '33', relchecks => '0',
+  relhasrules => 'f', relhastriggers => 'f', relhassubclass => 'f',
+  relrowsecurity => 'f', relforcerowsecurity => 'f', relispopulated => 't',
+  relreplident => 'n', relispartition => 'f', relrewrite => '0',
+  relfrozenxid => '3', relminmxid => '1', relacl => '_null_',
   reloptions => '_null_', relpartbound => '_null_' },

 ]
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index ad698c9..d88703d 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -28,56 +28,113 @@
  */
 CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,RelationRelation_Rowtype_Id) BKI_SCHEMA_MACRO
 {
-    Oid            oid;            /* oid */
-    NameData    relname;        /* class name */
-    Oid            relnamespace;    /* OID of namespace containing this class */
-    Oid            reltype;        /* OID of entry in pg_type for table's
-                                 * implicit row type */
-    Oid            reloftype;        /* OID of entry in pg_type for underlying
-                                 * composite type */
-    Oid            relowner;        /* class owner */
-    Oid            relam;            /* access method; 0 if not a table / index */
-    Oid            relfilenode;    /* identifier of physical storage file */
+    /* oid */
+    Oid            oid;

+    /* class name */
+    NameData    relname;
+
+    /* OID of namespace containing this class */
+    Oid            relnamespace BKI_DEFAULT(PGNSP);
+
+    /* OID of entry in pg_type for table's implicit row type */
+    Oid            reltype BKI_LOOKUP(pg_type);
+
+    /* OID of entry in pg_type for underlying composite type */
+    Oid            reloftype BKI_DEFAULT(0) BKI_LOOKUP(pg_type);
+
+    /* class owner */
+    Oid            relowner BKI_DEFAULT(PGUID);
+
+    /* access method; 0 if not a table / index */
+    Oid            relam;
+
+    /* identifier of physical storage file */
     /* relfilenode == 0 means it is a "mapped" relation, see relmapper.c */
-    Oid            reltablespace;    /* identifier of table space for relation */
-    int32        relpages;        /* # of blocks (not always up-to-date) */
-    float4        reltuples;        /* # of tuples (not always up-to-date) */
-    int32        relallvisible;    /* # of all-visible blocks (not always
-                                 * up-to-date) */
-    Oid            reltoastrelid;    /* OID of toast table; 0 if none */
-    bool        relhasindex;    /* T if has (or has had) any indexes */
-    bool        relisshared;    /* T if shared across databases */
-    char        relpersistence; /* see RELPERSISTENCE_xxx constants below */
-    char        relkind;        /* see RELKIND_xxx constants below */
-    int16        relnatts;        /* number of user attributes */
+    Oid            relfilenode;
+
+    /* identifier of table space for relation (0 means default for database) */
+    Oid            reltablespace BKI_DEFAULT(0) BKI_LOOKUP(pg_tablespace);
+
+    /* # of blocks (not always up-to-date) */
+    int32        relpages;
+
+    /* # of tuples (not always up-to-date) */
+    float4        reltuples;
+
+    /* # of all-visible blocks (not always up-to-date) */
+    int32        relallvisible;
+
+    /* OID of toast table; 0 if none */
+    Oid            reltoastrelid;
+
+    /* T if has (or has had) any indexes */
+    bool        relhasindex;
+
+    /* T if shared across databases */
+    bool        relisshared;
+
+    /* see RELPERSISTENCE_xxx constants below */
+    char        relpersistence;
+
+    /* see RELKIND_xxx constants below */
+    char        relkind;
+
+    /* number of user attributes */
+    int16        relnatts;

     /*
      * Class pg_attribute must contain exactly "relnatts" user attributes
      * (with attnums ranging from 1 to relnatts) for this class.  It may also
      * contain entries with negative attnums for system attributes.
      */
-    int16        relchecks;        /* # of CHECK constraints for class */
-    bool        relhasrules;    /* has (or has had) any rules */
-    bool        relhastriggers; /* has (or has had) any TRIGGERs */
-    bool        relhassubclass; /* has (or has had) child tables or indexes */
-    bool        relrowsecurity; /* row security is enabled or not */
-    bool        relforcerowsecurity;    /* row security forced for owners or
-                                         * not */
-    bool        relispopulated; /* matview currently holds query results */
-    char        relreplident;    /* see REPLICA_IDENTITY_xxx constants  */
-    bool        relispartition; /* is relation a partition? */
-    Oid            relrewrite;        /* heap for rewrite during DDL, link to
-                                 * original rel */
-    TransactionId relfrozenxid; /* all Xids < this are frozen in this rel */
-    TransactionId relminmxid;    /* all multixacts in this rel are >= this.
-                                 * this is really a MultiXactId */
+
+    /* # of CHECK constraints for class */
+    int16        relchecks;
+
+    /* has (or has had) any rules */
+    bool        relhasrules;
+
+    /* has (or has had) any TRIGGERs */
+    bool        relhastriggers;
+
+    /* has (or has had) child tables or indexes */
+    bool        relhassubclass;
+
+    /* row security is enabled or not */
+    bool        relrowsecurity;
+
+    /* row security forced for owners or not */
+    bool        relforcerowsecurity;
+
+    /* matview currently holds query results */
+    bool        relispopulated;
+
+    /* see REPLICA_IDENTITY_xxx constants */
+    char        relreplident;
+
+    /* is relation a partition? */
+    bool        relispartition;
+
+    /* heap for rewrite during DDL, link to original rel */
+    Oid            relrewrite;
+
+    /* all Xids < this are frozen in this rel */
+    TransactionId relfrozenxid;
+
+    /* all multixacts in this rel are >= this; it is really a MultiXactId */
+    TransactionId relminmxid;

 #ifdef CATALOG_VARLEN            /* variable-length fields start here */
     /* NOTE: These fields are not present in a relcache entry's rd_rel field. */
-    aclitem        relacl[1];        /* access permissions */
-    text        reloptions[1];    /* access-method-specific options */
-    pg_node_tree relpartbound;    /* partition bound node tree */
+    /* access permissions */
+    aclitem        relacl[1];
+
+    /* access-method-specific options */
+    text        reloptions[1];
+
+    /* partition bound node tree */
+    pg_node_tree relpartbound;
 #endif
 } FormData_pg_class;

diff --git a/src/include/catalog/pg_database.dat b/src/include/catalog/pg_database.dat
index cbd91bc..89bd75d 100644
--- a/src/include/catalog/pg_database.dat
+++ b/src/include/catalog/pg_database.dat
@@ -14,10 +14,9 @@

 { oid => '1', oid_symbol => 'TemplateDbOid',
   descr => 'default template for new databases',
-  datname => 'template1', datdba => 'PGUID', encoding => 'ENCODING',
-  datcollate => 'LC_COLLATE', datctype => 'LC_CTYPE', datistemplate => 't',
-  datallowconn => 't', datconnlimit => '-1', datlastsysoid => '0',
-  datfrozenxid => '0', datminmxid => '1', dattablespace => '1663',
-  datacl => '_null_' },
+  datname => 'template1', encoding => 'ENCODING', datcollate => 'LC_COLLATE',
+  datctype => 'LC_CTYPE', datistemplate => 't', datallowconn => 't',
+  datconnlimit => '-1', datlastsysoid => '0', datfrozenxid => '0',
+  datminmxid => '1', dattablespace => 'pg_default', datacl => '_null_' },

 ]
diff --git a/src/include/catalog/pg_database.h b/src/include/catalog/pg_database.h
index 63e8efa..06fea45 100644
--- a/src/include/catalog/pg_database.h
+++ b/src/include/catalog/pg_database.h
@@ -28,22 +28,48 @@
  */
 CATALOG(pg_database,1262,DatabaseRelationId) BKI_SHARED_RELATION BKI_ROWTYPE_OID(1248,DatabaseRelation_Rowtype_Id) BKI_SCHEMA_MACRO
 {
-    Oid            oid;            /* oid */
-    NameData    datname;        /* database name */
-    Oid            datdba;            /* owner of database */
-    int32        encoding;        /* character encoding */
-    NameData    datcollate;        /* LC_COLLATE setting */
-    NameData    datctype;        /* LC_CTYPE setting */
-    bool        datistemplate;    /* allowed as CREATE DATABASE template? */
-    bool        datallowconn;    /* new connections allowed? */
-    int32        datconnlimit;    /* max connections allowed (-1=no limit) */
-    Oid            datlastsysoid;    /* highest OID to consider a system OID */
-    TransactionId datfrozenxid; /* all Xids < this are frozen in this DB */
-    TransactionId datminmxid;    /* all multixacts in the DB are >= this */
-    Oid            dattablespace;    /* default table space for this DB */
+    /* oid */
+    Oid            oid;
+
+    /* database name */
+    NameData    datname;
+
+    /* owner of database */
+    Oid            datdba BKI_DEFAULT(PGUID);
+
+    /* character encoding */
+    int32        encoding;
+
+    /* LC_COLLATE setting */
+    NameData    datcollate;
+
+    /* LC_CTYPE setting */
+    NameData    datctype;
+
+    /* allowed as CREATE DATABASE template? */
+    bool        datistemplate;
+
+    /* new connections allowed? */
+    bool        datallowconn;
+
+    /* max connections allowed (-1=no limit) */
+    int32        datconnlimit;
+
+    /* highest OID to consider a system OID */
+    Oid            datlastsysoid;
+
+    /* all Xids < this are frozen in this DB */
+    TransactionId datfrozenxid;
+
+    /* all multixacts in the DB are >= this */
+    TransactionId datminmxid;
+
+    /* default table space for this DB */
+    Oid            dattablespace BKI_LOOKUP(pg_tablespace);

 #ifdef CATALOG_VARLEN            /* variable-length fields start here */
-    aclitem        datacl[1];        /* access permissions */
+    /* access permissions */
+    aclitem        datacl[1];
 #endif
 } FormData_pg_database;

diff --git a/src/include/catalog/pg_ts_config.dat b/src/include/catalog/pg_ts_config.dat
index 47d2342..bddaa81 100644
--- a/src/include/catalog/pg_ts_config.dat
+++ b/src/include/catalog/pg_ts_config.dat
@@ -13,7 +13,6 @@
 [

 { oid => '3748', descr => 'simple configuration',
-  cfgname => 'simple', cfgnamespace => 'PGNSP', cfgowner => 'PGUID',
-  cfgparser => '3722' },
+  cfgname => 'simple', cfgparser => 'default' },

 ]
diff --git a/src/include/catalog/pg_ts_config.h b/src/include/catalog/pg_ts_config.h
index 4531c3e..7ab97a8 100644
--- a/src/include/catalog/pg_ts_config.h
+++ b/src/include/catalog/pg_ts_config.h
@@ -29,11 +29,20 @@
  */
 CATALOG(pg_ts_config,3602,TSConfigRelationId)
 {
-    Oid            oid;            /* oid */
-    NameData    cfgname;        /* name of configuration */
-    Oid            cfgnamespace;    /* name space */
-    Oid            cfgowner;        /* owner */
-    Oid            cfgparser;        /* OID of parser (in pg_ts_parser) */
+    /* oid */
+    Oid            oid;
+
+    /* name of configuration */
+    NameData    cfgname;
+
+    /* name space */
+    Oid            cfgnamespace BKI_DEFAULT(PGNSP);
+
+    /* owner */
+    Oid            cfgowner BKI_DEFAULT(PGUID);
+
+    /* OID of parser */
+    Oid            cfgparser BKI_LOOKUP(pg_ts_parser);
 } FormData_pg_ts_config;

 typedef FormData_pg_ts_config *Form_pg_ts_config;
diff --git a/src/include/catalog/pg_ts_config_map.dat b/src/include/catalog/pg_ts_config_map.dat
index 3ce4e16..43a8bd4 100644
--- a/src/include/catalog/pg_ts_config_map.dat
+++ b/src/include/catalog/pg_ts_config_map.dat
@@ -12,24 +12,43 @@

 [

-{ mapcfg => '3748', maptokentype => '1', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '2', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '3', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '4', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '5', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '6', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '7', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '8', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '9', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '10', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '11', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '15', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '16', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '17', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '18', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '19', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '20', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '21', mapseqno => '1', mapdict => '3765' },
-{ mapcfg => '3748', maptokentype => '22', mapseqno => '1', mapdict => '3765' },
+{ mapcfg => 'simple', maptokentype => '1', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '2', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '3', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '4', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '5', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '6', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '7', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '8', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '9', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '10', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '11', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '15', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '16', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '17', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '18', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '19', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '20', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '21', mapseqno => '1',
+  mapdict => 'simple' },
+{ mapcfg => 'simple', maptokentype => '22', mapseqno => '1',
+  mapdict => 'simple' },

 ]
diff --git a/src/include/catalog/pg_ts_config_map.h b/src/include/catalog/pg_ts_config_map.h
index dd2de48..7892e17 100644
--- a/src/include/catalog/pg_ts_config_map.h
+++ b/src/include/catalog/pg_ts_config_map.h
@@ -29,10 +29,17 @@
  */
 CATALOG(pg_ts_config_map,3603,TSConfigMapRelationId)
 {
-    Oid            mapcfg;            /* OID of configuration owning this entry */
-    int32        maptokentype;    /* token type from parser */
-    int32        mapseqno;        /* order in which to consult dictionaries */
-    Oid            mapdict;        /* dictionary to consult */
+    /* OID of configuration owning this entry */
+    Oid            mapcfg BKI_LOOKUP(pg_ts_config);
+
+    /* token type from parser */
+    int32        maptokentype;
+
+    /* order in which to consult dictionaries */
+    int32        mapseqno;
+
+    /* dictionary to consult */
+    Oid            mapdict BKI_LOOKUP(pg_ts_dict);
 } FormData_pg_ts_config_map;

 typedef FormData_pg_ts_config_map *Form_pg_ts_config_map;
diff --git a/src/include/catalog/pg_ts_dict.dat b/src/include/catalog/pg_ts_dict.dat
index cda413a..f9d50da 100644
--- a/src/include/catalog/pg_ts_dict.dat
+++ b/src/include/catalog/pg_ts_dict.dat
@@ -14,7 +14,6 @@

 { oid => '3765',
   descr => 'simple dictionary: just lower case and check for stopword',
-  dictname => 'simple', dictnamespace => 'PGNSP', dictowner => 'PGUID',
-  dicttemplate => '3727', dictinitoption => '_null_' },
+  dictname => 'simple', dicttemplate => 'simple', dictinitoption => '_null_' },

 ]
diff --git a/src/include/catalog/pg_ts_dict.h b/src/include/catalog/pg_ts_dict.h
index b77c422..be7f016 100644
--- a/src/include/catalog/pg_ts_dict.h
+++ b/src/include/catalog/pg_ts_dict.h
@@ -28,14 +28,24 @@
  */
 CATALOG(pg_ts_dict,3600,TSDictionaryRelationId)
 {
-    Oid            oid;            /* oid */
-    NameData    dictname;        /* dictionary name */
-    Oid            dictnamespace;    /* name space */
-    Oid            dictowner;        /* owner */
-    Oid            dicttemplate;    /* dictionary's template */
+    /* oid */
+    Oid            oid;
+
+    /* dictionary name */
+    NameData    dictname;
+
+    /* name space */
+    Oid            dictnamespace BKI_DEFAULT(PGNSP);
+
+    /* owner */
+    Oid            dictowner BKI_DEFAULT(PGUID);
+
+    /* dictionary's template */
+    Oid            dicttemplate BKI_LOOKUP(pg_ts_template);

 #ifdef CATALOG_VARLEN            /* variable-length fields start here */
-    text        dictinitoption; /* options passed to dict_init() */
+    /* options passed to dict_init() */
+    text        dictinitoption;
 #endif
 } FormData_pg_ts_dict;

diff --git a/src/include/catalog/pg_type.dat b/src/include/catalog/pg_type.dat
index d129583..2495ed6 100644
--- a/src/include/catalog/pg_type.dat
+++ b/src/include/catalog/pg_type.dat
@@ -54,7 +54,7 @@
   typname => 'name', typlen => 'NAMEDATALEN', typbyval => 'f',
   typcategory => 'S', typelem => 'char', typinput => 'namein',
   typoutput => 'nameout', typreceive => 'namerecv', typsend => 'namesend',
-  typalign => 'c', typcollation => '950' },
+  typalign => 'c', typcollation => 'C' },
 { oid => '20', array_type_oid => '1016',
   descr => '~18 digit integer, 8-byte storage',
   typname => 'int8', typlen => '8', typbyval => 'FLOAT8PASSBYVAL',
@@ -85,7 +85,7 @@
   typname => 'text', typlen => '-1', typbyval => 'f', typcategory => 'S',
   typispreferred => 't', typinput => 'textin', typoutput => 'textout',
   typreceive => 'textrecv', typsend => 'textsend', typalign => 'i',
-  typstorage => 'x', typcollation => '100' },
+  typstorage => 'x', typcollation => 'default' },
 { oid => '26', array_type_oid => '1028',
   descr => 'object identifier(oid), maximum 4 billion',
   typname => 'oid', typlen => '4', typbyval => 't', typcategory => 'N',
@@ -115,22 +115,22 @@
 # NB: OIDs assigned here must match the BKI_ROWTYPE_OID declarations
 { oid => '71',
   typname => 'pg_type', typlen => '-1', typbyval => 'f', typtype => 'c',
-  typcategory => 'C', typrelid => '1247', typinput => 'record_in',
+  typcategory => 'C', typrelid => 'pg_type', typinput => 'record_in',
   typoutput => 'record_out', typreceive => 'record_recv',
   typsend => 'record_send', typalign => 'd', typstorage => 'x' },
 { oid => '75',
   typname => 'pg_attribute', typlen => '-1', typbyval => 'f', typtype => 'c',
-  typcategory => 'C', typrelid => '1249', typinput => 'record_in',
+  typcategory => 'C', typrelid => 'pg_attribute', typinput => 'record_in',
   typoutput => 'record_out', typreceive => 'record_recv',
   typsend => 'record_send', typalign => 'd', typstorage => 'x' },
 { oid => '81',
   typname => 'pg_proc', typlen => '-1', typbyval => 'f', typtype => 'c',
-  typcategory => 'C', typrelid => '1255', typinput => 'record_in',
+  typcategory => 'C', typrelid => 'pg_proc', typinput => 'record_in',
   typoutput => 'record_out', typreceive => 'record_recv',
   typsend => 'record_send', typalign => 'd', typstorage => 'x' },
 { oid => '83',
   typname => 'pg_class', typlen => '-1', typbyval => 'f', typtype => 'c',
-  typcategory => 'C', typrelid => '1259', typinput => 'record_in',
+  typcategory => 'C', typrelid => 'pg_class', typinput => 'record_in',
   typoutput => 'record_out', typreceive => 'record_recv',
   typsend => 'record_send', typalign => 'd', typstorage => 'x' },

@@ -150,21 +150,21 @@
   typcategory => 'S', typinput => 'pg_node_tree_in',
   typoutput => 'pg_node_tree_out', typreceive => 'pg_node_tree_recv',
   typsend => 'pg_node_tree_send', typalign => 'i', typstorage => 'x',
-  typcollation => '100' },
+  typcollation => 'default' },
 { oid => '3361', oid_symbol => 'PGNDISTINCTOID',
   descr => 'multivariate ndistinct coefficients',
   typname => 'pg_ndistinct', typlen => '-1', typbyval => 'f',
   typcategory => 'S', typinput => 'pg_ndistinct_in',
   typoutput => 'pg_ndistinct_out', typreceive => 'pg_ndistinct_recv',
   typsend => 'pg_ndistinct_send', typalign => 'i', typstorage => 'x',
-  typcollation => '100' },
+  typcollation => 'default' },
 { oid => '3402', oid_symbol => 'PGDEPENDENCIESOID',
   descr => 'multivariate dependencies',
   typname => 'pg_dependencies', typlen => '-1', typbyval => 'f',
   typcategory => 'S', typinput => 'pg_dependencies_in',
   typoutput => 'pg_dependencies_out', typreceive => 'pg_dependencies_recv',
   typsend => 'pg_dependencies_send', typalign => 'i', typstorage => 'x',
-  typcollation => '100' },
+  typcollation => 'default' },
 { oid => '32', oid_symbol => 'PGDDLCOMMANDOID',
   descr => 'internal type for passing CollectedCommand',
   typname => 'pg_ddl_command', typlen => 'SIZEOF_POINTER', typbyval => 't',
@@ -269,14 +269,14 @@
   typinput => 'bpcharin', typoutput => 'bpcharout', typreceive => 'bpcharrecv',
   typsend => 'bpcharsend', typmodin => 'bpchartypmodin',
   typmodout => 'bpchartypmodout', typalign => 'i', typstorage => 'x',
-  typcollation => '100' },
+  typcollation => 'default' },
 { oid => '1043', array_type_oid => '1015',
   descr => 'varchar(length), non-blank-padded string, variable storage length',
   typname => 'varchar', typlen => '-1', typbyval => 'f', typcategory => 'S',
   typinput => 'varcharin', typoutput => 'varcharout',
   typreceive => 'varcharrecv', typsend => 'varcharsend',
   typmodin => 'varchartypmodin', typmodout => 'varchartypmodout',
-  typalign => 'i', typstorage => 'x', typcollation => '100' },
+  typalign => 'i', typstorage => 'x', typcollation => 'default' },
 { oid => '1082', array_type_oid => '1182', descr => 'date',
   typname => 'date', typlen => '4', typbyval => 't', typcategory => 'D',
   typinput => 'date_in', typoutput => 'date_out', typreceive => 'date_recv',
diff --git a/src/include/catalog/pg_type.h b/src/include/catalog/pg_type.h
index 24d114b..4a24739 100644
--- a/src/include/catalog/pg_type.h
+++ b/src/include/catalog/pg_type.h
@@ -99,7 +99,7 @@ CATALOG(pg_type,1247,TypeRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(71,TypeRelati
     char        typdelim BKI_DEFAULT(',');

     /* associated pg_class OID if a composite type, else 0 */
-    Oid            typrelid BKI_DEFAULT(0) BKI_ARRAY_DEFAULT(0);
+    Oid            typrelid BKI_DEFAULT(0) BKI_ARRAY_DEFAULT(0) BKI_LOOKUP(pg_class);

     /*
      * If typelem is not 0 then it identifies another row in pg_type. The
@@ -215,7 +215,7 @@ CATALOG(pg_type,1247,TypeRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(71,TypeRelati
      * DEFAULT_COLLATION_OID) for collatable base types, possibly some other
      * OID for domains over collatable types
      */
-    Oid            typcollation BKI_DEFAULT(0);
+    Oid            typcollation BKI_DEFAULT(0) BKI_LOOKUP(pg_collation);

 #ifdef CATALOG_VARLEN            /* variable-length fields start here */


Re: Why don't we have a small reserved OID range for patch revisions?

From
John Naylor
Date:
On Tue, Mar 12, 2019 at 5:36 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> This seems committable from my end --- any further comments?

I gave it a read and it looks good to me, but I haven't tried to run it.

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Why don't we have a small reserved OID range for patch revisions?

From
Tom Lane
Date:
John Naylor <john.naylor@2ndquadrant.com> writes:
> On Tue, Mar 12, 2019 at 5:36 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This seems committable from my end --- any further comments?

> I gave it a read and it looks good to me, but I haven't tried to run it.

Thanks for checking.  I've pushed both patches now.

I noticed while looking at the pg_class data that someone had stuck in a
hack to make genbki.pl substitute for "PGHEAPAM", which AFAICS is just
following the bad old precedent of PGNSP and PGUID.  I got rid of that
in favor of using the already-existing BKI_LOOKUP(pg_am) mechanism.
Maybe someday we should try to get rid of PGNSP and PGUID too, although
there are stumbling blocks in the way of both:

* PGNSP is also substituted for in the bodies of some SQL procedures.

* Replacing PGUID with the actual name of the bootstrap superuser is a
bit problematic because that name isn't necessarily "postgres".  We
could probably make it work, but I'm not convinced it'd be any less
confusing than the existing special-case behavior is.

Anyway I think we're basically done here.  There's some additional
cleanup that could possibly be done, like removing the hard-wired
references to OID 1 in initdb.c.  But I'm having a hard time convincing
myself that it's worth the trouble, except maybe for the question of
information_schema.sql's hard-wired type OIDs.  Even there, it's
certainly possible for a patch to use a regtype constant even if
the existing code doesn't.

            regards, tom lane