Home > mailing lists
Re: Why don't we have a small reserved OID range for patch revisions? - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Why don't we have a small reserved OID range for patch revisions?
Date	March 11, 2019 21:36:24
Msg-id	24698.1552340184@sss.pgh.pa.us Whole thread Raw
In response to	Re: Why don't we have a small reserved OID range for patch revisions? (John Naylor <john.naylor@2ndquadrant.com>)
Responses	Re: Why don't we have a small reserved OID range for patch revisions? Re: Why don't we have a small reserved OID range for patch revisions?
List	pgsql-hackers
Tree view
John Naylor <john.naylor@2ndquadrant.com> writes:
> Now it looks like:
> perl  renumber_oids.pl  --first-mapped-oid 8000 --last-mapped-oid 8999
>  --first-target-oid 2000  *.dat
> To prevent a maintenance headache, I didn't copy any of the formatting
> logic over. You'll also have to run reformat_dat_files.pl afterwards
> to restore that. It seems to work, but I haven't tested thoroughly.

I didn't like the use of Data::Dumper, because it made it quite impossible
to check what the script had done by eyeball.  After some thought
I concluded that we could probably just apply the changes via
search-and-replace, which is pretty ugly and low-tech but it leads to
easily diffable results, whether or not the initial state is exactly
what reformat_dat_files would produce.

I also changed things so that the OID mapping is computed before we start
changing any files, because as it stood the objects would get renumbered
in a pretty random order; and I renamed one of the switches so they all
have unique one-letter abbreviations.

Experimenting with this, I realized that it couldn't renumber OIDs that
are defined in .h files rather than .dat files, which is a serious
deficiency, but given the search-and-replace implementation it's not too
hard to fix up the .h files as well.  So I did that, and removed the
expectation that the target files would be listed on the command line;
that seems more likely to be a foot-gun than to do anything useful.

I've successfully done check-world after renumbering every OID above
4000 to somewhere else.  I also tried renumbering everything below
4000, which unsurprisingly blew up because there are various catalog
columns we haven't fixed to use symbolic OIDs.  (The one that initdb
immediately trips over is pg_database.dattablespace.)  I'm not sure
if it's worth the trouble to make that totally clean, but I suspect
we ought to at least mop up text-search references to be symbolic.
That's material for a separate patch though.

This seems committable from my end --- any further comments?

            regards, tom lane

diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml
index 3c5830b..b6038b9 100644
--- a/doc/src/sgml/bki.sgml
+++ b/doc/src/sgml/bki.sgml
@@ -388,8 +388,25 @@
     to see which ones do not appear.  You can also use
     the <filename>duplicate_oids</filename> script to check for mistakes.
     (<filename>genbki.pl</filename> will assign OIDs for any rows that
-    didn't get one hand-assigned to them and also detect duplicate OIDs
-    at compile time.)
+    didn't get one hand-assigned to them, and it will also detect duplicate
+    OIDs at compile time.)
+   </para>
+
+   <para>
+    When choosing OIDs for a patch that is not expected to be committed
+    immediately, best practice is to use a group of more-or-less
+    consecutive OIDs starting with some random choice in the range
+    8000—9999.  This minimizes the risk of OID collisions with other
+    patches being developed concurrently.  To keep the 8000—9999
+    range free for development purposes, after a patch has been committed
+    to the master git repository its OIDs should be renumbered into
+    available space below that range.  Typically, this will be done
+    near the end of each development cycle, moving all OIDs consumed by
+    patches committed in that cycle at the same time.  The script
+    <filename>renumber_oids.pl</filename> can be used for this purpose.
+    If an uncommitted patch is found to have OID conflicts with some
+    recently-committed patch, <filename>renumber_oids.pl</filename> may
+    also be useful for recovering from that situation.
    </para>

    <para>
diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index 3bf308f..368b1de 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -87,6 +87,8 @@ sub ParseHeader
         }

         # Push the data into the appropriate data structure.
+        # Caution: when adding new recognized OID-defining macros,
+        # also update src/include/catalog/renumber_oids.pl.
         if (/^DECLARE_TOAST\(\s*(\w+),\s*(\d+),\s*(\d+)\)/)
         {
             push @{ $catalog{toasting} },
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index 833fad1..f253613 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -4,6 +4,9 @@
  *      This file provides some definitions to support indexing
  *      on system catalogs
  *
+ * Caution: all #define's with numeric values in this file had better be
+ * object OIDs, else renumber_oids.pl might change them inappropriately.
+ *
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
diff --git a/src/include/catalog/renumber_oids.pl b/src/include/catalog/renumber_oids.pl
new file mode 100755
index 0000000..8a07340
--- /dev/null
+++ b/src/include/catalog/renumber_oids.pl
@@ -0,0 +1,291 @@
+#!/usr/bin/perl
+#----------------------------------------------------------------------
+#
+# renumber_oids.pl
+#    Perl script that shifts a range of OIDs in the Postgres catalog data
+#    to a different range, skipping any OIDs that are already in use.
+#
+#    Note: This does not reformat the .dat files, so you may want
+#    to run reformat_dat_file.pl afterwards.
+#
+# Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/include/catalog/renumber_oids.pl
+#
+#----------------------------------------------------------------------
+
+use strict;
+use warnings;
+
+use FindBin;
+use Getopt::Long;
+
+# Must run in src/include/catalog
+chdir $FindBin::RealBin or die "could not cd to $FindBin::RealBin: $!\n";
+
+use lib "$FindBin::RealBin/../../backend/catalog/";
+use Catalog;
+
+# We'll need this number.
+my $FirstGenbkiObjectId =
+  Catalog::FindDefinedSymbol('access/transam.h', '..', 'FirstGenbkiObjectId');
+
+# Process command line switches.
+my $output_path      = '';
+my $first_mapped_oid = 0;
+my $last_mapped_oid  = $FirstGenbkiObjectId - 1;
+my $target_oid       = 0;
+
+GetOptions(
+    'output=s'           => \$output_path,
+    'first-mapped-oid=i' => \$first_mapped_oid,
+    'last-mapped-oid=i'  => \$last_mapped_oid,
+    'target-oid=i'       => \$target_oid) || usage();
+
+# Sanity check arguments.
+die "Unexpected non-switch arguments.\n" if @ARGV;
+die "--first-mapped-oid must be specified.\n"
+  if $first_mapped_oid <= 0;
+die "Empty mapped OID range.\n"
+  if $last_mapped_oid < $first_mapped_oid;
+die "--target-oid must be specified.\n"
+  if $target_oid <= 0;
+die "--target-oid must not be within mapped OID range.\n"
+  if $target_oid >= $first_mapped_oid && $target_oid <= $last_mapped_oid;
+
+# Make sure output_path ends in a slash.
+if ($output_path ne '' && substr($output_path, -1) ne '/')
+{
+    $output_path .= '/';
+}
+
+# Collect all the existing assigned OIDs (including those to be remapped).
+my @header_files = (glob("pg_*.h"), qw(indexing.h toasting.h));
+my $oids = Catalog::FindAllOidsFromHeaders(@header_files);
+
+# Hash-ify the existing OIDs for convenient lookup.
+my %oidhash;
+@oidhash{@$oids} = undef;
+
+# Select new OIDs for existing OIDs in the mapped range.
+# We do this first so that we preserve the ordering of the mapped OIDs
+# (for reproducibility's sake), and so that if we fail due to running out
+# of OID room, that happens before we've overwritten any files.
+my %maphash;
+my $next_oid = $target_oid;
+
+for (
+    my $mapped_oid = $first_mapped_oid;
+    $mapped_oid <= $last_mapped_oid;
+    $mapped_oid++)
+{
+    next if !exists $oidhash{$mapped_oid};
+    $next_oid++
+      while (
+        exists $oidhash{$next_oid}
+        || (   $next_oid >= $first_mapped_oid
+            && $next_oid <= $last_mapped_oid));
+    die "Reached FirstGenbkiObjectId before assigning all OIDs.\n"
+      if $next_oid >= $FirstGenbkiObjectId;
+    $maphash{$mapped_oid} = $next_oid;
+    $next_oid++;
+}
+
+die "There are no OIDs in the mapped range.\n" if $next_oid == $target_oid;
+
+# Read each .h file and write out modified data.
+foreach my $input_file (@header_files)
+{
+    $input_file =~ /(\w+)\.h$/
+      or die "Input file $input_file needs to be a .h file.\n";
+    my $catname = $1;
+
+    # Ignore generated *_d.h files.
+    next if $catname =~ /_d$/;
+
+    open(my $ifd, '<', $input_file) || die "$input_file: $!";
+
+    # Write output files to specified directory.
+    # Use a .tmp suffix, then rename into place, in case we're overwriting.
+    my $output_file     = "$output_path$catname.h";
+    my $tmp_output_file = "$output_file.tmp";
+    open my $ofd, '>', $tmp_output_file
+      or die "can't open $tmp_output_file: $!";
+    my $changed = 0;
+
+    # Scan the input file.
+    while (<$ifd>)
+    {
+        my $line = $_;
+
+        # Check for OID-defining macros that Catalog::ParseHeader knows about,
+        # and update OIDs as needed.
+        if ($line =~ m/^(DECLARE_TOAST\(\s*\w+,\s*)(\d+)(,\s*)(\d+)\)/)
+        {
+            my $oid2 = $2;
+            my $oid4 = $4;
+            if (exists $maphash{$oid2})
+            {
+                $oid2 = $maphash{$oid2};
+                my $repl = $1 . $oid2 . $3 . $oid4 . ")";
+                $line =~ s/^DECLARE_TOAST\(\s*\w+,\s*\d+,\s*\d+\)/$repl/;
+                $changed = 1;
+            }
+            if (exists $maphash{$oid4})
+            {
+                $oid4 = $maphash{$oid4};
+                my $repl = $1 . $oid2 . $3 . $oid4 . ")";
+                $line =~ s/^DECLARE_TOAST\(\s*\w+,\s*\d+,\s*\d+\)/$repl/;
+                $changed = 1;
+            }
+        }
+        elsif (
+            $line =~ m/^(DECLARE_(UNIQUE_)?INDEX\(\s*\w+,\s*)(\d+)(,\s*.+)\)/)
+        {
+            if (exists $maphash{$3})
+            {
+                my $repl = $1 . $maphash{$3} . $4 . ")";
+                $line =~
+                  s/^DECLARE_(UNIQUE_)?INDEX\(\s*\w+,\s*\d+,\s*.+\)/$repl/;
+                $changed = 1;
+            }
+        }
+        elsif ($line =~ m/^CATALOG\((\w+),(\d+),(\w+)\)/)
+        {
+            if (exists $maphash{$2})
+            {
+                my $repl =
+                  "CATALOG(" . $1 . "," . $maphash{$2} . "," . $3 . ")";
+                $line =~ s/^CATALOG\(\w+,\d+,\w+\)/$repl/;
+                $changed = 1;
+            }
+
+            if ($line =~ m/BKI_ROWTYPE_OID\((\d+),(\w+)\)/)
+            {
+                if (exists $maphash{$1})
+                {
+                    my $repl =
+                      "BKI_ROWTYPE_OID(" . $maphash{$1} . "," . $2 . ")";
+                    $line =~ s/BKI_ROWTYPE_OID\(\d+,\w+\)/$repl/;
+                    $changed = 1;
+                }
+            }
+        }
+
+        # In indexing.h and toasting.h only, check for #define SYM nnnn,
+        # and replace if within mapped range.
+        elsif ($line =~ m/^(\s*#\s*define\s+\w+\s+)(\d+)\b/)
+        {
+            if (($catname eq 'indexing' || $catname eq 'toasting')
+                && exists $maphash{$2})
+            {
+                my $repl = $1 . $maphash{$2};
+                $line =~ s/^\s*#\s*define\s+\w+\s+\d+\b/$repl/;
+                $changed = 1;
+            }
+        }
+
+        print $ofd $line;
+    }
+
+    close $ifd;
+    close $ofd;
+
+    # Avoid updating files if we didn't change them.
+    if ($changed || $output_path ne '')
+    {
+        rename $tmp_output_file, $output_file
+          or die "can't rename $tmp_output_file to $output_file: $!";
+    }
+    else
+    {
+        unlink $tmp_output_file
+          or die "can't unlink $tmp_output_file: $!";
+    }
+}
+
+# Likewise, read each .dat file and write out modified data.
+foreach my $input_file (glob("pg_*.dat"))
+{
+    $input_file =~ /(\w+)\.dat$/
+      or die "Input file $input_file needs to be a .dat file.\n";
+    my $catname = $1;
+
+    open(my $ifd, '<', $input_file) || die "$input_file: $!";
+
+    # Write output files to specified directory.
+    # Use a .tmp suffix, then rename into place, in case we're overwriting.
+    my $output_file     = "$output_path$catname.dat";
+    my $tmp_output_file = "$output_file.tmp";
+    open my $ofd, '>', $tmp_output_file
+      or die "can't open $tmp_output_file: $!";
+    my $changed = 0;
+
+    # Scan the input file.
+    while (<$ifd>)
+    {
+        my $line = $_;
+
+        # Check for oid => 'nnnn', and replace if within mapped range.
+        if ($line =~ m/\b(oid\s*=>\s*)'(\d+)'/)
+        {
+            if (exists $maphash{$2})
+            {
+                my $repl = $1 . "'" . $maphash{$2} . "'";
+                $line =~ s/\boid\s*=>\s*'\d+'/$repl/;
+                $changed = 1;
+            }
+        }
+
+        # Likewise for array_type_oid.
+        if ($line =~ m/\b(array_type_oid\s*=>\s*)'(\d+)'/)
+        {
+            if (exists $maphash{$2})
+            {
+                my $repl = $1 . "'" . $maphash{$2} . "'";
+                $line =~ s/\barray_type_oid\s*=>\s*'\d+'/$repl/;
+                $changed = 1;
+            }
+        }
+
+        print $ofd $line;
+    }
+
+    close $ifd;
+    close $ofd;
+
+    # Avoid updating files if we didn't change them.
+    if ($changed || $output_path ne '')
+    {
+        rename $tmp_output_file, $output_file
+          or die "can't rename $tmp_output_file to $output_file: $!";
+    }
+    else
+    {
+        unlink $tmp_output_file
+          or die "can't unlink $tmp_output_file: $!";
+    }
+}
+
+sub usage
+{
+    my $last = $FirstGenbkiObjectId - 1;
+    die <<EOM;
+Usage: renumber_oids.pl [--output PATH] --first-mapped-oid X [--last-mapped-oid Y] --target-oid Z
+
+Options:
+    --output PATH           output directory (default '.')
+    --first-mapped-oid X    first OID to be moved
+    --last-mapped-oid Y     last OID to be moved (default $last)
+    --target-oid Z          first OID to move to
+
+Catalog *.h and *.dat files are updated and written to the
+output directory; by default, this overwrites the input files.
+
+Caution: the output PATH will be interpreted relative to
+src/include/catalog, even if you start the script
+in some other directory.
+
+EOM
+}
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index 3796971..5ee628c 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -3,6 +3,9 @@
  * toasting.h
  *      This file provides some definitions to support creation of toast tables
  *
+ * Caution: all #define's with numeric values in this file had better be
+ * object OIDs, else renumber_oids.pl might change them inappropriately.
+ *
  *
  * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
diff --git a/src/tools/RELEASE_CHANGES b/src/tools/RELEASE_CHANGES
index edfd56e..920c6e9 100644
--- a/src/tools/RELEASE_CHANGES
+++ b/src/tools/RELEASE_CHANGES
@@ -58,11 +58,23 @@ in both HEAD and the branch.
     o doc/src/sgml/ref manual pages

 * Ports
-    o update config.guess and config.sub at the start of beta
-      (from http://savannah.gnu.org/projects/config)
     o update ports list in doc/src/sgml/installation.sgml
     o update platform-specific FAQ's, if needed

+
+Pre-Beta Tasks
+==============
+
+These things should be done at least once per development cycle.
+Typically we do them between feature freeze and start of beta test,
+but there may be reasons to do them at other times as well.
+
+* Renumber any manually-assigned OIDs between 8000 and 9999
+  to lower numbers, using renumber_oids.pl (see notes in bki.sgml)
+
+* Update config.guess and config.sub
+  (from http://savannah.gnu.org/projects/config)
+
 * Update inet/cidr data types with newest Bind patches
pgsql-hackers by date:
From: Andres Freund
Date: 11 March 2019, 21:29:34
Subject: Re: Pluggable Storage - Andres's take
From: Tom Lane
Date: 11 March 2019, 22:07:01
Subject: Re: proposal: variadic argument support for least, greatest function
Re: Why don't we have a small reserved OID range for patch revisions? - Mailing list pgsql-hackers

Previous

Next