Re: WIP: a way forward on bootstrap data - Mailing list pgsql-hackers

From John Naylor
Subject Re: WIP: a way forward on bootstrap data
Msg-id CAJVSVGXKsiwMVbtx-nGqPeFzsCEWmFs5wFmepEawdzAyWhLO-Q@mail.gmail.com
In response to Re: WIP: a way forward on bootstrap data  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
On 1/13/18, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

> I'm afraid a key value system would invite writing the attributes in
> random order and create a mess over time.

A developer can certainly write them in random order, and it will
still work. However, in patch 0002 I have a script to enforce a
standard appearance. Of course, for it to work, you have to run it. I
describe it, if rather tersely, in the README changes in patch 0008.
Since several people have raised this concern, I will go into a bit
more depth here. Perhaps I should reuse some of this language for the
README to improve it.

src/include/catalog/rewrite_dat.pl knows where to find the schema of
each catalog, namely the pg_*.h header, accessed via ParseHeader() in
Catalog.pm. It writes key/value pairs in the order found in the
schema:

{ key_1 => 'value_1', key_2 => 'value_2', ..., key_n => 'value_n' }

The script also has an array of four hard-coded metadata fields: oid,
oid_symbol, descr, and shdescr. If any of these are present, they will
go on their own line first, in the order given:

{ oid => 9999, oid_symbol => 'FOO_OID', descr => 'comment on foo',
  key_1 => 'value_1', key_2 => 'value_2', ..., key_n => 'value_n' }
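
To make the ordering rule concrete, here is a rough sketch in Python. It is
not the Perl code in rewrite_dat.pl; the function name format_entry and the
sample schema are made up for illustration, and it quotes every value for
simplicity:

```python
# Illustrative sketch only; the real logic lives in the Perl script
# rewrite_dat.pl. All names below are hypothetical.

# The four hard-coded metadata fields, in the order they are written.
METADATA_FIELDS = ["oid", "oid_symbol", "descr", "shdescr"]


def format_entry(entry, schema_columns):
    """Render one entry: metadata on the first line, then schema-ordered keys."""

    def pairs(keys):
        # Emit only the keys that are present, in the order given by `keys`.
        return ", ".join(f"{k} => '{entry[k]}'" for k in keys if k in entry)

    meta = pairs(METADATA_FIELDS)
    data = pairs(schema_columns)
    if meta and data:
        return "{ " + meta + ",\n  " + data + " }"
    return "{ " + (meta or data) + " }"


# A developer wrote the keys in arbitrary order...
entry = {"key_2": "value_2", "descr": "comment on foo",
         "key_1": "value_1", "oid": "9999"}

# ...and the output follows the metadata order, then the schema order.
print(format_entry(entry, ["key_1", "key_2"]))
```

The point is only that the output order comes from the schema and the
metadata list, never from the order in which the developer typed the keys.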

> I don't think I like this.  I know pg_proc.h is a pain to manage, but at
> least right now it's approachable programmatically.  I recently proposed
> a patch to replace the columns proisagg and proiswindow with a combined
> column prokind.  I could easily write a small Perl script to make that
> change in pg_proc.h, because the format is easy to parse and has one
> line per entry.  With this new format, that approach would no longer
> work, and I don't know what would replace it.

I've attached four diffs/patches to walk through how you would replace
the columns proisagg and proiswindow with a combined column prokind.

Patch 01: Add new prokind column to pg_proc.h, with a default of 'n'.
In many cases, this is all you would have to do, as far as
bootstrapping is concerned.

Diff 02: This is a one-off script diffed against rewrite_dat.pl. In
rewrite_dat.pl, I have a section with this comment, and this is where
I put the one-off code:

# Note: This is also a convenient place to do one-off
# bulk-editing.

(I haven't documented this with explicit examples, so I'll have to remedy that.)

You would run it like this:

cd src/include/catalog
perl -I ../../backend/catalog/ rewrite_dat_with_prokind.pl pg_proc.dat

While reading pg_proc.dat, the default value for prokind is added
automatically. We inspect proisagg and proiswindow, and change prokind
accordingly. pg_proc.dat now has all three columns: prokind, proisagg,
and proiswindow.
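
The per-entry logic of such a one-off edit might look roughly like the
following Python sketch. This is not the attached Perl diff. The function
name add_prokind is hypothetical, and the codes 'a' and 'w' as well as the
't' spelling for true are assumptions for illustration; only the default of
'n' comes from the description above:

```python
# Illustrative sketch only, not the attached Perl one-off script.
# ASSUMPTIONS: booleans are stored as 't'/'f', and aggregates and window
# functions map to 'a' and 'w'. Only the default 'n' is from the text.

def add_prokind(entry, default="n"):
    """Return a copy of the entry with prokind derived from the old flags."""
    updated = dict(entry)
    # The default is filled in first, as happens when the data is read.
    updated.setdefault("prokind", default)
    if updated.get("proisagg") == "t":
        updated["prokind"] = "a"
    elif updated.get("proiswindow") == "t":
        updated["prokind"] = "w"
    return updated


entries = [
    {"proname": "foo"},
    {"proname": "bar", "proisagg": "t"},
    {"proname": "baz", "proiswindow": "t"},
]
converted = [add_prokind(e) for e in entries]
```

After this step each entry carries the new column alongside whichever old
flags it already had, which matches the intermediate state described above.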

Patch 03: Remove old columns from pg_proc.h.

Now we run the standard rewrite:

perl -I ../../backend/catalog/ rewrite_dat.pl pg_proc.dat

Any values not found in the schema will simply not be written to
pg_proc.dat, so the old columns are now gone.
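
In other words, the rewrite acts as a filter against the current schema. A
minimal Python sketch of that idea follows. It is not the actual Perl code;
the function name prune_to_schema and the sample column list are
hypothetical:

```python
# Illustrative sketch only; rewrite_dat.pl does this in Perl.
# The names below are hypothetical.

METADATA_FIELDS = ("oid", "oid_symbol", "descr", "shdescr")


def prune_to_schema(entry, schema_columns):
    """Keep only metadata fields and keys that still exist in the schema."""
    allowed = set(schema_columns) | set(METADATA_FIELDS)
    return {k: v for k, v in entry.items() if k in allowed}


# The schema after the old columns have been removed from the header.
new_schema = ["proname", "prokind"]

entry = {"oid": "9999", "proname": "foo", "prokind": "a", "proisagg": "t"}
pruned = prune_to_schema(entry, new_schema)
```

Here the stale proisagg key is dropped simply because it no longer appears
in the schema; no explicit deletion step is needed.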

The result is found in patch 04.
--

Note: You could theoretically also load the source data into tables,
do the updates with SQL, and dump back out again. I made some progress
with this method, but it's not complete. I think the load and dump
steps add too much complexity for most use cases, but it's a
possibility.


-John Naylor
