Re: Bootstrap DATA is a pita - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Bootstrap DATA is a pita
Msg-id 20150221164309.GA2037@awork2.anarazel.de
In response to Re: Bootstrap DATA is a pita  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Bootstrap DATA is a pita  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On 2015-02-21 11:34:09 -0500, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2015-02-20 22:19:54 -0500, Peter Eisentraut wrote:
> >> On 2/20/15 8:46 PM, Josh Berkus wrote:
> >>> Or what about just doing CSV?
> 
> >> I don't think that would actually address the problems.  It would just
> >> be the same format as now with different delimiters.
> 
> > Yea, we need hierarchies and named keys.
> 
> Yeah.  One thought though is that I don't think we need the "data" layer
> in your proposal; that is, I'd flatten the representation to something
> more like
> 
>      {
>          oid => 2249,
>          oiddefine => 'CSTRINGOID',
>          typname => 'cstring',
>          typlen => -2,
>          typbyval => 1,
>          ...
>      }

I don't really like that - then stuff like oid, description, comment (?)
have to not conflict with any catalog columns. I think it's easier to
have them separate.
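
To illustrate the conflict concern, here is a minimal sketch in Python (chosen only for brevity). The set of tool-specific keys and the helper name are assumptions made for this example, not part of any existing genbki code. It reports which metadata keys would collide with column names if everything lived in one flat hash.

```python
# Hypothetical keys that belong to the generator, not to any catalog.
METADATA_KEYS = {"oid", "oiddefine", "description"}


def find_conflicts(catalog_columns):
    """Return metadata keys that collide with catalog column names.

    In a flat layout every key shares one namespace, so any overlap
    makes an entry ambiguous.
    """
    return sorted(METADATA_KEYS & set(catalog_columns))


# pg_description has a column that is itself named "description".
pg_description_columns = ["objoid", "classoid", "objsubid", "description"]
conflicts = find_conflicts(pg_description_columns)
```

With a separate "data" layer the overlap is harmless, because column names and tool keys never share a namespace.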

> This will be easier to edit, either manually or programmatically I think.
> The code that turns it into a .bki file will need to know the exact set
> of columns in each system catalog, but it would have had to know that
> anyway I believe, if you're expecting it to insert default values.

There'll need to be some awareness of columns, sure. But I think
programmatically editing the values will still be simpler if you don't
need to discern whether a key is a column or some genbki specific value.
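
A small sketch of what that buys, again in Python and with made-up entries: assuming each entry keeps its columns under a "data" key, a bulk edit such as renaming a column only has to look inside that layer. The function name and the sample values are hypothetical.

```python
def rename_column(entries, old, new):
    """Rename a catalog column in every entry, in place.

    Only the nested "data" layer is touched, so a tool-level key that
    happens to share the column's name is left alone.
    """
    for entry in entries:
        data = entry["data"]
        if old in data:
            data[new] = data.pop(old)
    return entries


# Hypothetical entry: "description" appears both as a tool-level key
# and as a column name.
entries = [
    {
        "oid": 1,
        "description": "tool-level comment",
        "data": {"description": "column value"},
    },
]
rename_column(entries, "description", "descr")
```

In a flat layout the same edit would first need the catalog's column list to decide which "description" is meant.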

> Ideally the column defaults could come from BKI_ macros in the catalog/*.h
> files; it would be good if we could keep those files as the One Source of
> Truth for catalog schema info, even as we split out the initial data.

Hm, yea.
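
As a rough sketch of default filling (Python, for illustration only): the defaults table below stands in for whatever would be extracted from the catalog/*.h files, and its contents and the function name are assumptions. The generator needs the ordered column list anyway, so it can fill gaps and reject unknown keys in the same pass.

```python
# Hypothetical defaults, as if extracted from annotations in catalog/*.h.
PG_TYPE_DEFAULTS = {"typlen": -1, "typbyval": 0}


def apply_defaults(data, column_order, defaults):
    """Return a complete row in column order, filling in missing values."""
    unknown = set(data) - set(column_order)
    if unknown:
        raise KeyError(f"unknown columns: {sorted(unknown)}")
    row = {}
    for col in column_order:
        if col in data:
            row[col] = data[col]
        elif col in defaults:
            row[col] = defaults[col]
        else:
            raise KeyError(f"no value or default for column {col!r}")
    return row


# Only a subset of pg_type's columns, to keep the example short.
columns = ["typname", "typlen", "typbyval"]
row = apply_defaults({"typname": "cstring", "typlen": -2}, columns, PG_TYPE_DEFAULTS)
```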

One thing I was considering was to do the regtype and regproc lookups
directly in the tool. That'd have two advantages: 1) it'd make it
possible to refer to typenames in pg_proc, 2) it'd be much faster. Right
now most of initdb's time is spent doing syscache lookups during
bootstrap, because it can't use indexes... A simple hash lookup during
bki generation could lead to quite measurable savings during initdb.

We could then even rip the bootstrap code out of regtypein/regprocin...
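
A minimal sketch of the lookup idea, in Python rather than the generator's own language: build a name-to-OID hash once from the pg_type entries, then replace type names in pg_proc entries with OIDs while generating the bki output. The entry layout, the function names, and the sample pg_proc row (including its OID) are assumptions for illustration.

```python
def build_oid_index(rows, name_key):
    """Map each row's name to its OID; refuse ambiguous names."""
    index = {}
    for row in rows:
        name = row["data"][name_key]
        if name in index:
            raise ValueError(f"ambiguous name: {name}")
        index[name] = row["oid"]
    return index


def resolve_types(proc_rows, type_index):
    """Return copies of pg_proc rows with prorettype turned into an OID."""
    resolved = []
    for row in proc_rows:
        data = dict(row["data"])
        data["prorettype"] = type_index[data["prorettype"]]
        resolved.append({**row, "data": data})
    return resolved


pg_type_rows = [
    {"oid": 16, "data": {"typname": "bool"}},
    {"oid": 23, "data": {"typname": "int4"}},
]
# Made-up function entry; the OID 9001 is a placeholder.
pg_proc_rows = [
    {"oid": 9001, "data": {"proname": "example_fn", "prorettype": "int4"}},
]

type_index = build_oid_index(pg_type_rows, "typname")
resolved_rows = resolve_types(pg_proc_rows, type_index)
```

Each reference then costs one hash probe at generation time, and the bootstrap backend only ever sees numeric OIDs.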

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


