Re: Bootstrap DATA is a pita - Mailing list pgsql-hackers

From Caleb Welton
Subject Re: Bootstrap DATA is a pita
Date
Msg-id 86F3B052-8527-4E06-A392-412D72305858@pivotal.io
In response to Re: Bootstrap DATA is a pita  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Bootstrap DATA is a pita  (Caleb Welton <cwelton@pivotal.io>)
List pgsql-hackers
Makes sense.

During my own prototyping, what I did was generate the SQL statements by querying the existing catalog with SQL.  Way easier
than hand-writing 1000+ function definitions, and not difficult to modify for future changes, as affirmed by how easy it was
to adapt my existing SQL to account for some of the newer features in master.
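For reference, a minimal sketch of that approach (my actual query was more involved; this ignores attributes such as volatility, strictness, and cost that a complete generator would also have to emit):

    SELECT format(
             'CREATE FUNCTION %I(%s) RETURNS %s LANGUAGE %s AS %L;',
             p.proname,
             pg_get_function_arguments(p.oid),
             pg_get_function_result(p.oid),
             l.lanname,
             p.prosrc)
    FROM pg_proc p
    JOIN pg_language l ON l.oid = p.prolang
    WHERE p.oid < 10000          -- restrict to builtin functions
    ORDER BY p.oid;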

The biggest challenge was establishing a sort order that ensures both a unique ordering and that the dependencies
needed for SQL-language functions have been processed before trying to define them.  That affects about 4 of the
1000+ functions relative to a natural OID ordering.
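As an illustration (not the exact ordering I used), one simple way to get most of the way there is to emit non-SQL-language functions first and push the SQL-language ones to the end, keeping OID order within each group:

    SELECT p.oid, p.proname
    FROM pg_proc p
    JOIN pg_language l ON l.oid = p.prolang
    WHERE p.oid < 10000
    ORDER BY (l.lanname = 'sql'),   -- false sorts first: internal/C functions
             p.oid;                 -- then a stable, unique OID order

That by itself doesn't handle SQL functions that call other SQL functions, so the few affected entries still need to be ordered by hand or by a real dependency walk.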

> On Dec 11, 2015, at 11:43 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> Caleb Welton wrote:
>> I'm happy working these ideas forward if there is interest.
>>
>> Basic design proposal is:
>>  - keep a minimal amount of bootstrap to avoid intrusive changes to core
>> components
>>  - Add capabilities of creating objects with specific OIDs via DDL during
>> initdb
>>  - Update the caching/resolution mechanism for builtin functions to be
>> more dynamic.
>>  - Move as much of bootstrap as possible into SQL files and create catalog
>> via DDL
>
> I think the point we got stuck last time at was deciding on a good
> format for the data coming from the DATA lines.  One of the objections
> raised for formats such as JSON is that it's trivial for "git merge" (or
> similar tools) to make a mistake because object-end/object-start lines
> are all identical.  And as for the SQL-format version, the objection was
> that it's hard to modify the lines en-masse when modifying the catalog
> definition (new column, etc).  Ideally we would like a format that can
> be bulk-edited without too much trouble.
>
> A SQL file would presumably not have the merge issue, but mass-editing
> would be a pain.
>
> Crazy idea: we could just have a CSV file which can be loaded into a
> table for mass changes using regular DDL commands, then dumped back from
> there into the file.  We already know how to do these things, using
> \copy etc.  Since CSV uses one line per entry, there would be no merge
> problems either (or rather: all merge problems would become conflicts,
> which is what we want.)
>
> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
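To make that last idea concrete, the round trip being described could look roughly like this (the scratch table and file names here are made up for illustration):

    -- load the data file into a scratch table shaped like the catalog
    CREATE TEMP TABLE proc_data (LIKE pg_proc);
    \copy proc_data FROM 'pg_proc_data.csv' WITH (FORMAT csv)

    -- bulk-edit with ordinary SQL, e.g. adjust a flag for a group of rows
    UPDATE proc_data SET proleakproof = false WHERE proname LIKE 'pg_%';

    -- write it back out, one row per line, ready to commit
    \copy proc_data TO 'pg_proc_data.csv' WITH (FORMAT csv)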


