Re: Documentation for bootstrap data conversion - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Documentation for bootstrap data conversion
Msg-id 1289.1523046598@sss.pgh.pa.us
In response to Re: Documentation for bootstrap data conversion  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> Quick skim only:

> "developers" here could possibly be understood to be any sort of
> developer, rather than postgres ones.  Perhaps just say "But the
> structure of the catalogs can change between major versions."?

OK.

> This sounds like an exhaustive list of what genbki.h allows - that seems
> likely to get out of date...

I'll put in "for example".

>> +# LC_COLLATE and LC_CTYPE will be replaced at initdb time with user choices
>> +# that might contain non-word characters, so we must double-quote them.

> Hm. Couldn't we get rid of that requirement and do the escaping in
> genbki? Seems awkward, failure prone (people will forget and it'll often
> work during development) and unnecessary in the new format.

Well, only if you want genbki.pl to embed knowledge of what initdb will
substitute for, which seems a bit outside its charter.
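
To make the quoting concern concrete, here is a minimal Python sketch. It is not the genbki.pl or initdb code. It assumes that initdb does a plain textual replacement of the placeholder and that the row is later split into whitespace-separated tokens, with double quotes grouping a token. The helper names and the sample row are hypothetical, and shlex merely stands in for the real bootstrap parser.

```python
import shlex


def substitute(line, placeholder, value):
    """Plain textual replacement, standing in for initdb's substitution (assumption)."""
    return line.replace(placeholder, value)


def tokenize(line):
    """Split a row into tokens, honoring double quotes (stand-in for the bootstrap parser)."""
    return shlex.split(line)


# A locale name containing spaces, as a user might choose at initdb time.
locale = "English_United States.1252"

# Hypothetical three-column row, with and without quoting around the placeholder.
unquoted = substitute("template1 LC_COLLATE 10", "LC_COLLATE", locale)
quoted = substitute('template1 "LC_COLLATE" 10', "LC_COLLATE", locale)

unquoted_tokens = tokenize(unquoted)  # the locale is split into two tokens
quoted_tokens = tokenize(quoted)      # the locale stays one token
```

With a word-only locale such as C both forms tokenize identically, which is why a missing quote can go unnoticed during development.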

>> +    <listitem>
>> +     <para>
>> +      Null values are represented by <literal>_null_</literal>.
>> +     </para>
>> +    </listitem>

> wonder if it'd be more natural to use $null or such for this kind of thing.

Considering we're eval'ing these structs, I think that's just an
invitation to trouble :-)

>> +       <listitem>
>> +        <para>
>> +         If the catalog's <literal>.h</literal> file specifies a default
>> +         value for a column, and a data entry has that same
>> +         value, <filename>reformat_dat_file.pl</filename> will omit it from
>> +         the data file.  This keeps the data representation compact.
>> +        </para>
>> +       </listitem>

> This'll be fun if we ever decide to change a default :)

Well, see recipe for how to do that, a bit further down.
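
As a rough illustration of the default-omission rule quoted above, here is a Python sketch. It is not reformat_dat_file.pl itself. It assumes a row is a mapping from column name to string value and that defaults are a similar mapping. The column names and values are invented for the example.

```python
def strip_defaults(row, defaults):
    """Drop every column whose value equals the declared default."""
    return {col: val for col, val in row.items()
            if col not in defaults or defaults[col] != val}


def expand_defaults(row, defaults):
    """Fill omitted columns back in from the declared defaults."""
    return {**defaults, **row}


# Illustrative defaults and data entry (not taken from any real catalog).
defaults = {"cost": "1", "volatility": "i"}
full_row = {"name": "my_func", "cost": "1", "volatility": "v"}

compact = strip_defaults(full_row, defaults)    # "cost" is omitted
restored = expand_defaults(compact, defaults)   # round-trips to full_row

# If the declared default later changes, rows that omitted the column
# pick up the new value when expanded.
changed = expand_defaults(compact, {**defaults, "cost": "100"})
```

The last line shows why changing a default is not free: every entry that relied on the old default now expands differently unless the data file is adjusted as well.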

>> +    Pre-loaded catalog rows must have preassigned OIDs if there are OID
>> +    references to them in other pre-loaded rows.

> Why is that?

Uh, because otherwise we don't know what to put in?

Perhaps genbki could be made to do OID assignment, but that's not in its
charter right now, and I'm not sure it'd be an improvement.  There would
still be OIDs that have to be assigned during the bootstrap run.
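
A small Python sketch of the "we don't know what to put in" problem follows. It is only an illustration, not how genbki.pl works. It assumes rows are dictionaries, that a referencing column names its target symbolically, and that the output must contain a numeric OID. All names and OID values are made up.

```python
def resolve_references(rows, ref_column, targets):
    """Replace symbolic names in ref_column with the targets' preassigned OIDs."""
    # Only targets that already carry an OID can be referenced.
    oid_by_name = {t["name"]: t["oid"] for t in targets if "oid" in t}
    resolved = []
    for row in rows:
        name = row[ref_column]
        if name not in oid_by_name:
            # Nothing sensible can be written into the referencing row.
            raise LookupError(f"no preassigned OID for {name!r}")
        resolved.append({**row, ref_column: oid_by_name[name]})
    return resolved


# Hypothetical pre-loaded rows: func_a has a preassigned OID, func_b does not.
functions = [{"name": "func_a", "oid": 9001}, {"name": "func_b"}]
operators = [{"name": "op_a", "impl": "func_a"}]

resolved = resolve_references(operators, "impl", functions)
```

A row that pointed at func_b would raise LookupError here, since its OID would only become known during the bootstrap run.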

>> +    In practice we usually preassign OIDs for all or none of the pre-loaded
>> +    rows in a given catalog, even if only some of them are actually
>> +    cross-referenced.

> I think we'd reduce the pain of maintaining uncommitted patches across a
> few CFs if this were a more relaxed rule.

Meh ... I'm just documenting the existing state of affairs here.  In any
case, I think once a function is committed it's a good thing if it has
a stable OID.  There are probably people depending on that externally
for fastpath calls.  (In fact, I wonder why we don't simplify libpq to
assume it knows the OIDs of loread et al, rather than painfully looking
them up.  If they haven't changed since 1996, they aren't going to.)

> Might be worthwhile to note (or fix) that one has to be inside the
> src/include/catalog/ directory to run it?

Don't much care, but if you do, fix away.

            regards, tom lane

