Re: [HACKERS] Cutting initdb's runtime (Perl question embedded) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)
Date
Msg-id 6312.1492100010@sss.pgh.pa.us
In response to Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)  (Andreas Karlsson <andreas@proxel.se>)
Responses Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)  (Andres Freund <andres@anarazel.de>)
Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)  (Andreas Karlsson <andreas@proxel.se>)
Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)  (ilmari@ilmari.org (Dagfinn Ilmari Mannsåker))
List pgsql-hackers
Andreas Karlsson <andreas@proxel.se> writes:
> Here is my proof of concept patch. It does basically the same thing as 
> Andres's patch except that it handles quoted values a bit better and 
> does not try to support anything other than the regproc type.

> The patch speeds up initdb without fsync from 0.80 seconds to 0.55 
> seconds, which is a nice speedup, while adding a negligible amount of 
> extra work on compilation.

I've pushed this with some mostly-cosmetic adjustments:

* created a single subroutine that understands how to split DATA lines,
rather than having several copies of the regex

* rearranged the code so that the data structure returned by
Catalog::Catalogs() isn't scribbled on (which was already
happening before your patch, but it seemed pretty ugly to me)

* stripped out the bootstrap-time name lookup code from all of the
reg* types, not just regproc.
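
To illustrate the first item, here is a minimal sketch of a single routine that splits a DATA line, so the regex lives in one place. It is written in Python for illustration only and is not the Perl code that was committed. The function name split_data_line, the two regexes, and the exact DATA line shape used in the example are assumptions, not taken from the patch.

```python
import re

# Assumed line shape: DATA(insert [OID = n] ( field field "quoted field" ... ));
DATA_RE = re.compile(
    r'^DATA\(\s*insert(?:\s+OID\s*=\s*(\d+))?\s+\(\s*(.*?)\s*\)\s*\);?\s*$'
)

# A field is either a double-quoted string (backslash escapes are kept
# as written, not decoded) or a run of non-whitespace characters.
FIELD_RE = re.compile(r'"((?:[^"\\]|\\.)*)"|(\S+)')


def split_data_line(line):
    """Return (oid, fields) for a DATA line, or None if it is not one.

    oid is an int, or None when the line carries no OID clause.
    fields is a list of strings with surrounding double quotes removed.
    """
    m = DATA_RE.match(line)
    if m is None:
        return None
    oid = int(m.group(1)) if m.group(1) else None
    fields = []
    for fm in FIELD_RE.finditer(m.group(2)):
        # group(2) is set for bare tokens, group(1) for quoted ones
        fields.append(fm.group(2) if fm.group(2) is not None else fm.group(1))
    return oid, fields


if __name__ == "__main__":
    sample = 'DATA(insert OID = 1242 (  boolin PGNSP "2275" "a b" _null_ ));'
    print(split_data_line(sample))
```

Every caller that needs the OID or the column values would go through this one function, so a change to the quoting rules only has to be made once.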

There's certainly lots more that could be done in the genbki code,
but I think all we can justify at this stage of the development
cycle is to get the low-hanging fruit for testing speedups.
        regards, tom lane


