Re: [HACKERS] Cutting initdb's runtime (Perl question embedded) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)
Msg-id 9244.1492106743@sss.pgh.pa.us
In response to Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2017-04-13 12:56:14 -0400, Tom Lane wrote:
>> Andres Freund <andres@anarazel.de> writes:
>>> Cool.  I wonder if we also should remove AtEOXact_CatCache()'s
>>> cross-checks - the resowner replacement has been in place for a while,
>>> and seems robust enough.  They're now the biggest user of time.

>> Hm, biggest user of time in what workload?  I've not noticed that
>> function particularly.

> Just initdb.  I presume it's because the catcaches will frequently be
> relatively big there.

Hm.  That ties into something I was looking at yesterday.  The only
reason that function is called so much is that bootstrap mode runs a
separate transaction for *each line of the bki file* (cf do_start,
do_end in bootparse.y).  Which seems pretty silly.  I experimented
with collapsing all the transactions for consecutive DATA lines into
one transaction, but couldn't immediately make it work due to memory
management issues.  I didn't try collapsing the entire run into a
single transaction, but maybe that would actually be easier, though
no doubt more wasteful of memory.
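
As a rough illustration of why the per-line transactions matter, here is a toy C model. It is not PostgreSQL code. Every name in it (ToyStats, toy_end_xact, toy_run_per_line, toy_run_batched) is hypothetical, and the assumption that the cache gains exactly one entry per input line is made up purely to keep the arithmetic simple. It counts how many cache entries an end-of-transaction check would visit when each line gets its own transaction, versus when consecutive lines share one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, not PostgreSQL code. */
typedef struct ToyStats
{
    long xact_count;        /* transactions ended */
    long entries_scanned;   /* cache entries visited by end-of-xact checks */
} ToyStats;

/* Stand-in for an end-of-transaction check that walks the whole cache. */
static void
toy_end_xact(ToyStats *stats, long cache_entries)
{
    stats->xact_count++;
    stats->entries_scanned += cache_entries;
}

/* One transaction per input line (assumed: one new cache entry per line). */
ToyStats
toy_run_per_line(long nlines)
{
    ToyStats stats = {0, 0};
    long     cache_entries = 0;

    for (long i = 0; i < nlines; i++)
    {
        cache_entries++;
        toy_end_xact(&stats, cache_entries);
    }
    return stats;
}

/* One transaction per group of `batch` consecutive lines. */
ToyStats
toy_run_batched(long nlines, long batch)
{
    ToyStats stats = {0, 0};
    long     cache_entries = 0;
    long     pending = 0;

    assert(batch > 0);
    for (long i = 0; i < nlines; i++)
    {
        cache_entries++;
        pending++;
        if (pending == batch)
        {
            toy_end_xact(&stats, cache_entries);
            pending = 0;
        }
    }
    if (pending > 0)            /* close a final, partially filled batch */
        toy_end_xact(&stats, cache_entries);
    return stats;
}
```

Under these assumptions the per-line variant scans a number of entries that grows quadratically with the line count, while a single large batch scans the cache once. The model says nothing about the memory cost of holding one long transaction open, which is the trade-off raised above.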

>> I agree that it doesn't seem like we need to spend a lot of time
>> cross-checking there, though.  Maybe keep the code but #ifdef it
>> under some nondefault debugging symbol.

> Hm, if we want to keep it, maybe tie it to CLOBBER_CACHE_ALWAYS or such,
> so it gets compiled at least sometimes? Not a great fit, but ...

Don't like that, because CCA is by definition not the normal cache
behavior.  It would make a bit of sense to tie it to CACHEDEBUG,
but as you say, it'd never get tested normally if we do that.
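
For concreteness, a minimal sketch of what "keep the code but #ifdef it" could look like. This is a standalone toy, not the actual AtEOXact_CatCache() code: TOY_CACHE_CROSSCHECK, ToyEntry, and toy_at_eoxact are hypothetical names invented for the example. The function returns how many entries it checked, so a caller can see whether the cross-check was compiled in.

```c
#include <assert.h>
#include <stddef.h>

/*
 * TOY_CACHE_CROSSCHECK is a hypothetical, nondefault debugging symbol.
 * Build with -DTOY_CACHE_CROSSCHECK to compile the cross-check in.
 */
typedef struct ToyEntry
{
    int refcount;           /* should be zero at end of transaction */
} ToyEntry;

/* Returns the number of entries cross-checked (0 when compiled out). */
size_t
toy_at_eoxact(const ToyEntry *entries, size_t n)
{
#ifdef TOY_CACHE_CROSSCHECK
    for (size_t i = 0; i < n; i++)
        assert(entries[i].refcount == 0);   /* no leaked references */
    return n;
#else
    (void) entries;         /* silence unused-parameter warnings */
    (void) n;
    return 0;
#endif
}
```

In a default build the loop vanishes entirely, which removes the runtime cost but also means the check only runs when someone builds with the symbol defined. That is the testing gap noted above.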

On the whole, though, we may be looking at diminishing returns here.
I just did some "perf" measurement of the overall "initdb" cycle,
and what I'm seeing suggests that bootstrap mode as such is now a
pretty small fraction of the overall cycle:

+   51.07%     0.01%            28  postgres         postgres                                      [.] PostgresMain
...
+   13.52%     0.00%             0  postgres         postgres                                      [.] AuxiliaryProcessMain
That says that the post-bootstrap steps are now the bulk of the time,
which agrees with naked-eye observation.
        regards, tom lane


