Re: initdb / bootstrap design - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: initdb / bootstrap design
Date	February 21, 2022 00:44:39
Msg-id	20220220214439.bhc35hhbaub6dush@alap3.anarazel.de
In response to	Re: initdb / bootstrap design (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

Hi,

On 2022-02-19 20:46:26 -0500, Tom Lane wrote:
> I tried it like that (full patch attached) and the results are intensely
> disappointing.  On my Mac laptop, the time needed for 50 iterations of
> initdb drops from 16.8 sec to 16.75 sec.

Hm. I'd hoped for at least a little bit bigger win. But I think it enables
more, see below:


> Not sure that this is worth pursuing any further.

I experimented with moving all the bootstrapping into --boot mode and got it
working, albeit definitely with a few hacks (more below).

While I had hoped for a bit more of a win, it's IMO a nice improvement.
Timings for executing initdb -N --wal-segsize 1 ten times in a loop:

HEAD:

  assert:
  8.06user 1.17system 0:09.25elapsed 99%CPU (0avgtext+0avgdata 91724maxresident)k
  0inputs+549280outputs (40major+99824minor)pagefaults 0swaps

  opt:
  2.89user 0.99system 0:04.81elapsed 80%CPU (0avgtext+0avgdata 88864maxresident)k
  0inputs+549280outputs (40major+99792minor)pagefaults 0swaps


default to lz4:

  assert:
  7.61user 1.03system 0:08.69elapsed 99%CPU (0avgtext+0avgdata 91508maxresident)k
  0inputs+546400outputs (42major+99551minor)pagefaults 0swaps

  opt:
  2.55user 0.94system 0:03.49elapsed 99%CPU (0avgtext+0avgdata 88816maxresident)k
  0inputs+546400outputs (40major+99551minor)pagefaults 0swaps


bootstrap replace:

  assert:
  7.42user 1.00system 0:08.52elapsed 98%CPU (0avgtext+0avgdata 91656maxresident)k
  0inputs+546400outputs (40major+97737minor)pagefaults 0swaps

  opt:
  2.49user 0.98system 0:03.49elapsed 99%CPU (0avgtext+0avgdata 88700maxresident)k
  0inputs+546400outputs (40major+97728minor)pagefaults 0swaps


everything in bootstrap:

  assert:
  6.31user 0.94system 0:07.35elapsed 98%CPU (0avgtext+0avgdata 97812maxresident)k
  0inputs+547360outputs (30major+88617minor)pagefaults 0swaps

  opt:
  2.42user 0.85system 0:03.28elapsed 99%CPU (0avgtext+0avgdata 94572maxresident)k
  0inputs+547360outputs (30major+83712minor)pagefaults 0swaps


optimize WAL in bootstrap:

  assert:
  6.26user 0.96system 0:07.29elapsed 99%CPU (0avgtext+0avgdata 97844maxresident)k
  0inputs+547360outputs (30major+88586minor)pagefaults 0swaps

  opt:
  2.43user 0.80system 0:03.24elapsed 99%CPU (0avgtext+0avgdata 94436maxresident)k
  0inputs+547360outputs (30major+83664minor)pagefaults 0swaps


remove isatty in bootstrap:

  assert:
  6.15user 0.83system 0:06.99elapsed 99%CPU (0avgtext+0avgdata 97832maxresident)k
  0inputs+465120outputs (30major+88559minor)pagefaults 0swaps

  opt:
  2.28user 0.85system 0:03.14elapsed 99%CPU (0avgtext+0avgdata 94604maxresident)k
  0inputs+465120outputs (30major+83728minor)pagefaults 0swaps


That's IMO not bad.
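
As a rough sketch, the relative change in elapsed time can be worked out from
the numbers above. The figures are copied from the timings in this mail; the
dictionary layout, the step label "isatty change" and the helper name
reduction_pct are made up for illustration. Note that the HEAD opt run only
reached 80% CPU, so its elapsed time likely includes some waiting and the opt
percentages should be read with that in mind.

```python
# Elapsed wall-clock seconds for 10 initdb runs, copied from the timings above.
# Keys and structure are illustrative, not part of any patch.
elapsed = {
    "HEAD": {"assert": 9.25, "opt": 4.81},
    "default to lz4": {"assert": 8.69, "opt": 3.49},
    "bootstrap replace": {"assert": 8.52, "opt": 3.49},
    "everything in bootstrap": {"assert": 7.35, "opt": 3.28},
    "optimize WAL in bootstrap": {"assert": 7.29, "opt": 3.24},
    "isatty change": {"assert": 6.99, "opt": 3.14},
}


def reduction_pct(before, after):
    """Return how much smaller `after` is than `before`, in percent."""
    return 100.0 * (before - after) / before


for build in ("assert", "opt"):
    base = elapsed["HEAD"][build]
    for step, times in elapsed.items():
        pct = reduction_pct(base, times[build])
        print(f"{build:6} {step:26} {times[build]:5.2f}s {pct:5.1f}%")
```

For the last step this works out to roughly a quarter less elapsed time for
the assert build and roughly a third less for the opt build, relative to HEAD.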

On Windows I see higher gains, which makes sense, because filesystem IO is
slower. FreeBSD as well, but the variance is oddly high, so I might be doing
something wrong.


The main reason I like this, however, isn't the speedup itself, but that after
this, initdb doesn't depend on single user mode at all anymore.


About the prototype:

- Most of the bootstrap SQL is executed from bootstrap.c itself. But some
  still comes from the client. E.g. password, a few information_schema
  details and the database / authid changes.

- To execute the SQL I mostly used extension.c's
  read_whole_file()/execute_sql_string(). But VACUUM and CREATE DATABASE
  require all the transactional hacks in portal.c etc., so I wrapped
  exec_simple_query() for that phase.

  Might be better to just call vacuum.c / database.c directly.

- For indexed relcache access to work, the phase of
  RelationCacheInitializePhase3() that's initially skipped needs to be
  executed. I hacked that up by adding a RelationCacheInitializePhase3b() that
  bootstrap.c can call, but that's obviously too ugly to live.

- InvalidateSystemCaches() is needed after bki processing. Otherwise I see a
  "row is too big:" error. I haven't investigated yet.

- I definitely removed some validation that we'd probably want. But that seems
  like something to care about later...

- 0004 prevents a fair bit of WAL from being written. While XLogInsert did
  some of that, it didn't block FPIs, which obviously are bulky. This reduces
  WAL from ~5MB to ~100kB.


There's quite a bit of further speedup potential:

- One bottleneck, particularly in optimized mode, is the handling of huge node
  trees for views. stringToNode() and nodeRead() alone account for > 10%.

- Enabling index access sometime during the postgres.bki processing would make
  invalidation handling for subsequent indexes faster. Or maybe we can disable
  a few more invalidations. Inval processing is > 10%.

- More than 10% (assert) / 7% (optimized) is spent in
  compute_scalar_stats()->qsort_arg(). Something seems off with that to me.


Completely crazy?


Greetings,

Andres Freund
