initdb and share/postgresql.conf.sample - Mailing list pgsql-hackers

From Jeff Janes
Subject initdb and share/postgresql.conf.sample
Date
Msg-id CAMkU=1yuZDgA8iyJCGPSeXhs7VyUGeX0EJktJ28FPxGN-dsWoA@mail.gmail.com
Responses Re: initdb and share/postgresql.conf.sample
List pgsql-hackers

In some of my git branches I have edited src/backend/utils/misc/postgresql.conf.sample to contain my configuration preferences for whatever that branch is meant to test.  The file gets copied to share/postgresql.conf.sample during install, and from there to data/postgresql.conf during initdb, so I don't need to remember to make the necessary changes by hand.

Am I insane to be doing this?  Is there a better way to handle these branch-specific configuration needs?

Anyway, I was recently astonished to discover that the contents of share/postgresql.conf.sample at initdb time affect the performance of the server, even when the conf file is replaced with something else before the server is started.  To make a very long story short, if share/postgresql.conf.sample is set up for archiving, then somewhere in the initdb process some bootstrap process pre-creates a bunch of extra xlog files.
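
One way to observe the effect is to count the segment files right after initdb, once with a stock sample file and once with an archiving-enabled one. This is only a sketch: count_xlog_segments is a made-up helper name, and it assumes that segment files are the entries in pg_xlog whose names are exactly 24 uppercase hex digits.

```shell
#!/bin/sh
# Hypothetical helper: count WAL segment files in a pg_xlog directory.
# Assumption: segment files are named with exactly 24 uppercase hex digits;
# other entries (such as the archive_status directory) are ignored.
count_xlog_segments() {
    ls "$1" | grep -c -E '^[0-9A-F]{24}$'
}
```

For example, `count_xlog_segments /tmp/data/pg_xlog` run after each initdb should print a larger number for the archiving-enabled sample if the pre-creation described above took place.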

Is this alarming?  It looks like initdb takes some pains, in at least one place, to use an empty config file rather than postgresql.conf.sample, but it seems that a sub-process is not doing that.

Those extra log files then give the newly started server a boost (whether it is started in archive mode or not) because it doesn't have to create them itself.  It isn't so much a boost as the absence of a new-server penalty.  I want to remove that penalty to get better numbers from benchmarking.  What I am doing now, between the initdb and the pg_ctl start, is this:

for g in `perl -e 'printf("0000000100000000000000%02X\n",$_) foreach 2..120'`; do cp -i /tmp/data/pg_xlog/000000010000000000000001 /tmp/data/pg_xlog/$g < /dev/null; done

The "120" comes from 2 * checkpoint_segments.  That's mighty ugly.  Is there a better trick?
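
For what it's worth, the same hack can be written as a small shell function, which at least makes the upper bound a parameter. This is a sketch and not a better trick: precreate_xlog is a made-up name, it copies the same first segment in the same way, and it assumes the segment number fits in the last two hex digits of the file name.

```shell
#!/bin/sh
# Hypothetical helper: pre-create WAL segment files by copying the first
# segment, for use between initdb and pg_ctl start.
#   $1 = pg_xlog directory, e.g. /tmp/data/pg_xlog
#   $2 = highest segment number to create, e.g. 120 (2 * checkpoint_segments)
precreate_xlog() {
    xlogdir=$1
    last=$2
    src="$xlogdir/000000010000000000000001"

    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
    # Assumption: only the last two hex digits of the name vary.
    [ "$last" -le 255 ] || { echo "segment number too large" >&2; return 1; }

    n=2
    while [ "$n" -le "$last" ]; do
        dst="$xlogdir/$(printf '0000000100000000000000%02X' "$n")"
        # Never overwrite a segment that already exists.
        [ -e "$dst" ] || cp "$src" "$dst"
        n=$((n + 1))
    done
}
```

With checkpoint_segments set to 60, the call would be `precreate_xlog /tmp/data/pg_xlog 120`.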

You could say that benchmarks should run long enough to average out such changes, but needing to run a benchmark that long can make some kinds of work (like git bisect) unrealistic rather than merely tedious.

Cheers,

Jeff
