Thread: Re: Why does bootstrap and later initdb stages happen via client?

Re: Why does bootstrap and later initdb stages happen via client?

From: Andrew Dunstan
Date: 08 September 2021, 23:24:00

On 9/8/21 3:07 PM, Andres Freund wrote:
> Hi,
>
> While hacking on AIO I wanted to build the windows portion from linux. That
> works surprisingly well with cross-building using --host=x86_64-w64-mingw32 .
>
> What didn't work as well was running things under wine. It turns out that the
> server itself works ok, but that initdb hangs because of a bug in wine ([1]),
> leading to the bootstrap process hanging while trying to read more input.
>
>
> Which made me wonder: What is really the point of doing so much setup as part
> of initdb? Of course a wine bug isn't a reason to change anything, but I see
> other reasons it might be worth thinking about moving more of initdb's logic
> into the backend.
>
> There of course is historical raisins for things happening in initdb - the
> setup logic didn't use to be C. But now that it is C, it seems a bit absurd to
> read bootstrap data in initdb, write the data to a pipe, and then read it
> again in the backend. It for sure doesn't make things faster.


I guess the downside would be that we'd need to teach the backend how to
do more stuff that only needs to be done once per cluster, and then that
code would be dead space for the rest of the lifetime of the cluster.


Maybe the difference is sufficiently small that it doesn't matter.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Why does bootstrap and later initdb stages happen via client?

From: Andres Freund
Date: 09 September 2021, 00:48:44

Hi,

On September 8, 2021 1:24:00 PM PDT, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>On 9/8/21 3:07 PM, Andres Freund wrote:
>> There of course is historical raisins for things happening in initdb - the
>> setup logic didn't use to be C. But now that it is C, it seems a bit absurd to
>> read bootstrap data in initdb, write the data to a pipe, and then read it
>> again in the backend. It for sure doesn't make things faster.
>
>
>I guess the downside would be that we'd need to teach the backend how to
>do more stuff that only needs to be done once per cluster, and then that
>code would be dead space for the rest of the lifetime of the cluster.
>
>
>Maybe the difference is sufficiently small that it doesn't matter.

Unused code doesn't itself cost much - the OS won't even page it in. And disk space wise, there's not much difference
between code in initdb and code in postgres. It's callsites to the code that can be problematic. But we're already
paying the price via --boot and a fair number of if (bootstrap) blocks.
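
For illustration only, the round trip questioned at the top of the thread (a client reads the bootstrap data, writes it to a pipe, and a separate process reads it again) can be sketched roughly as below. This is not PostgreSQL code: the command strings, `CHILD_SOURCE`, `run_via_pipe` and `run_in_process` are all hypothetical names made up for this sketch, and the child process only counts the lines it receives.

```python
import subprocess
import sys

# Hypothetical stand-in for bootstrap input; not real bootstrap data.
BOOTSTRAP_LINES = ["create table_a", "insert row_1", "close table_a"]

# Hypothetical child program: reads commands from stdin and prints how many
# non-empty lines it saw, standing in for a process fed over a pipe.
CHILD_SOURCE = (
    "import sys\n"
    "count = sum(1 for line in sys.stdin if line.strip())\n"
    "print(count)\n"
)


def run_via_pipe(lines):
    """Client-driven variant: write every line to the child's stdin pipe."""
    payload = "".join(line + "\n" for line in lines)
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=payload,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


def run_in_process(lines):
    """Alternative: the consumer processes the data itself, with no pipe."""
    return sum(1 for line in lines if line.strip())


if __name__ == "__main__":
    # Both variants process the same input; only the transport differs.
    print(run_via_pipe(BOOTSTRAP_LINES), run_in_process(BOOTSTRAP_LINES))
```

Both functions see the same three commands; the first one additionally pays for a process spawn, a write into the pipe and a second read on the other side, which is the extra hop the thread is asking about.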

Regards,

Andres

Re: Why does bootstrap and later initdb stages happen via client?

From: Andrew Dunstan
Date: 09 September 2021, 18:52:37

On 9/8/21 5:48 PM, Andres Freund wrote:
> Hi,
>
> On September 8, 2021 1:24:00 PM PDT, Andrew Dunstan <andrew@dunslane.net> wrote:
>> On 9/8/21 3:07 PM, Andres Freund wrote:
>>> There of course is historical raisins for things happening in initdb - the
>>> setup logic didn't use to be C. But now that it is C, it seems a bit absurd to
>>> read bootstrap data in initdb, write the data to a pipe, and then read it
>>> again in the backend. It for sure doesn't make things faster.
>>
>> I guess the downside would be that we'd need to teach the backend how to
>> do more stuff that only needs to be done once per cluster, and then that
>> code would be dead space for the rest of the lifetime of the cluster.
>>
>>
>> Maybe the difference is sufficiently small that it doesn't matter.
> Unused code doesn't itself cost much - the OS won't even page it in. And disk space wise, there's not much difference
> between code in initdb and code in postgres. It's callsites to the code that can be problematic. But we're already
> paying the price via --boot and a fair number of if (bootstrap) blocks.

Fair enough. You're quite right, of course: the original design of
initdb.c was to do what the preceding shell script did as closely as
possible. It does leak a bit of memory, which doesn't matter in the
context of a short-lived program - but that shouldn't be too hard to
manage in the backend.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com