Re: fork/exec patch: pre-CreateProcess finalization - Mailing list pgsql-patches

From Claudio Natoli
Subject Re: fork/exec patch: pre-CreateProcess finalization
Date
Msg-id A02DEC4D1073D611BAE8525405FCCE2B0280AA@harris.memetrics.local
Whole thread Raw
In response to fork/exec patch: pre-CreateProcess finalization  (Claudio Natoli <claudio.natoli@memetrics.com>)
Responses Re: fork/exec patch: pre-CreateProcess finalization  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
Thanks for your comments Tom.


Tom Lane writes:
> Claudio Natoli <claudio.natoli@memetrics.com> writes:
> > This has required some reworking of the existing code base, particularly
to
> > BackendFork (which now actually does the fork()). As such, I've been
> > anticipating that this will be the most controversial of
> the fork/exec patches, so critique away :-)
>
> You haven't explained why that's necessary.  Given the fact that this
> patch seems to hugely complicate the postmaster logic --- not so much
> either path individually, but the messy #ifdef interleaving of two
> radically different programs --- I am inclined to reject it on style
> grounds alone.

I have to agree, as I wasn't particularly happy with the BackendFork section
of this patch. You didn't really seem to comment on the pgstat changes,
which were much cleaner, so perhaps (like me) you were happier with these
changes.

So, could you give me an indication as to whether or not making changes to
the BackendFork logic as done for the pgstat logic (specifically, replacing
fork() call with a new, argument marshalling routine to do fork/exec) would
be more acceptable?  The reason I avoided this in the BackendFork case was,
predominantly, code duplication. Creating a new "BackendForkExec" would, I
think, result in a lot of similarity between the two routines. [yes,
BackendFork is poorly named, although I'd still think of moving the fork()
call into BackendFork anyway, thereby justifying its name. But, much more
importantly, for symmetry with the code format that will inevitably arise in
the fork/exec case; see below]


> We should either find a way to make the fork/exec path more like the
> existing code, or bite the bullet and change them both.  Doing it the
> way you have here will create an unmaintainable mess.  I'm not prepared
> to work on a postmaster in which a step as basic as fork() happens in
> two different places depending on an #ifdef.

["unmaintainable mess": yeah, I had the exact same thought when I'd
finished. "No one's ever going to be able to rework this properly".]

I'm afraid that, short of removing all SubPostmaster actions from
BackendFork, the need to do fork in two different places (at least,
physically, if not logically different) will remain. After fork/exec'ing,
the child process can't recover the context needed to create the argument
list which it'll need to build up the PostgresMain arg list.

So, in summary, I see two solutions:

a) do something similar to BackendFork as done for pgstat, specifically:
    - move the fork call into BackendFork
    - create a fork/exec equivalent to BackendFork
    - #ifdef the call to the appropriate function in BackendStartup
  PRO: child is forked in the same logical place
  CON: possible duplication in code between BackendFork + BackendForkExec

b) delay the fork() call to the end of BackendFork in both fork/exec and
normal cases, turning it into solely an argument formatter for PostgresMain
      - move BackendInit, port->cmdline_options parsing + MemoryContext
calls into PostgresMain
  PRO: forking in same (physical) location in code; no duplication
  CON: additional complexity at the start of PostgresMain to avoid these
calls in stand-alone backend case

Which is the lesser of two evils? FWIW, I prefer the latter, as it makes the
normal + fork/exec cases effectively identical, for a little added
complexity to PostgresMain.


There's also the alternative you outlined, but I think it is less workable
than the above options:

> Now it seems to me that a lot of the mess in your latest patch comes
> from trying to set up the backend command string before you've forked,
> and therefore before you've done any of the sub-postmaster actions.
> That's not gonna work.

I agree that this is where the mess comes from, but I think this (setting up
backend command string before fork) is *exactly* what needs to occur in the
fork/exec case (leading into the Win32 CreateProcess calls). If we want this
code to work in the Win32 case, we basically can't do any processing between
the fork + exec calls, and I think recovering all the context to do the
command string marshalling post fork/exec is untenable.


> What I'd suggest is adding a new entry point from main.c, say
> SubPostmasterMain(), with its own command line syntax.  Perhaps
>     postmaster -fork ... additional args ...
> (where -fork is what cues main.c where to dispatch to, and the
> additional args are just those that need to be passed from the
> postmaster to the sub-postmaster, without any of the additional
> information that will be learned by the sub-postmaster during
> client authentication.)
>
> In the Windows case the postmaster would fork/exec to this routine,
> which would re-load as much of the postmaster context as was needed
> and then call BackendFork (which we really oughta rename to something
> else).  BackendFork would execute the client auth process and then
> simply call PostgresMain with a correctly constructed argument list.

Here's the critical difference in our thinking:

    ISTM that we'll have to go to a lot of work to get the fork/exec
case to re-load the context required to format up the argument list to
PostgresMain. Certainly, it is a lot more work than allowing the postmaster
itself (not the forked child) to create the argument list, albeit moving
auth to PostgresMain.

Ok. So, in conclusion we are in agreement on at least one thing: the current
patch sucks.

But, I suspect we may have an impasse on a sole crucial point: setting up
the backend command string before forking.  I'm all for it, and you seem
pretty much against it.

So, I guess I could try to bring you around with another, cleaner patch
using one of the alternatives listed above. But, before doing that, I guess
I should give you a chance to:
a) try to convince me otherwise
b) indicate which of the two alternatives you (even grudgingly :-) prefer
(or another alternative in light of the above)

Again, thanks for your comments, and looking forward to your reply (so that,
hopefully, I can get cracking on a new patch).

Cheers,
Claudio


---
Certain disclaimers and policies apply to all email sent from Memetrics.
For the full text of these disclaimers and policies see
<a
href="http://www.memetrics.com/emailpolicy.html">http://www.memetrics.com/em
ailpolicy.html</a>

pgsql-patches by date:

Previous
From: Tom Lane
Date:
Subject: Re: update i386 spinlock for hyperthreading
Next
From: Tom Lane
Date:
Subject: Re: fork/exec patch: pre-CreateProcess finalization