Re: fork/exec patch: pre-CreateProcess finalization - Mailing list pgsql-patches

From Tom Lane
Subject Re: fork/exec patch: pre-CreateProcess finalization
Date
Msg-id 9680.1072501876@sss.pgh.pa.us
Whole thread Raw
In response to Re: fork/exec patch: pre-CreateProcess finalization  (Claudio Natoli <claudio.natoli@memetrics.com>)
List pgsql-patches
Claudio Natoli <claudio.natoli@memetrics.com> writes:
> Ok. So, in conclusion we are in agreement on at least one thing: the current
> patch sucks.

Good, I'm glad we see eye to eye on that ;-)

> Here's the critical difference in our thinking:

>     ISTM that we'll have to go to a lot of work to get the fork/exec
> case to re-load the context required to format up the argument list to
> PostgresMain. Certainly, it is a lot more work than allowing the postmaster
> itself (not the forked child) to create the argument list, albeit moving
> auth to PostgresMain.

I don't follow your thinking here.  The information that has to be
reloaded *must* be passed across the fork/exec boundary in either case.
For example, there is no alternative to telling the backend process
where PGDATA is: it's got to be passed down some way or another.  The
Unix implementation assumes it can just inherit the value of a static
variable, while the Windows version will have to do something else.
I take no position here on whether we pass it as a command line item,
or through a temporary file, or whatever: it has to be passed down.

The reason your patch is messy is that the PostgresMain command line
string involves both information passed down from the postmaster and
information acquired from the client's connection-request packet.
We could consider redesigning that command line behavior, but we don't
really want to since it would impact the standalone-backend case.  So
I'm proposing substituting two levels of command-line processing within
the exec'd backend process.  The first level is responsible for
transferring information inherited from the postmaster (which we are
going to have to do *anyway*, somewhere, and this provides a nice
localized place to do it) and then the second level is fully compatible
with the standalone-backend case and the Unix-fork case.

> I'm afraid that, short of removing all SubPostmaster actions from
> BackendFork, the need to do fork in two different places (at least,
> physically, if not logically different) will remain.

I am not by any means saying that the fork() call has to stay exactly
where it is in the existing code.  We clearly want to refactor things
so that fork() is close to the invocation of FooMain() (or equivalently
exec() in the Windows case).  I think it's a historical artifact that
these steps got separated in the existing code base.

> After fork/exec'ing,
> the child process can't recover the context needed to create the argument
> list which it'll need to build up the PostgresMain arg list.

If that's literally true, then we are screwed because things cannot
work.  Pardon me for doubting this statement.  All that information
*must* be available in the child, once it has analyzed the information
it must be able to collect from the postmaster and performed the client
authentication handshake.

> So, in summary, I see two solutions:

I still think there's a third answer: create an intermediate layer
that's responsible for recovering the information that has to be
inherited from the parent process.  I don't say this requires zero
rearrangement of existing code, I just say it's a more maintainable
design than either of the alternatives you suggest.

In particular, this alternative is quite unworkable:

> b) delay the fork() call to the end of BackendFork in both fork/exec and
> normal cases, turning it into solely an argument formatter for PostgresMain

We can *not* postpone the fork() until after client authentication.
That loses all of the work that's been done since circa 7.1 to eliminate
cases where the postmaster gets blocked waiting for a single client to
respond.  SSL, PAM, and I think Kerberos all create denial-of-service
risks if we do that.

I believe most of what you are presently struggling with revolves
around the fact that the fork() got pushed up to happen before client
authentication, while the interface to PostgresMain didn't change.
I do not want to alter either of those decisions, however.  What we need
is to provide an alternative exec-able interface that encapsulates the
processing that now occurs between these steps.

            regards, tom lane

pgsql-patches by date:

Previous
From: Claudio Natoli
Date:
Subject: Re: fork/exec patch: pre-CreateProcess finalization
Next
From: Tom Lane
Date:
Subject: Re: Quoting of psql \d output