Thread: Hacking postgres backend process

Hacking postgres backend process

From
"Carl E. McMillin"
Date:
Hi All,
 
I posted this subject on the General discussion list but got no takers.  I'll restate my query and be as brief as I possibly can.
 
"What are the issues/dangers involved in putting an external process-execution call in an instance of the main postgres-backend thread of execution?"
 
The operating context will be a Linux/UNIX OS.
 
Here is a typical SQL statement I'm trying to field:  "SELECT * FROM f(a)."
 
Where "f" is a stored-procedure stub to a shared library C function,
           "a" is a string-parameter.
 
"f" will need to - under the proper circumstances - call an external process "p", parse the process-output, and return a set of structured records.
 
"p" may run for a very long time; may raise signals (SIG_*); may leave the heap in an inconsistent state; may spawn child processes.
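
As an illustration of the parsing half of "f", here is a minimal sketch in C. It assumes, purely for the example, that "p" prints one "name value" pair per line; the Record type and parse_records() are hypothetical names, not part of PostgreSQL or of the function described above. Lines that do not match the expected shape are skipped rather than trusted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record layout: one "name value" pair per output line. */
typedef struct
{
    char name[32];
    long value;
} Record;

/*
 * Parse newline-separated "name value" lines from text into out[0..max-1].
 * Lines that are too long or do not match are skipped.
 * Returns the number of records stored.
 */
static size_t
parse_records(const char *text, Record *out, size_t max)
{
    size_t count = 0;

    assert(text != NULL && out != NULL);

    while (*text != '\0' && count < max)
    {
        size_t len = strcspn(text, "\n");   /* length of the current line */
        char   line[128];

        if (len < sizeof line)
        {
            memcpy(line, text, len);
            line[len] = '\0';
            /* %31s bounds the write into name[32]; both fields must match. */
            if (sscanf(line, "%31s %ld",
                       out[count].name, &out[count].value) == 2)
                count++;
        }

        text += len;
        if (*text == '\n')
            text++;
    }
    return count;
}
```

In a set-returning function, an array filled this way could be built on the first call and handed back one row per subsequent call.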
 
I've already written a number of stored-procedures backed by shared libraries implemented in C, including set-returning functions, and I know the basics of user-types and arrays (including some custom array extensions).  I've written UNIX shell processes in C while in school, so I know a bit about child-process control and signal-handling.
 
It seems that "fork" is clearly out; I'm assuming the process execution environment MUST be guaranteed consistent on re-entrance into postgres.  Using "exec" is possibly worse, with a full image overlay destroying any hope of reconstructing the pre-spawn environment.  What are my options here?
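
For reference, the usual UNIX pattern combines the two calls: fork() first, then exec() only in the child, so the image overlay never touches the calling process. The sketch below is a standalone illustration of that pattern, not a tested backend extension. run_and_capture() is a hypothetical helper, and it does not address what a forked child inherits from a postgres backend (shared memory attachments, open file descriptors, signal handlers).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] with arguments argv[], capture up to cap-1 bytes of its
 * stdout into buf (always NUL-terminated), and store the raw wait status
 * in *status. Returns the number of bytes captured, or -1 if the child
 * could not be started or reaped.
 */
static ssize_t
run_and_capture(char *const argv[], char *buf, size_t cap, int *status)
{
    int     fds[2];
    pid_t   pid;
    size_t  used = 0;
    ssize_t n;
    char    scratch[4096];

    assert(argv != NULL && argv[0] != NULL);
    assert(buf != NULL && cap > 0 && status != NULL);

    if (pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* Child: redirect stdout into the pipe, then replace the image. */
        close(fds[0]);
        if (fds[1] != STDOUT_FILENO)
        {
            if (dup2(fds[1], STDOUT_FILENO) == -1)
                _exit(126);
            close(fds[1]);
        }
        execvp(argv[0], argv);
        _exit(127);         /* exec failed; _exit skips atexit handlers */
    }

    /* Parent: its own image is untouched by the exec in the child. */
    close(fds[1]);
    for (;;)
    {
        n = read(fds[0], scratch, sizeof scratch);
        if (n > 0)
        {
            size_t room = cap - 1 - used;
            size_t take = (size_t) n < room ? (size_t) n : room;

            memcpy(buf + used, scratch, take);
            used += take;   /* output beyond cap-1 is read and discarded */
        }
        else if (n == 0 || errno != EINTR)
            break;          /* end of file, or a real read error */
    }
    close(fds[0]);
    buf[used] = '\0';

    /* Always reap the child so it does not linger as a zombie. */
    while (waitpid(pid, status, 0) == -1)
        if (errno != EINTR)
            return -1;

    return (ssize_t) used;
}
```

A caller would examine WIFEXITED(status) and WIFSIGNALED(status) before trusting the captured text.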
 
Thanks for any input,
 
Carl <|};-)> 

Re: Hacking postgres backend process

From
Alvaro Herrera
Date:
On Wed, Apr 28, 2004 at 08:26:09AM -0700, Carl E. McMillin wrote:

> I posted this subject on the General discussion list but got no takers.
> I'll restate my query and be as brief as I possibly can.
>  
> "What are the issues/dangers involved in putting an external
> process-execution call in an instance of the main postgres-backend
> thread of execution?"

I'm not sure of all the issues it has, but as you probably already know,
a C function has access to anything inside the server process.  This
means it can corrupt private structures, look at memory and data while
bypassing privileges, etc; and if you get an uncaught SIGSEGV the backend
will die and the postmaster will terminate all running backends.
Basically if you are in constant fear you are in the right frame of mind
to do it ... check every return code, make sure you don't write to
unallocated memory, make sure the function wears its mithril shirt at
all times, etc.
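
Applied to the external-process case, "check every return code" includes the child's wait status and a bound on how long the caller is willing to wait. The following is a minimal standalone sketch, assuming the child has already been started; reap_with_deadline() and the 10 ms polling tick are illustrative choices, not PostgreSQL APIs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for child pid to finish. If the deadline passes,
 * send SIGKILL and reap the child. Returns 0 if the child was reaped
 * before the deadline, 1 if SIGKILL was sent, -1 on error.
 * *status receives the raw wait status in the 0 and 1 cases.
 */
static int
reap_with_deadline(pid_t pid, long timeout_ms, int *status)
{
    const struct timespec tick = {0, 10 * 1000 * 1000};    /* 10 ms */
    long    waited_ms = 0;
    pid_t   r;

    assert(pid > 0 && timeout_ms >= 0 && status != NULL);

    for (;;)
    {
        r = waitpid(pid, status, WNOHANG);
        if (r == pid)
            return 0;                   /* child finished on its own */
        if (r == -1 && errno != EINTR)
            return -1;                  /* e.g. ECHILD: not our child */
        if (waited_ms >= timeout_ms)
            break;
        /* An early wake-up by a signal only shortens this one tick. */
        (void) nanosleep(&tick, NULL);
        waited_ms += 10;
    }

    /* Deadline passed: kill, then reap so no zombie is left behind. */
    if (kill(pid, SIGKILL) == -1 && errno != ESRCH)
        return -1;
    while (waitpid(pid, status, 0) == -1)
        if (errno != EINTR)
            return -1;
    return 1;
}
```

After a return of 0 or 1, WIFSIGNALED(status) and WTERMSIG(status) tell the caller whether the child died from a signal, and WEXITSTATUS(status) gives its exit code otherwise.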

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"If it wasn't for my companion, I believe I'd be having
the time of my life"  (John Dunbar)


Re: Hacking postgres backend process

From
"Carl E. McMillin"
Date:
> ...Basically if you are in constant fear you are in the
> right frame of mind to do it ... check every return code,
> make sure you don't write to unallocated memory, make sure the
> function wears its mithril shirt at all times, etc.

Hehe! Thanks for the warning. Do you know of anyone who's managed to
successfully work these control structures in with the C API?  I've heard
some good words about using PL/Perl to control external processes, but I've
also heard there are notable limitations (say, outright absence) with
set-returning functions in PL/Perl (though perhaps under construction).

Carl <|};-)>



-----Original Message-----
From: Alvaro Herrera [mailto:alvherre@dcc.uchile.cl]
Sent: Tuesday, May 04, 2004 6:29 AM
To: Carl E. McMillin
Cc: pgsql-hackers@postgresql.org; Bob
Subject: Re: [HACKERS] Hacking postgres backend process

