Hacking postgres backend process

Started by Carl E. McMillin over 21 years ago (3 messages)
#1 Carl E. McMillin
carlymac@earthlink.net

Hi All,

I posted this question on the General discussion list but got no takers. I'll
restate my query and be as brief as possible.

"What are the issues/dangers involved in putting an external
process-execution call in an instance of the main postgres-backend thread of
execution?"

The operating context will be a Linux/UNIX OS.

Here is a typical SQL statement I'm trying to field: "SELECT * FROM f(a)."

Here "f" is a stored-procedure stub for a C function in a shared library,
and "a" is a string parameter.

"f" will need to - under the proper circumstances - call an external process
"p", parse the process output, and return a set of structured records.

"p" may run for a very long time; may raise signals (SIG_*); may leave the
heap in an inconsistent state; may spawn child processes.

I've already written a number of stored-procedures backed by shared
libraries implemented in C, including set-returning functions, and I know
the basics of user-types and arrays (including some custom array
extensions). I've written UNIX shell processes in C while in school, so I
know a bit about child-process control and signal-handling.

It seems that "fork" is clearly out; I'm assuming the process execution
environment MUST be guaranteed consistent on re-entry into postgres.
Using "exec" is possibly worse, with a full image overlay destroying any hope
of reconstructing the pre-spawn environment. What are my options here?

Thanks for any input,

Carl <|};-)>
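
For readers weighing the fork/exec concern raised in this message, here is a minimal sketch, in plain C and outside any PostgreSQL API, of the usual UNIX pattern: the caller forks, and only the child calls exec, so the caller's own image is not overlaid. The helper name run_and_capture and the fixed-size output buffer are assumptions made for illustration. A real backend function would also have to think about inherited file descriptors, signal handlers, long-running children and memory contexts, none of which is handled here.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] with arguments argv, capture up to
 * bufsize-1 bytes of its stdout in buf (NUL-terminated), and store the
 * raw wait status in *status.  Returns 0 on success, -1 on failure.
 */
static int
run_and_capture(char *const argv[], char *buf, size_t bufsize, int *status)
{
    int     fds[2];
    pid_t   pid;
    size_t  used = 0;
    int     read_failed = 0;

    if (bufsize == 0 || pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* Child: redirect stdout into the pipe, then replace the image. */
        close(fds[0]);
        if (fds[1] != STDOUT_FILENO)
        {
            if (dup2(fds[1], STDOUT_FILENO) < 0)
                _exit(127);
            close(fds[1]);
        }
        execvp(argv[0], argv);
        /* exec failed; _exit avoids running exit handlers inherited from the parent */
        _exit(127);
    }

    /* Parent: read until EOF; once buf is full, drain the rest so the child never blocks. */
    close(fds[1]);
    for (;;)
    {
        char    scratch[256];
        int     full = (used >= bufsize - 1);
        ssize_t n = read(fds[0],
                         full ? scratch : buf + used,
                         full ? sizeof scratch : bufsize - 1 - used);

        if (n > 0)
        {
            if (!full)
                used += (size_t) n;
        }
        else if (n == 0)
            break;
        else if (errno != EINTR)
        {
            read_failed = 1;
            break;
        }
    }
    buf[used] = '\0';
    close(fds[0]);

    /* Always reap the child, retrying if a signal interrupts the wait. */
    while (waitpid(pid, status, 0) < 0)
    {
        if (errno != EINTR)
            return -1;
    }
    return read_failed ? -1 : 0;
}
```

Called with argv = {"echo", "hello", NULL}, the helper leaves the child's output in buf and the wait status in *status, which can then be inspected with WIFEXITED and WEXITSTATUS. Only the child is overlaid by exec; the calling process continues after waitpid returns.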

#2 Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Carl E. McMillin (#1)
Re: Hacking postgres backend process

On Wed, Apr 28, 2004 at 08:26:09AM -0700, Carl E. McMillin wrote:

I posted this question on the General discussion list but got no takers. I'll
restate my query and be as brief as possible.

"What are the issues/dangers involved in putting an external
process-execution call in an instance of the main postgres-backend thread of
execution?"

I'm not sure of all the issues it has, but as you probably already know,
a C function has access to anything inside the server process. This
means it can corrupt private structures, look at memory and data while
bypassing privileges, etc.; and if you get an uncaught SIGSEGV the backend
will die and the postmaster will terminate all running backends. Basically
if you are in constant fear you are in the right frame of mind to do it ...
check every return code, make sure you don't write to unallocated memory,
make sure the function wears its mithril shirt at all times, etc.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"If it wasn't for my companion, I believe I'd be having
the time of my life" (John Dunbar)
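
To illustrate the "check every return code" advice for the external-process case, here is a hedged sketch, again in plain C rather than against any PostgreSQL API, of how a caller might classify how a child ended instead of assuming success. The names child_outcome, classify_status and wait_for_child are hypothetical and exist only for this example.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical classification of how a child process ended. */
typedef enum
{
    CHILD_OK,        /* exited with status 0 */
    CHILD_FAILED,    /* exited with a non-zero status */
    CHILD_SIGNALED,  /* terminated by a signal */
    CHILD_UNKNOWN    /* anything else, e.g. stopped */
} child_outcome;

/* Decode a wait status; *detail receives the exit code or signal number. */
static child_outcome
classify_status(int status, int *detail)
{
    if (WIFEXITED(status))
    {
        *detail = WEXITSTATUS(status);
        return (*detail == 0) ? CHILD_OK : CHILD_FAILED;
    }
    if (WIFSIGNALED(status))
    {
        *detail = WTERMSIG(status);
        return CHILD_SIGNALED;
    }
    *detail = 0;
    return CHILD_UNKNOWN;
}

/* waitpid wrapper that retries when a signal interrupts the call. */
static pid_t
wait_for_child(pid_t pid, int *status)
{
    pid_t r;

    do
    {
        r = waitpid(pid, status, 0);
    } while (r < 0 && errno == EINTR);

    return r;
}
```

A caller would check that wait_for_child returns the expected pid before trusting the status, and would treat CHILD_SIGNALED and CHILD_UNKNOWN as errors to report rather than as output to parse.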

#3 Carl E. McMillin
carlymac@earthlink.net
In reply to: Alvaro Herrera (#2)
Re: Hacking postgres backend process

...Basically if you are in constant fear you are in the
right frame of mind to do it ... check every return code,
make sure you don't write to unallocated memory, make sure the
function wears its mithril shirt at all times, etc.

Hehe! Thanks for the warning. Do you know of anyone who has managed to
successfully work these control structures in with the C API? I've heard
some good words about using PL/Perl to control external processes, but I've
also heard there are notable limitations on (say, the absence of)
set-returning functions in PL/Perl (though perhaps they are under construction).

Carl <|};-)>

-----Original Message-----
From: Alvaro Herrera [mailto:alvherre@dcc.uchile.cl]
Sent: Tuesday, May 04, 2004 6:29 AM
To: Carl E. McMillin
Cc: pgsql-hackers@postgresql.org; Bob
Subject: Re: [HACKERS] Hacking postgres backend process
