Win32 lost signals open item

Started by Bruce Momjianabout 21 years ago14 messages

pgman@candle.pha.pa.us

about 21 years ago

We have this open item:

Win32
o Handle "lost signals" on backend startup (eg. shutdown,
config file changes, etc); signals are SIG_DFL on startup

The problem here is that the postmaster might send signals to a
child before the signal handlers are installed. We don't have
this problem on unix because we fork and inherit the signal
handlers.

Win32 uses a special socket to receive signals and does not use the
standard Unix signal mechanism. However, the socket doesn't exist on
backend process start so there is possible loss of signal while the
backend starts.

The only solution I can think of for us is to set a PROC struct variable
when you can't send the Win32 backend a signal and have the backend
check this PROC variable after it starts listening for signals.
However, there would still be a window where the signal could fail but
the backend could check the variable before the postmaster sets it so we
might just set the variable before a signal is sent and because it is
only checked when we start listening for signals it should be OK.

However, I don't think the postmaster reads/writes PROC so we would need
some other way of flagging the backend. I bet there is some Win32 API
that might help us.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Tom Lane

tgl@sss.pgh.pa.us

about 21 years ago

In reply to: Bruce Momjian (#1)

Re: [pgsql-hackers-win32] Win32 lost signals open item

Bruce Momjian <pgman@candle.pha.pa.us> writes:

We have this open item:
Win32
o Handle "lost signals" on backend startup (eg. shutdown,
config file changes, etc); signals are SIG_DFL on startup

The problem here is that the postmaster might send signals to a
child before the signal handlers are installed. We don't have
this problem on unix because we fork and inherit the signal
handlers.

FWIW, I think the todo's description of the problem is completely
inaccurate. The issue is not the lack of signal handler settings per
se, it is that our pipe-based emulation of signals isn't ready to
collect signal messages until some time after the child process starts.

Could this be fixed by having the postmaster set up the pipe *before* it
forks/execs the child? We'd probably need to pass down some additional
info to inform the child where it has to hook into the pipe structure,
but passing down more state is no problem.

The only solution I can think of for us is to set a PROC struct variable
when you can't send the Win32 backend a signal and have the backend
check this PROC variable after it starts listening for signals.

A backend does not create its PROC entry until *long* after it gets
forked, so this does not sound like a path to a solution. Also, I'd
prefer to be able to signal non-backend children such as pgstat. (I'm
not sure if the current code actually needs that, but I can definitely
believe that we'll need to do it some day.) Also, we do need to be able
to signal the postmaster from backends, so we cannot tie the signal
mechanism to the assumption that every signalable process has or will
eventually have a PROC entry.

regards, tom lane

Bruce Momjian

pgman@candle.pha.pa.us

about 21 years ago

In reply to: Tom Lane (#2)

Re: [pgsql-hackers-win32] Win32 lost signals open item

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

We have this open item:
Win32
o Handle "lost signals" on backend startup (eg. shutdown,
config file changes, etc); signals are SIG_DFL on startup

The problem here is that the postmaster might send signals to a
child before the signal handlers are installed. We don't have
this problem on unix because we fork and inherit the signal
handlers.

FWIW, I think the todo's description of the problem is completely
inaccurate. The issue is not the lack of signal handler settings per

OK, updated:

o Handle "lost signals" on backend startup (eg. shutdown,
config file changes, etc); signals are not possible on
startup

The problem here is that the postmaster might send signals to a
child before the Win32 pipe is created to accept signals.
We don't have this problem on unix because we fork and inherit
the signal handlers.

se, it is that our pipe-based emulation of signals isn't ready to
collect signal messages until some time after the child process starts.

Could this be fixed by having the postmaster set up the pipe *before* it
forks/execs the child? We'd probably need to pass down some additional
info to inform the child where it has to hook into the pipe structure,
but passing down more state is no problem.

Not sure. Magnus?

The only solution I can think of for us is to set a PROC struct variable
when you can't send the Win32 backend a signal and have the backend
check this PROC variable after it starts listening for signals.

A backend does not create its PROC entry until *long* after it gets
forked, so this does not sound like a path to a solution. Also, I'd
prefer to be able to signal non-backend children such as pgstat. (I'm
not sure if the current code actually needs that, but I can definitely
believe that we'll need to do it some day.) Also, we do need to be able
to signal the postmaster from backends, so we cannot tie the signal
mechanism to the assumption that every signalable process has or will
eventually have a PROC entry.

OK.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Magnus Hagander

mha@sollentuna.net

about 21 years ago

In reply to: Bruce Momjian (#3)

Re: [pgsql-hackers-win32] Win32 lost signals open item

se, it is that our pipe-based emulation of signals isn't ready to
collect signal messages until some time after the child

process starts.

Could this be fixed by having the postmaster set up the pipe

*before* it

forks/execs the child? We'd probably need to pass down some

additional

info to inform the child where it has to hook into the pipe

structure,

but passing down more state is no problem.

Not sure. Magnus?

I think it can be made to work, but it's not as easy as one would hope.
Remember, the pipe is called \\.\pipe\pqsignal_<pid>, with the current
pid as a part of it.

The way to do it, I *think*, is to change the forkexec call to do (this
is all untested):
1) CreateProcess, with the flag CREATE_SUSPENDED. This creates all the
process structures, but specifically does not start the main thread.
This gives us the process id and handle.
2) Create the pipe with the correct name
3) Call DuplicateHandle() on the pipe to make it available in the
sub-process
4) Close the original pipe handle
5) Write the new handle to the backend parameter file somewhere
6) ResumeThread() on the new process, which actually starts the backend

As you can see, this is quite a bit more complicated than the simple
CreateProcess() call we have now.

It does, however, have a different advantage as well. The same kind of
thing will be required for a fix of the infamous "socket operation on
non socket because of LSP problems" we've been seeing quite a number of
reports on. The change being that instead of DuplicateHandle() we'd do
WSADuplicateSocket() for sockets to inherit.

If this seems like a reasonable approach, I can see if I can get
something together. But it's a fairly large change..

(Or if someone can poke a hole in the idea before I start, then please
do so..)

//Magnus

Import Notes

Resolved by subject fallback

Tom Lane

tgl@sss.pgh.pa.us

about 21 years ago

In reply to: Magnus Hagander (#4)

Re: [pgsql-hackers-win32] Win32 lost signals open item

"Magnus Hagander" <mha@sollentuna.net> writes:

[ proposed fix ]
As you can see, this is quite a bit more complicated than the simple
CreateProcess() call we have now.
...
If this seems like a reasonable approach, I can see if I can get
something together. But it's a fairly large change..

It sounds reasonable to me, in the sense that it is a localized change,
even if rather messy. (Perhaps this chunk of code should be pushed into
src/port someplace, instead of being dropped into postmaster.c?)

regards, tom lane

Magnus Hagander

mha@sollentuna.net

about 21 years ago

In reply to: Tom Lane (#5)

Re: [pgsql-hackers-win32] Win32 lost signals open item

[ proposed fix ]
As you can see, this is quite a bit more complicated than the simple
CreateProcess() call we have now.
...
If this seems like a reasonable approach, I can see if I can get
something together. But it's a fairly large change..

It sounds reasonable to me, in the sense that it is a localized change,
even if rather messy. (Perhaps this chunk of code should be
pushed into
src/port someplace, instead of being dropped into postmaster.c?)

That's an idea. However, since this has to be tied in with the creationi
of the backend-parameter-file, it's going to have to hit
write_backend_variables() in some way, and have access to postmaster
variables. I think it might be even more messy to push it out.

Basically, I think internal_forkexec() needs to be split up into two -
one win32 and one other. For win32 version, it needs to CreateProcess()
*before* it does write_backend_variables(), and then pass the process id
as a parameter to write_backend_vars().

I think if it should be broken out, we need to break the whole package
of internal_forkexec, read/write_backend_vars, and probably
SubPostmasterMain() out somewhere. Not sure it's worth it.

//Magnus

Import Notes

Resolved by subject fallback

Tom Lane

tgl@sss.pgh.pa.us

about 21 years ago

In reply to: Magnus Hagander (#6)

Re: [pgsql-hackers-win32] Win32 lost signals open item

"Magnus Hagander" <mha@sollentuna.net> writes:

Basically, I think internal_forkexec() needs to be split up into two -
one win32 and one other. For win32 version, it needs to CreateProcess()
*before* it does write_backend_variables(), and then pass the process id
as a parameter to write_backend_vars().

Huh? Why?

regards, tom lane

Magnus Hagander

mha@sollentuna.net

about 21 years ago

In reply to: Tom Lane (#7)

Re: [pgsql-hackers-win32] Win32 lost signals open item

Basically, I think internal_forkexec() needs to be split up

into two -

one win32 and one other. For win32 version, it needs to

CreateProcess()

*before* it does write_backend_variables(), and then pass

the process id

as a parameter to write_backend_vars().

Huh? Why?

Because we need to write the duplicated socket structure/pipe handle to
the parameter file. I guess we could create a separate parameter file
just for these things, but that seemed a bit unnecessary.

Basically:
1) To create the new HANDLE, or in the case of a socket, the
WSAPROTOCOL_INFO structure, we need to know the new pid. We can only
know this after CreateProcess(), which sits in win32_forkexec.
2) To get the data down to the backend, we need to put it in the
parameter file, which is done using write_backend_vars().

This gives that we either:
1) Write the data out in write_backend_vars, in which case
write_backend_vars needs to know the pid of the new backend.
or
2) Write the data out in win32_forkexec, in which case we need another
parameter file.

I was thinking (1) was the cleaner approach.

//Magnus

Import Notes

Resolved by subject fallback

Tom Lane

tgl@sss.pgh.pa.us

about 21 years ago

In reply to: Magnus Hagander (#8)

Re: [pgsql-hackers-win32] Win32 lost signals open item

"Magnus Hagander" <mha@sollentuna.net> writes:

Huh? Why?

Because we need to write the duplicated socket structure/pipe handle to
the parameter file. I guess we could create a separate parameter file
just for these things, but that seemed a bit unnecessary.

Do we actually need to pass the handle, or could the subprocess reopen
the pipe for itself?

regards, tom lane

#10

Magnus Hagander

mha@sollentuna.net

about 21 years ago

In reply to: Tom Lane (#9)

Re: [pgsql-hackers-win32] Win32 lost signals open item

Huh? Why?

Because we need to write the duplicated socket

structure/pipe handle to

the parameter file. I guess we could create a separate parameter file
just for these things, but that seemed a bit unnecessary.

Do we actually need to pass the handle, or could the subprocess reopen
the pipe for itself?

Nope, we need to pass the handle. Only one process can be the
server-side of the pipe, and once the postmaster has opened it, the
child process can't do it - the only way to get it is through
inheritance.

//Magnus

Import Notes

Resolved by subject fallback

#11

Tom Lane

tgl@sss.pgh.pa.us

about 21 years ago

In reply to: Magnus Hagander (#10)

Re: [pgsql-hackers-win32] Win32 lost signals open item

"Magnus Hagander" <mha@sollentuna.net> writes:

Nope, we need to pass the handle. Only one process can be the
server-side of the pipe, and once the postmaster has opened it, the
child process can't do it - the only way to get it is through
inheritance.

Grumble. Having to call write_backend_variables from two different
places seems Really Ugly.

How about Plan B?  It occurs to me that what this proposal really does
is to delay the postmaster until after the child process has been
created.  What about doing that in some more straightforward fashion
--- that is, the postmaster doesn't return from win32_forkexec until
it sees that the child has gotten at least as far as being able to
accept signals?  (A simple way to do it would just be to loop trying to
kill(0) the child PID.)  I guess the tricky part here is recovering if
the child fails to start at all --- a timeout would perhaps do but it's
ugly.  Still, seems less ugly than the way this idea is panning out.

regards, tom lane

#12

Magnus Hagander

mha@sollentuna.net

about 21 years ago

In reply to: Tom Lane (#11)

Re: [pgsql-hackers-win32] Win32 lost signals open item

Nope, we need to pass the handle. Only one process can be the
server-side of the pipe, and once the postmaster has opened it, the
child process can't do it - the only way to get it is through
inheritance.

Grumble. Having to call write_backend_variables from two different
places seems Really Ugly.

Well, if they're separated by
#ifdef win32
first one
#else
second one
#endif

that's slightly less Ugly. Still Ugly, just maybe not Really Ugly.

The actual code-duplication is only 2-3 statements + three asserts, so
it's not *that* bad. The rest of internal_forkexec() is already
#ifdef-split between win32 and non-win32.

How about Plan B?  It occurs to me that what this proposal really does
is to delay the postmaster until after the child process has been
created.  What about doing that in some more straightforward fashion
--- that is, the postmaster doesn't return from win32_forkexec until
it sees that the child has gotten at least as far as being able to
accept signals?  (A simple way to do it would just be to loop trying to
kill(0) the child PID.)  I guess the tricky part here is recovering if
the child fails to start at all --- a timeout would perhaps do but it's
ugly.  Still, seems less ugly than the way this idea is panning out.

I actually think this one is more ugly :-( It also makes the postmaster
stop what it's doing for a while during the backend startup. With lots
of backend starts, that might become noticeable.

I was also hoping to piggyback the socket fix on top of this
infrastructure. And that *requires* the write-files-after-createprocess
method. There is no other way. Doing it this way means we either give up
on the socket fix completely, or that we do it *both* ways, which seems
like the worst of both worlds.

(If you forgot, this is the bug that has us saying "uninstall your
antivirus and firewall software", which is generally not such brilliant
advice on windows)

//Magnus

Import Notes

Resolved by subject fallback

#13

Tom Lane

tgl@sss.pgh.pa.us

about 21 years ago

In reply to: Magnus Hagander (#12)

Re: [pgsql-hackers-win32] Win32 lost signals open item

"Magnus Hagander" <mha@sollentuna.net> writes:

I was also hoping to piggyback the socket fix on top of this
infrastructure. And that *requires* the write-files-after-createprocess
method. There is no other way.

Oh, I had forgotten about that part of the problem. Okay, just gotta
hold our noses and do it I guess.

(Just to be clear: the plan is CreateProcess in suspended state, then
write_backend_variables, then start the child's thread, right?
Otherwise seems you got a race condition.)

regards, tom lane

#14

Magnus Hagander

mha@sollentuna.net

about 21 years ago

In reply to: Tom Lane (#13)

Re: [pgsql-hackers-win32] Win32 lost signals open item

I was also hoping to piggyback the socket fix on top of this
infrastructure. And that *requires* the

write-files-after-createprocess

method. There is no other way.

Oh, I had forgotten about that part of the problem. Okay, just gotta
hold our noses and do it I guess.

(Just to be clear: the plan is CreateProcess in suspended state, then
write_backend_variables, then start the child's thread, right?
Otherwise seems you got a race condition.)

Right, that's the plan.

I'll see what I can get together over the next couple of days.

//Magnus

Import Notes

Resolved by subject fallback