Proposal to add a QNX 6.5 port to PostgreSQL

Started by Baker, Keith [OCDUS Non-J&J] · over 11 years ago · 55 messages · pgsql-hackers

I propose that a QNX 6.5 port be introduced to PostgreSQL.

I am new to PostgreSQL development, so please bear with me.

I have made good progress (with 1 outstanding issue, details below):

* I created a QNX 6.5 port of PostgreSQL 9.3.4 which passes regression tests.

* I merged my changes into 9.4beta2, and with a few minor changes, it passes regression tests.

* QNX support states that QNX 6.5 SP1 binaries run on QNX 6.6 without modification, which I confirmed with a few quick tests.

Summary of changes required for PostgreSQL 9.3.4 on QNX 6.5:

* Typical changes required for any new port (template, configure.in, dynloader, etc.)

* QNX lacks System V shared memory: I created "src/backend/port/posix_shmem.c" which replaces System V calls (shmget, shmat, shmdt, ...) with POSIX calls (shm_open, mmap, munmap, shm_unlink)

* QNX lacks sigaction SA_RESTART: I modified "src/include/port.h" to define macros to retry system calls upon EINTR (open,read,write,...) when compiled on QNX

* A few files required addition of #include <sys/select.h> on QNX (for fd_set).
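The message does not show the POSIX replacement code itself; as a rough sketch of the shm_open/mmap pattern it describes (hypothetical helper name, not the actual posix_shmem.c):

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Minimal sketch of the shm_open/mmap pattern described above
 * (hypothetical helper; the real posix_shmem.c surely differs). */
static void *create_posix_segment(const char *name, size_t size)
{
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd == -1)
        return MAP_FAILED;
    if (ftruncate(fd, (off_t) size) == -1)
    {
        close(fd);
        shm_unlink(name);
        return MAP_FAILED;
    }
    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping remains valid after close() */
    return addr;
}
```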

Additional changes required for PostgreSQL 9.4beta2 on QNX 6.5:

* "DSM" changes introduced in 9.4 (R. Haas) required that I make minor updates to my new "posix_shmem.c" code.

* src/include/replication/logical.h: struct LogicalDecodingContext field "write" interferes with my "write" retry macro. Renaming field "write" to "do_write" solved this problem.
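The retry macros referred to above (and whose name clashed with the struct field) are not shown in the message; a sketch of the core technique, written as a function with a hypothetical name:

```c
#include <errno.h>
#include <unistd.h>

/* Sketch of the EINTR-retry idea described above.  The patch reportedly
 * wraps open/read/write/etc. with macros in src/include/port.h on QNX,
 * where sigaction lacks SA_RESTART; this hypothetical helper shows the
 * core loop for read(). */
static ssize_t retry_read(int fd, void *buf, size_t len)
{
    ssize_t rc;
    do
        rc = read(fd, buf, len);        /* restart after a signal interrupts us */
    while (rc == -1 && errno == EINTR); /* EINTR: the call was not auto-restarted */
    return rc;
}
```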

Outstanding Issue #1:
src/backend/commands/dbcommands.c :: createdb() complains when copying template1 to template0 (apparently a locale issue)
"FATAL: 22023: new LC_CTYPE (C;collate:POSIX;ctype:POSIX) is incompatible with the LC_CTYPE of the template database (POSIX;messages:C)"
I would appreciate help from an experienced PostgreSQL hacker to address this.
I have temporarily disabled this check on QNX (I can live with the assumption/limitation that template0 and template1 contain strictly ASCII).

I can work toward setting up a build farm member should this proposal be accepted.
Your feedback and guidance on next steps is appreciated.

Thank you.

Keith Baker

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Baker, Keith [OCDUS Non-J&J] (#1)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

"Baker, Keith [OCDUS Non-J&J]" <KBaker9@its.jnj.com> writes:

I propose that a QNX 6.5 port be introduced to PostgreSQL.

Hmm ... you're aware that there used to be a QNX port? We removed it
back in 2006 for lack of interest and maintainers, and AFAIR you're
the first person to show any interest in reintroducing it since then.

I'm a bit concerned about reintroducing something that seems to have so
little usage, especially if the port is going to be as invasive as you
suggest:

* QNX lacks System V shared memory: I created "src/backend/port/posix_shmem.c" which replaces System V calls (shmget, shmat, shmdt, ...) with POSIX calls (shm_open, mmap, munmap, shm_unlink)

This isn't really acceptable for production usage; if it were, we'd have
done it already. The POSIX APIs lack any way to tell how many processes
are attached to a shmem segment, which is *necessary* functionality for
us (it's a critical part of the interlock against starting multiple
postmasters in one data directory).
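As background to Tom's point (my illustration, not part of the thread): the System V attach count he refers to is readable via shmctl(IPC_STAT), and POSIX shared memory has no equivalent query:

```c
#include <sys/ipc.h>
#include <sys/shm.h>

/* Illustration of the "nattch" interlock: System V reports how many
 * processes are attached to a segment, which shm_open/mmap cannot do. */
static long attach_count(int shmid)
{
    struct shmid_ds ds;
    if (shmctl(shmid, IPC_STAT, &ds) == -1)
        return -1;
    return (long) ds.shm_nattch;    /* kernel-maintained attachment count */
}
```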

* QNX lacks sigaction SA_RESTART: I modified "src/include/port.h" to define macros to retry system calls upon EINTR (open,read,write,...) when compiled on QNX

That's pretty scary too. For one thing, such macros would affect every
call site whether it's running with SA_RESTART or not. Do you really
need it? It looks to me like we just turn off HAVE_POSIX_SIGNALS if
you don't have SA_RESTART. Maybe that code has bit-rotted by now, but
it did work at one time.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3 Merlin Moncure
mmoncure@gmail.com
In reply to: Baker, Keith [OCDUS Non-J&J] (#1)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

On Fri, Jul 25, 2014 at 3:16 PM, Baker, Keith [OCDUS Non-J&J]
<KBaker9@its.jnj.com> wrote:

I propose that a QNX 6.5 port be introduced to PostgreSQL.

I am new to PostgreSQL development, so please bear with me.

I have made good progress (with 1 outstanding issue, details below):

· I created a QNX 6.5 port of PostgreSQL 9.3.4 which passes
regression tests.

· I merged my changes into 9.4beta2, and with a few minor changes,
it passes regression tests.

· QNX support states that QNX 6.5 SP1 binaries run on QNX 6.6
without modification, which I confirmed with a few quick tests.

Summary of changes required for PostgreSQL 9.3.4 on QNX 6.5:

· Typical changes required for any new port (template, configure.in,
dynloader, etc.)

· QNX lacks System V shared memory: I created
“src/backend/port/posix_shmem.c” which replaces System V calls (shmget,
shmat, shmdt, …) with POSIX calls (shm_open, mmap, munmap, shm_unlink)

· QNX lacks sigaction SA_RESTART: I modified “src/include/port.h” to
define macros to retry system calls upon EINTR (open,read,write,…) when
compiled on QNX

· A few files required addition of #include <sys/select.h> on QNX
(for fd_set).

Additional changes required for PostgreSQL9.4beta2 on QNX 6.5:

· “DSM” changes introduced in 9.4 (R. Haas) required that I make
minor updates to my new “posix_shmem.c” code.

· src\include\replication\logical.h: struct LogicalDecodingContext
field “write” interferes with my “write” retry macro. Renaming field
“write” to “do_write” solved this problem.

Outstanding Issue #1:

src/backend/commands/dbcommands.c :: createdb() complains when copying
template1 to template0 (apparently a locale issue)

“FATAL: 22023: new LC_CTYPE (C;collate:POSIX;ctype:POSIX) is incompatible
with the LC_CTYPE of the template database (POSIX;messages:C)”

I would appreciate help from an experienced PostgreSQL hacker to address
this.

I have temporarily disabled this check on QNX (I can live with the
assumption/limitation that template0 and template1 contain strictly ASCII).

I can work toward setting up a build farm member should this proposal be
accepted.

Maybe step #1 is to get a buildfarm member set up. Is there any
policy against unsupported environments in the buildfarm? (I hope not)

You're going to have to run it against a git repository containing
your custom patches. It's a long and uncertain road to getting a new
port (re-) accepted, but demonstrated commitment to support is a
necessary first step. It will also advertise support for the platform.

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4 Andres Freund
andres@anarazel.de
In reply to: Merlin Moncure (#3)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

On 2014-07-28 11:19:48 -0500, Merlin Moncure wrote:

Maybe step #1 is to get a buildfarm member set up. Is there any
policy against unsupported environments in the buildfarm? (I hope not)

You're going to have to run it against a git repository containing
your custom patches. It's a long and uncertain road to getting a new
port (re-) accepted, but demonstrated commitment to support is a
necessary first step. It will also advertise support for the platform.

I don't think a buildfarm animal that doesn't run the actual upstream
code is a good idea. That'll make it a lot harder to understand what's
going on when something breaks after a commit. It'd also require the
custom patches being rebased on top of $branch before every run...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5 Merlin Moncure
mmoncure@gmail.com
In reply to: Andres Freund (#4)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

On Mon, Jul 28, 2014 at 11:22 AM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-07-28 11:19:48 -0500, Merlin Moncure wrote:

Maybe step #1 is to get a buildfarm member set up. Is there any
policy against unsupported environments in the buildfarm? (I hope not)

You're going to have to run it against a git repository containing
your custom patches. It's a long and uncertain road to getting a new
port (re-) accepted, but demonstrated commitment to support is a
necessary first step. It will also advertise support for the platform.

I don't think a buildfarm animal that doesn't run the actual upstream
code is a good idea. That'll make it a lot harder to understand what's
going on when something breaks after a commit. It'd also require the
custom patches being rebased on top of $branch before every run...

hm. oh well. maybe if there was a separate page for custom builds
(basically, an unsupported section).

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6 Peter Geoghegan
In reply to: Merlin Moncure (#5)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

On Mon, Jul 28, 2014 at 9:41 AM, Merlin Moncure <mmoncure@gmail.com> wrote:

I don't think a buildfarm animal that doesn't run the actual upstream
code is a good idea. That'll make it a lot harder to understand what's
going on when something breaks after a commit. It'd also require the
custom patches being rebased on top of $branch before every run...

hm. oh well. maybe if there was a separate page for custom builds
(basically, an unsupported section).

I think that's a bad idea. The QNX OS seems to be mostly used in
safety-critical systems; it has a microkernel design. I think it would
be particularly bad to have iffy support for something like that.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#2)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

On Fri, Jul 25, 2014 at 6:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

* QNX lacks System V shared memory: I created "src/backend/port/posix_shmem.c" which replaces System V calls (shmget, shmat, shmdt, ...) with POSIX calls (shm_open, mmap, munmap, shm_unlink)

This isn't really acceptable for production usage; if it were, we'd have
done it already. The POSIX APIs lack any way to tell how many processes
are attached to a shmem segment, which is *necessary* functionality for
us (it's a critical part of the interlock against starting multiple
postmasters in one data directory).

I think it would be good to spend some energy figuring out what to do
about this. The Linux developers, for reasons I have not been able to
understand, appear to hate System V shared memory, and rumors have
circulated here that they would like to get rid of it altogether. And
quite apart from that, even using a few bytes of System V shared
memory is apparently inconvenient for people who run many copies of
PostgreSQL on the same machine or who run in environments where it's
not available, such as FreeBSD jails for which it hasn't been
specifically enabled.[1]

Now, in fairness, all of the alternative systems have their own share
of problems. POSIX shared memory isn't available everywhere, and the
anonymous mmap we're now using doesn't work in EXEC_BACKEND builds,
can't be used for dynamic shared memory, and apparently performs
poorly on BSD systems.[1] In spite of that, I think that having an
option to use POSIX shared memory would make a reasonable number of
PostgreSQL users happier than they are today; and maybe even attract a
few new ones.

In our last discussion on this topic, we talked about using file locks
as a substitute for nattch. You concluded that fcntl was totally
broken for this purpose because of the possibility of some other piece
of code accidentally opening and closing the lock file.[2] lockf
appears to have the same problem, but flock might not, at least on
some systems. The semantics as described in my copy of the Linux man
pages are that a child created by fork() inherits a copy of the
filehandle pointing to the same lock, and that the lock is released
when either ANY process with a copy of that filehandle makes an
explicit unlock request or ALL copies of the filehandle are closed.
That seems like it'd be OK for our purposes, though the Linux guys
seem to think the semantics might be different on other platforms, and
note that it won't work over NFS.
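The open-file-description semantics described above can be observed directly (Linux behavior; systems that emulate flock() with fcntl() differ, as noted):

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Demonstrates the flock() property discussed above: the lock belongs
 * to the open file description, so a second, independent open() of the
 * same file cannot take it, even within the same process.  Returns 1
 * if the second attempt was blocked, 0 otherwise, -1 on error. */
static int flock_conflict_demo(const char *path)
{
    int fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd == -1 || flock(fd, LOCK_EX | LOCK_NB) == -1)
        return -1;

    int fd2 = open(path, O_RDWR);
    int rc = flock(fd2, LOCK_EX | LOCK_NB);   /* denied by our own lock */
    int blocked = (rc == -1 && errno == EWOULDBLOCK);

    close(fd2);
    close(fd);
    unlink(path);
    return blocked;
}
```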

Another thing that strikes me is that lsof works on just about every
platform I've ever used, and it tells you who has got a certain file
open. Of course it has to use different methods to do that on
different platforms, but at least on Linux, /proc/self/fd/N is a
symlink to the file you've got open, and shared memory segments are
files in /dev/shm. So maybe at least on particular platforms where we
care enough, we could install operating-system-specific code to
provide an interlock using a mechanism of this type. Not sure if that
will fly, but it's a thought.

Yet another idea is to somehow use POSIX semaphores, which are
distinct from POSIX shared memory. semop() has a SEM_UNDO flag which
causes whatever operation you perform to reversed out a process exit.
So you could have each new postgres process increment the semaphore
value in such a way that it would be decremented on exit, although I'm
not sure how to avoid a race if the postmaster dies before a new child
has a chance to increment the semaphore.
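The SEM_UNDO mechanism sketched above, in code (illustrative helpers with hypothetical names; the postmaster-death race Robert mentions is not addressed here):

```c
#include <sys/ipc.h>
#include <sys/sem.h>

/* Each process "registers" by incrementing semaphore 0 with SEM_UNDO,
 * so the kernel reverses the increment when the process exits, even
 * after kill -9.  A would-be postmaster could then test whether the
 * count has returned to zero. */
static int sem_register_process(int semid)
{
    struct sembuf op = { 0, +1, SEM_UNDO };
    return semop(semid, &op, 1);
}

static int sem_live_count(int semid)
{
    return semctl(semid, 0, GETVAL);    /* current registration count */
}
```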

Finally, how about named pipes? Linux says that trying to open a
named pipe for write when there are no readers will return ENXIO, and
attempting to write to an already-open pipe with no remaining readers
will cause SIGPIPE. So: create a permanent named pipe in the data
directory that all PostgreSQL processes keep open. When the
postmaster starts, it opens the pipe for read, then for write, then
closes it for read. It then tries to write to the pipe. If this
fails to result in SIGPIPE, then somebody else has got the thing open;
so the new postmaster should die at once. But if it does get a SIGPIPE
then there are as of that moment no other readers.
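A literal coding of that straw-man protocol might look like the sketch below (which, as Tom observes later in the thread, is racy between two simultaneous starters):

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/stat.h>
#include <unistd.h>

/* Straw-man FIFO interlock as described above.  Returns 1 if no other
 * process appears to hold the pipe open for read, 0 if one does, and
 * -1 on error.  (Racy: two processes running this simultaneously can
 * both conclude they are alone.) */
static int fifo_interlock(const char *path)
{
    if (mkfifo(path, 0600) == -1 && errno != EEXIST)
        return -1;

    signal(SIGPIPE, SIG_IGN);   /* report broken pipe as EPIPE, not a signal */

    int rfd = open(path, O_RDONLY | O_NONBLOCK);    /* open for read ... */
    int wfd = open(path, O_WRONLY | O_NONBLOCK);    /* ... then for write ... */
    if (rfd == -1 || wfd == -1)
        return -1;
    close(rfd);                                     /* ... then drop the read end */

    /* If no one else holds a read end, the write fails with EPIPE. */
    int alone = (write(wfd, "x", 1) == -1 && errno == EPIPE);
    close(wfd);
    return alone;
}
```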

I'm not sure if any of this helps QNX or not, but maybe if we figure
out which of these mechanisms (or others) might be acceptable we can
cross-check that against what QNX supports.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1]: See comments on http://rhaas.blogspot.com/2012/06/absurd-shared-memory-limits.html
[2]: /messages/by-id/18958.1340764854@sss.pgh.pa.us

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#7)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Jul 25, 2014 at 6:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This isn't really acceptable for production usage; if it were, we'd have
done it already. The POSIX APIs lack any way to tell how many processes
are attached to a shmem segment, which is *necessary* functionality for
us (it's a critical part of the interlock against starting multiple
postmasters in one data directory).

I think it would be good to spend some energy figuring out what to do
about this.

Well, we've been around on this multiple times before, but if we have
any new ideas, sure ...

In our last discussion on this topic, we talked about using file locks
as a substitute for nattch. You concluded that fcntl was totally
broken for this purpose because of the possibility of some other piece
of code accidentally opening and closing the lock file.[2] lockf
appears to have the same problem, but flock might not, at least on
some systems.

My Linux man page for flock says

flock() does not lock files over NFS. Use fcntl(2) instead: that does
work over NFS, given a sufficiently recent version of Linux and a
server which supports locking.

which seems like a showstopper problem; we might try to tell people not to
put their databases on NFS, but they're not gonna listen. It also says

flock() and fcntl(2) locks have different semantics with respect to
forked processes and dup(2). On systems that implement flock() using
fcntl(2), the semantics of flock() will be different from those
described in this manual page.

which is pretty scary if it's accurate for any still-extant platforms;
we might think we're using flock and still get fcntl behavior. It's
also of concern that (AFAICS) flock is not in POSIX, which means we
can't even expect that platforms will agree on how it *should* behave.

I also noted that flock does not support atomic downgrade of exclusive
lock to shared lock, which seems like a problem for the lock inheritance
scheme sketched in
/messages/by-id/18162.1340761845@sss.pgh.pa.us
... but OTOH, it sounds like flock locks are not only inherited through
fork() but even preserved across exec(), which would mean that we don't
need that scheme for file lock inheritance, even with EXEC_BACKEND.
Still, it's not clear to me how we could put much faith in flock.

Finally, how about named pipes? Linux says that trying to open a
named pipe for write when there are no readers will return ENXIO, and
attempting to write to an already-open pipe with no remaining readers
will cause SIGPIPE. So: create a permanent named pipe in the data
directory that all PostgreSQL processes keep open. When the
postmaster starts, it opens the pipe for read, then for write, then
closes it for read. It then tries to write to the pipe. If this
fails to result in SIGPIPE, then somebody else has got the thing open;
so the new postmaster should die at once. But if it does get a SIGPIPE
then there are as of that moment no other readers.

Hm. That particular protocol is broken: two postmasters doing it at the
same time would both pass (because neither has it open for read at the
instant where they try to write). But we could possibly frob the idea
until it works. Bigger question is how portable is this behavior?
I see named pipes (fifos) in SUS v2, which is our usual baseline
assumption about what's portable across Unixen, so maybe it would work.
But does NFS support named pipes?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9 Baker, Keith [OCDUS Non-J&J]
In reply to: Tom Lane (#8)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

Thank you to all who have responded to this proposal.
PostgreSQL manages to meet all production requirements on Windows without System V shared memory, so I would think this can be achieved on QNX/Linux.

The old PostgreSQL QNX port ran on the very old "QNX4" (1991), so I understand why it would be of little value today.
Currently, QNX Neutrino 6.5 is well established (and QNX 6.6 is emerging), and that is where a PostgreSQL port would be well received.

I have attached my current work-in-progress patches for 9.3.4 and 9.4beta2 for the curious.
To minimize risk, I have been careful to ensure my changes take effect only in QNX builds; existing ports should see zero impact.
To minimize addition of new files, I have used the "linux" template rather than add qnx6 as a separate port/template.

All regression tests pass on my system, so while not perfect it is at least a reasonable start.
posix_shmem.c is still in need of some cleanup and mitigations to make it "production-strength".

If there are existing tests I can run to ensure the QNX port meets your criteria for robust failure handling in this area I would be happy to run them.
If not, perhaps someone can provide a quick list of failure modes to consider.
As-is:
- starting of a second postmaster fails with message 'FATAL: lock file "postmaster.pid" already exists'
- Kill -9 of postmaster followed by a pg_ctl start seems to go through recovery, although the original shared memory segments hang out in /dev/shmem until reboot (that could be better).

Thanks again and please let me know if I can be of any assistance.

Keith Baker

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, July 29, 2014 7:06 PM
To: Robert Haas
Cc: Baker, Keith [OCDUS Non-J&J]; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Jul 25, 2014 at 6:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This isn't really acceptable for production usage; if it were, we'd
have done it already. The POSIX APIs lack any way to tell how many
processes are attached to a shmem segment, which is *necessary*
functionality for us (it's a critical part of the interlock against
starting multiple postmasters in one data directory).

I think it would be good to spend some energy figuring out what to do
about this.

Well, we've been around on this multiple times before, but if we have any new ideas, sure ...

In our last discussion on this topic, we talked about using file locks
as a substitute for nattch. You concluded that fcntl was totally
broken for this purpose because of the possibility of some other piece
of code accidentally opening and closing the lock file.[2] lockf
appears to have the same problem, but flock might not, at least on
some systems.

My Linux man page for flock says

flock() does not lock files over NFS. Use fcntl(2) instead: that does
work over NFS, given a sufficiently recent version of Linux and a
server which supports locking.

which seems like a showstopper problem; we might try to tell people not to put their databases on NFS, but they're not gonna listen. It also says

flock() and fcntl(2) locks have different semantics with respect to
forked processes and dup(2). On systems that implement flock() using
fcntl(2), the semantics of flock() will be different from those
described in this manual page.

which is pretty scary if it's accurate for any still-extant platforms; we might think we're using flock and still get fcntl behavior. It's also of concern that (AFAICS) flock is not in POSIX, which means we can't even expect that platforms will agree on how it *should* behave.

I also noted that flock does not support atomic downgrade of exclusive lock to shared lock, which seems like a problem for the lock inheritance scheme sketched in /messages/by-id/18162.1340761845@sss.pgh.pa.us
... but OTOH, it sounds like flock locks are not only inherited through
fork() but even preserved across exec(), which would mean that we don't need that scheme for file lock inheritance, even with EXEC_BACKEND.
Still, it's not clear to me how we could put much faith in flock.

Finally, how about named pipes? Linux says that trying to open a named
pipe for write when there are no readers will return ENXIO, and
attempting to write to an already-open pipe with no remaining readers
will cause SIGPIPE. So: create a permanent named pipe in the data
directory that all PostgreSQL processes keep open. When the
postmaster starts, it opens the pipe for read, then for write, then
closes it for read. It then tries to write to the pipe. If this
fails to result in SIGPIPE, then somebody else has got the thing open;
so the new postmaster should die at once. But if does get a SIGPIPE
then there are as of that moment no other readers.

Hm. That particular protocol is broken: two postmasters doing it at the same time would both pass (because neither has it open for read at the instant where they try to write). But we could possibly frob the idea until it works. Bigger question is how portable is this behavior?
I see named pipes (fifos) in SUS v2, which is our usual baseline assumption about what's portable across Unixen, so maybe it would work.
But does NFS support named pipes?

regards, tom lane

Attachments:

pg_94beta2_qnx_20140729.patch (application/octet-stream) +660 −15
pg_934_qnx_20140729.patch (application/octet-stream) +653 −12
#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Baker, Keith [OCDUS Non-J&J] (#9)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

"Baker, Keith [OCDUS Non-J&J]" <KBaker9@its.jnj.com> writes:

If there are existing tests I can run to ensure the QNX port meets your criteria for robust failure handling in this area I would be happy to run them.
If not, perhaps someone can provide a quick list of failure modes to consider.
As-is:
- starting of a second postmaster fails with message 'FATAL: lock file "postmaster.pid" already exists'
- Kill -9 of postmaster followed by a pg_ctl start seems to go through recovery, although the original shared memory segments hang out in /dev/shmem until reboot (that could be better).

Unfortunately, that probably proves it's broken rather than that it works.
The behavior we need is that after kill -9'ing the postmaster, subsequent
postmaster start attempts *fail* until all the original postmaster's child
processes are gone. Otherwise you end up with two independent sets of
processes scribbling on the same files (and not sharing shmem either).
Kiss consistency goodbye ...

It's possible that all the children automatically exited, especially if
you had only background processes active; but if you had a live regular
session it would not exit just because the parent process died.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#8)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

On Tue, Jul 29, 2014 at 7:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think it would be good to spend some energy figuring out what to do
about this.

Well, we've been around on this multiple times before, but if we have
any new ideas, sure ...

Well, I tried to compile a more comprehensive list of possible
techniques in that email than I've seen anyone post before.

Still, it's not clear to me how we could put much faith in flock.

Yeah, after some more research, I think you're right. Apparently, as
recently as 2010, the Linux kernel transparently converted flock()
requests to fcntl()-style locks when running on NFS:

http://0pointer.de/blog/projects/locking.html

Maybe someday this will be reliable enough to use, but the odds of it
happening in the next decade don't look good.

Finally, how about named pipes? Linux says that trying to open a
named pipe for write when there are no readers will return ENXIO, and
attempting to write to an already-open pipe with no remaining readers
will cause SIGPIPE. So: create a permanent named pipe in the data
directory that all PostgreSQL processes keep open. When the
postmaster starts, it opens the pipe for read, then for write, then
closes it for read. It then tries to write to the pipe. If this
fails to result in SIGPIPE, then somebody else has got the thing open;
so the new postmaster should die at once. But if it does get a SIGPIPE
then there are as of that moment no other readers.

Hm. That particular protocol is broken: two postmasters doing it at the
same time would both pass (because neither has it open for read at the
instant where they try to write). But we could possibly frob the idea
until it works. Bigger question is how portable is this behavior?
I see named pipes (fifos) in SUS v2, which is our usual baseline
assumption about what's portable across Unixen, so maybe it would work.
But does NFS support named pipes?

Looks iffy, on a quick search. Sigh.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#11)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Jul 29, 2014 at 7:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm. That particular protocol is broken: two postmasters doing it at the
same time would both pass (because neither has it open for read at the
instant where they try to write). But we could possibly frob the idea
until it works. Bigger question is how portable is this behavior?
I see named pipes (fifos) in SUS v2, which is our usual baseline
assumption about what's portable across Unixen, so maybe it would work.
But does NFS support named pipes?

Looks iffy, on a quick search. Sigh.

I poked around, and it seems like a lot of the people who think it's flaky
are imagining that they should be able to use a named pipe on an NFS
server to pass data between two different machines. That doesn't work,
but it's not what we need, either. For communication between processes
on the same server, all that's needed is that the filesystem entry looks
like a pipe to the local kernel --- and that's been required NFS
functionality since RFC1813 (v3, in 1995).

So it seems like we could possibly go this route, assuming we can think
of a variant of your proposal that's race-condition-free. A disadvantage
compared to a true file lock is that it would not protect against people
trying to start postmasters from two different NFS client machines --- but
we don't have protection against that now. (Maybe we could do this *and*
do a regular file lock to offer some protection against that case, even if
it's not bulletproof?)

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13 Baker, Keith [OCDUS Non-J&J]
In reply to: Tom Lane (#12)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

Robert and Tom,

Please let me know if either of you are ready to experiment with the "named pipe" idea anytime soon.
If not, I would be happy to take a crack at it, but would appreciate your expert advice to start me down the right path (files/functions to update, pseudo-code, etc.).

-Keith Baker

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
owner@postgresql.org] On Behalf Of Tom Lane
Sent: Wednesday, July 30, 2014 11:02 AM
To: Robert Haas
Cc: Baker, Keith [OCDUS Non-J&J]; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Jul 29, 2014 at 7:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm. That particular protocol is broken: two postmasters doing it at
the same time would both pass (because neither has it open for read
at the instant where they try to write). But we could possibly frob
the idea until it works. Bigger question is how portable is this behavior?
I see named pipes (fifos) in SUS v2, which is our usual baseline
assumption about what's portable across Unixen, so maybe it would

work.

But does NFS support named pipes?

Looks iffy, on a quick search. Sigh.

I poked around, and it seems like a lot of the people who think it's flaky are
imagining that they should be able to use a named pipe on an NFS server to
pass data between two different machines. That doesn't work, but it's not
what we need, either. For communication between processes on the same
server, all that's needed is that the filesystem entry looks like a pipe to the
local kernel --- and that's been required NFS functionality since RFC1813 (v3,
in 1995).

So it seems like we could possibly go this route, assuming we can think of a
variant of your proposal that's race-condition-free. A disadvantage
compared to a true file lock is that it would not protect against people trying
to start postmasters from two different NFS client machines --- but we don't
have protection against that now. (Maybe we could do this *and* do a
regular file lock to offer some protection against that case, even if it's not
bulletproof?)

regards, tom lane


#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Baker, Keith [OCDUS Non-J&J] (#13)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

"Baker, Keith [OCDUS Non-J&J]" <KBaker9@its.jnj.com> writes:

Please let me know if either of you are ready to experiment with the "named pipe" idea anytime soon.
If not, I would be happy to take a crack at it, but would appreciate your expert advice to start me down the right path (files/functions to update, pseudo-code, etc.).

Well, before we start coding anything, the first order of business would
be to think of a bulletproof locking protocol using the available pipe
operations. Robert's straw man isn't that, but it seems like there might
be one in there.

regards, tom lane
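For concreteness, the straw-man protocol under discussion can be sketched with plain POSIX fifo semantics. This is an illustration in Python, not proposed patch code, and the function names are made up: opening a fifo with O_WRONLY|O_NONBLOCK fails with ENXIO when no process holds the read end open, so a live reader doubles as an "another postmaster exists" signal, and the kernel drops the read end on any kind of exit, kill -9 included. The gap between probe and claim is exactly the race Tom describes.

```python
import errno
import os

def interlock_probe(fifo_path):
    """Return True if some live process holds the read end of the fifo."""
    try:
        os.mkfifo(fifo_path)
    except FileExistsError:
        pass
    try:
        fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as e:
        if e.errno == errno.ENXIO:
            return False        # no reader anywhere: interlock is free
        raise
    os.close(fd)
    return True                 # a reader exists: treat as "postmaster alive"

def interlock_claim(fifo_path):
    # Hold the read end open for the life of the process; the kernel
    # closes it automatically on any exit, including kill -9.
    return os.open(fifo_path, os.O_RDONLY | os.O_NONBLOCK)

# The race: two processes can both run interlock_probe(), both see ENXIO,
# and both go on to claim -- probe and claim are not one atomic step.
```

A bulletproof variant would have to fold the probe and the claim into a single operation, which is the part the straw man is missing.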


#15Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#12)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

On Wed, Jul 30, 2014 at 11:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

So it seems like we could possibly go this route, assuming we can think
of a variant of your proposal that's race-condition-free. A disadvantage
compared to a true file lock is that it would not protect against people
trying to start postmasters from two different NFS client machines --- but
we don't have protection against that now. (Maybe we could do this *and*
do a regular file lock to offer some protection against that case, even if
it's not bulletproof?)

That's not a bad idea. By the way, it also wouldn't be too hard to
test at runtime whether or not flock() has first-close semantics. Not
that we'd want this exact design, but suppose you configure
shmem_interlock=flock in postgresql.conf. On startup, we test whether
flock is reliable, determine that it is, and proceed accordingly.
Now, you move your database onto an NFS volume and the semantics
change (because, hey, breaking userspace assumptions is fun) and try
to restart your database, and it says FATAL: flock() is broken.
Now you can either move the database back, or set shmem_interlock to
some other value.

Now maybe, as you say, it's best to use multiple locking protocols and
hope that at least one will catch whatever the dangerous situation is.
I'm just trying to point out that we need not blindly assume the
semantics we want are there (or that they are not); we can check.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
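Robert's runtime probe needs nothing beyond fork() and a scratch file. A hedged sketch follows (Python for brevity; a real implementation would live in C in the postmaster startup path, and the function name is invented here): hold an exclusive flock() through one descriptor, open and close a second descriptor on the same file, then have a child attempt a non-blocking lock. If the child gets the lock, the unrelated close released it (fcntl-style first-close semantics, as reported on some NFS mounts) and flock() cannot be trusted for the interlock.

```python
import fcntl
import os

def flock_survives_first_close(path):
    """True if an exclusive flock() held via fd1 survives an unrelated
    open()/close() of the same file, i.e. real BSD flock() semantics.
    False means first-close release (fcntl-style), the broken case."""
    fd1 = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    fcntl.flock(fd1, fcntl.LOCK_EX)
    os.close(os.open(path, os.O_RDWR))   # the close that must NOT unlock
    pid = os.fork()
    if pid == 0:                          # child: probe with its own fd
        fd = os.open(path, os.O_RDWR)
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            os._exit(1)                   # got the lock: it had been dropped
        except OSError:
            os._exit(0)                   # still locked: semantics are sane
    _, status = os.waitpid(pid, 0)
    os.close(fd1)
    return os.WEXITSTATUS(status) == 0
```

Run once at startup against the proposed lock file; if it returns False, refuse to rely on flock() and fall back to (or insist on) another interlock method.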


#16Baker, Keith [OCDUS Non-J&J]
KBaker9@its.jnj.com
In reply to: Robert Haas (#15)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

I will be on vacation until August 11; I look forward to any progress you are able to make.

Since ensuring there are not orphaned back-end processes is vital, could we add a check for getppid() == 1 ?
Patch below seemed to work on QNX (first client command after a kill -9 of postmaster resulted in exit of its associated server process).

	diff -rdup postgresql-9.3.5/src/backend/tcop/postgres.c postgresql-9.3.5_qnx/src/backend/tcop/postgres.c
	--- postgresql-9.3.5/src/backend/tcop/postgres.c	2014-07-21 15:10:42.000000000 -0400
	+++ postgresql-9.3.5_qnx/src/backend/tcop/postgres.c	2014-07-31 18:17:40.000000000 -0400
	@@ -3967,6 +3967,14 @@ PostgresMain(int argc, char *argv[],
	 		 */
	 		firstchar = ReadCommand(&input_message);
	+#ifndef WIN32
	+		/* Check for death of parent */
	+		if (getppid() == 1)
	+			ereport(FATAL,
	+					(errcode(ERRCODE_CRASH_SHUTDOWN),
	+					 errmsg("Parent server process has exited")));
	+#endif
	+
	 		/*
	 		 * (4) disable async signal conditions again.
	 		 */

Keith Baker


#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Baker, Keith [OCDUS Non-J&J] (#16)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

"Baker, Keith [OCDUS Non-J&J]" <KBaker9@its.jnj.com> writes:

Since ensuring there are not orphaned back-end processes is vital, could we add a check for getppid() == 1 ?

No. Or yeah, we could, but that patch would add no security worth
mentioning. For example, someone could launch a query that runs for
many minutes, and would have plenty of time to conflict with a
subsequently-started postmaster.

Even without that issue, there's no consensus that forcibly making
orphan backends exit would be a good thing. (Some people would
like to have such an option, but the key word in that sentence is
"option".)

regards, tom lane
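The polling weakness Tom identifies (a backend only checks getppid() between commands, so a long-running query never notices) goes away if parent death itself produces an I/O event. The standard trick, sketched below in Python purely for illustration (the helper names are invented; PostgreSQL's postmaster-death pipe used by WaitLatch is the C analogue), is a pipe whose write end is held only by the parent: every copy of the write end is closed by the kernel on any exit, kill -9 included, at which point the read end becomes readable (EOF) and can be watched with select() even while a child is otherwise blocked.

```python
import os
import select

def make_death_pipe():
    # Parent side: create before forking and keep only the write end.
    # Each child closes the write end and keeps the read end.
    return os.pipe()

def holder_is_dead(read_fd, timeout=0):
    """The read end becomes readable (EOF) the moment every copy of the
    write end is closed, which the kernel guarantees on any exit path.
    Because it is select()-able, there is no between-commands window."""
    readable, _, _ = select.select([read_fd], [], [], timeout)
    return bool(readable)
```

A child can add the read descriptor to whatever select()/poll() loop it already uses to wait for client input, so death of the parent interrupts even a backend that is idle in a long wait.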


#18Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#17)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

On Thu, Jul 31, 2014 at 9:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Baker, Keith [OCDUS Non-J&J]" <KBaker9@its.jnj.com> writes:

Since ensuring there are not orphaned back-end processes is vital, could we add a check for getppid() == 1 ?

No. Or yeah, we could, but that patch would add no security worth
mentioning. For example, someone could launch a query that runs for
many minutes, and would have plenty of time to conflict with a
subsequently-started postmaster.

True.

Even without that issue, there's no consensus that forcibly making
orphan backends exit would be a good thing. (Some people would
like to have such an option, but the key word in that sentence is
"option".)

I believe that multiple people have said multiple times that we should
change the behavior so that orphaned backends exit immediately; I
think you are the only one defending the current behavior. There are
several problems with the status quo:

1. Most seriously, once the postmaster is gone, there's nobody to
SIGQUIT remaining backends if somebody exits uncleanly. This means
that a backend running without a postmaster could be running in a
corrupt shared memory segment, which could lead to all sorts of
misbehavior, including possible data corruption.

2. Operationally, orphaned backends prevent the system from being
restarted. There's no easy, automatic way to kill them, so scripts
that automatically restart the database server if it exits don't work.
Even if letting the remaining backends continue to operate is good,
not being able to accept new connections is bad enough to completely
overshadow it. In many situations, killing them is a small price to
pay to get the system back on line.

3. Practically, the performance of any remaining backends will be
poor, because processes like the WAL writer and background writer
aren't going to be around to help any more. I think this will only
get worse over time; certainly, any future parallel query facility
won't work if the postmaster isn't around to fork new children. And
maybe we'll have other utility processes over time, too. But in any
case the situation isn't great right now, either.

Now, I don't say that any of this is a reason not to have a strong
shared memory interlock, but I'm quite unconvinced that the current
behavior should even be optional, let alone the default.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#19Josh Berkus
josh@agliodbs.com
In reply to: Baker, Keith [OCDUS Non-J&J] (#1)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

On 08/04/2014 07:54 AM, Robert Haas wrote:

1. Most seriously, once the postmaster is gone, there's nobody to
SIGQUIT remaining backends if somebody exits uncleanly. This means
that a backend running without a postmaster could be running in a
corrupt shared memory segment, which could lead to all sorts of
misbehavior, including possible data corruption.

I've seen this in the field.

2. Operationally, orphaned backends prevent the system from being
restarted. There's no easy, automatic way to kill them, so scripts
that automatically restart the database server if it exits don't work.

I've also seen this in the field.

Now, I don't say that any of this is a reason not to have a strong
shared memory interlock, but I'm quite unconvinced that the current
behavior should even be optional, let alone the default.

I always assumed that the current behavior existed because we *couldn't*
fix it, not because anybody wanted it.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


#20Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#18)
Re: Proposal to add a QNX 6.5 port to PostgreSQL

On 2014-08-04 10:54:25 -0400, Robert Haas wrote:

On Thu, Jul 31, 2014 at 9:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Even without that issue, there's no consensus that forcibly making
orphan backends exit would be a good thing. (Some people would
like to have such an option, but the key word in that sentence is
"option".)

I believe that multiple people have said multiple times that we should
change the behavior so that orphaned backends exit immediately; I
think you are the only one defending the current behavior. There are
several problems with the status quo:

+1. I think the current behaviour is a seriously bad idea.

Greetings,

Andres Freund

