HEADS UP: Win32/OS2/BeOS native ports

Started by Marc G. Fournier, over 23 years ago, 80 messages
#1Marc G. Fournier
scrappy@hub.org

Morning all ...

Just a heads up that over the next little while, I'm planning on
making a bunch of commits in order to work on making the code able to work
natively in the above environments ... my work will mostly focus on Win32
(since I have no OS2/BeOS installs), but a lot of the changes will be such
that they will benefit the others as well ...

The initial changes will be to just wrap all our shared memory
code, so that I can make use of Apache's libapr libraries *if* they are
installed ... if not, it will just fall back to "the current code" ...

#2mlw
markw@mohawksoft.com
In reply to: Marc G. Fournier (#1)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Marc G. Fournier" wrote:

Morning all ...

Just a heads up that over the next little while, I'm planning on
making a bunch of commits in order to work on making the code able to work
natively in the above environments ... my work will mostly focus on Win32
(since I have no OS2/BeOS installs), but alot of the changes will be such
that it will benefit the others as well ...

The initial changes will be to just wrapper all our shared memory
code, so that I can make use of Apache's libapr libraries *if* they are
installed ... if not, it will just fall back to "the current code" ...

If you want any assistance, drop me an email. I spent a long time (> decade)
doing Windows applications and drivers and know a good number of the cool
tricks.

#3Marc G. Fournier
scrappy@hub.org
In reply to: mlw (#2)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Fri, 3 May 2002, mlw wrote:

"Marc G. Fournier" wrote:

Morning all ...

Just a heads up that over the next little while, I'm planning on
making a bunch of commits in order to work on making the code able to work
natively in the above environments ... my work will mostly focus on Win32
(since I have no OS2/BeOS installs), but alot of the changes will be such
that it will benefit the others as well ...

The initial changes will be to just wrapper all our shared memory
code, so that I can make use of Apache's libapr libraries *if* they are
installed ... if not, it will just fall back to "the current code" ...

If you want any assistance, drop me an email. I spent a long time (> decade)
doing Windows applications and drivers and know a good number of the cool
tricks.

hrmmmm ... do you have a working Windows development environment? I'm
running WinXP at home, but don't have any of the compilers or anything
yet, so all my work for the first part is going to be done under Unix ...

but someone that knows something about building makefiles for Windows, and
compiling under it, will definitely be a major asset ;)

#4Travis Hoyt
thoyt@npc.net
In reply to: Marc G. Fournier (#3)
Re: HEADS UP: Win32/OS2/BeOS native ports

Will there really be a need for BeOS development with the sale of Be to
Palm? Is BeOS even still available? It might not be worth the time to
develop for BeOS until you see what Palm decides to do with the software.

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Marc G. Fournier
Sent: Friday, May 03, 2002 9:48 AM
To: mlw
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports

On Fri, 3 May 2002, mlw wrote:

"Marc G. Fournier" wrote:

Morning all ...

Just a heads up that over the next little while, I'm planning on
making a bunch of commits in order to work on making the code able to work
natively in the above environments ... my work will mostly focus on Win32
(since I have no OS2/BeOS installs), but alot of the changes will be such
that it will benefit the others as well ...

The initial changes will be to just wrapper all our shared memory
code, so that I can make use of Apache's libapr libraries *if* they are
installed ... if not, it will just fall back to "the current code" ...

If you want any assistance, drop me an email. I spent a long time (> decade)
doing Windows applications and drivers and know a good number of the cool
tricks.

hrmmmm ... do you have a working Windows development environment? I'm
running WinXP at home, but don't have any of the compilers or anything
yet, so all my work for the first part is going to be done under Unix ...

but someone that knows something about building makefiles for Windows, and
compiling under it, will definitely be a major asset ;)

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

#5Marc G. Fournier
scrappy@hub.org
In reply to: Travis Hoyt (#4)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Fri, 3 May 2002, Travis Hoyt wrote:

Will there really be a need for a BeOS development with the sale of Be to
Palm? Is BeOS even still available? It might not be worth the time to
develop for BeOS until you see what Palm decides to do with the software.

Note that the changes I'm making are to make use of what is available
through the libapr API that the Apache group has developed ... so, as long
as they have the hooks in for BeOS, we will ... that doesn't mean PgSQL will
actually have makefiles for it, or compile under it, unless someone
*with* BeOS steps forward, but a lot of the core functionality that has
held back native ports should work ...

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marc G. Fournier (#1)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Marc G. Fournier" <scrappy@hub.org> writes:

The initial changes will be to just wrapper all our shared memory
code, so that I can make use of Apache's libapr libraries *if* they are
installed ... if not, it will just fall back to "the current code" ...

I think we should redesign the shared memory API (and even more so the
semaphore API), not just put a wrapper layer on it. A lot of the
internal API is unnecessarily dependent on SysV shmem/sem behavior.

Note however that there are some things you will break if you are not
very careful. We are depending on shmem/sem behavior to catch a number
of multiple-postmaster conflict situations. If there's not a more or
less SysV-ish kernel underneath us, those situations will have to be
rethought and some other interlock invented.

In short, I want to see a design review first, not a bunch of
off-the-cuff commits.

regards, tom lane

#7Justin Clift
justin@postgresql.org
In reply to: Marc G. Fournier (#3)
Re: HEADS UP: Win32/OS2/BeOS native ports

Hi Marc,

How about using Dev-C++?

It's a Windows IDE with a GCC backend, and has a nice rep (and a Linux
port):

http://sourceforge.net/projects/dev-cpp/

It's always in SF.net's "Top 10" most worked on projects too, with
roughly 7,000 downloads per day. It can generate MinGW code too.

:-)

Regards and best wishes,

Justin Clift

--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi

#8Marc G. Fournier
scrappy@hub.org
In reply to: Tom Lane (#6)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Fri, 3 May 2002, Tom Lane wrote:

"Marc G. Fournier" <scrappy@hub.org> writes:

The initial changes will be to just wrapper all our shared memory
code, so that I can make use of Apache's libapr libraries *if* they are
installed ... if not, it will just fall back to "the current code" ...

I think we should redesign the shared memory API (and even more so the
semaphore API), not just put a wrapper layer on it. A lot of the
internal API is unnecessarily dependent on SysV shmem/sem behavior.

Note however that there are some things you will break if you are not
very careful. We are depending on shmem/sem behavior to catch a number
of multiple-postmaster conflict situations. If there's not a more or
less SysV-ish kernel underneath us, those situations will have to be
rethought and some other interlock invented.

In short, I want to see a design review first, not a bunch of
off-the-cuff commits.

All I'm planning on doing is replacing the appropriate shm_* functions with
pg_shm_* functions ... if !(libapr), all those pg_shm_* functions will
have in them is the original call we've always used ... there will even be
a --disable-libapr configure option so that if someone already has Apache2
installed, but doesn't wanna use libapr for PgSQL, they don't have to ...

Basically, all I'm looking at is allowing PgSQL to use a different library
for its shared memory calls than the standard one, nothing else ...

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marc G. Fournier (#8)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Marc G. Fournier" <scrappy@hub.org> writes:

All I'm planning on doing is changing the appropriate shm_* functions iwth
pg_shm_* functions ... if !(libapr), all those pg_shm_* functions will
have in them is the original call we've always used ... there will even be
a --disable-libapr configure option so that if someone already has Apache2
installed, but doesn't wanna use libapr for PgSQL, they don't have to ...

Basically, all I'm looking at is allowing PgSQL to use a different library
for its shared memory calls then the standard one, nothing else ...

Oh. I guess my next question is how closely that Apache library
emulates the SysV shmem semantics. In particular, can you reliably
tell how many processes are attached to a shmem block? (Cf
SharedMemoryIsInUse() in storage/ipc/ipc.c) Without that feature we
have an interlock problem.

regards, tom lane
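
For context, SysV exposes the attach count Tom refers to through
shmctl(IPC_STAT), in the shm_nattch field of struct shmid_ds. A minimal
sketch of that query follows. shm_attach_count() and the demo function are
hypothetical names used for illustration only; this is not the
SharedMemoryIsInUse() code in storage/ipc/ipc.c, and the private segment is
an assumption made to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper: return the number of processes currently attached
 * to a SysV shared memory segment, or -1 if the segment cannot be examined.
 */
long
shm_attach_count(int shmid)
{
    struct shmid_ds info;

    if (shmctl(shmid, IPC_STAT, &info) < 0)
        return -1;
    return (long) info.shm_nattch;
}

/*
 * Demo: create a private segment, attach and detach it, and sample the
 * attach count at each step.  Returns 0 if the counts were 0, 1, 0.
 */
int
shm_attach_count_demo(void)
{
    int   shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    void *addr;
    long  before, during, after;

    assert(shmid >= 0);
    before = shm_attach_count(shmid);

    addr = shmat(shmid, NULL, 0);
    assert(addr != (void *) -1);
    during = shm_attach_count(shmid);

    shmdt(addr);
    after = shm_attach_count(shmid);

    /* Remove the segment so nothing is left behind. */
    shmctl(shmid, IPC_RMID, NULL);

    return (before == 0 && during == 1 && after == 0) ? 0 : 1;
}
```

A replacement library would need to offer an equivalent of this count (or
some other interlock) for the multiple-postmaster check to keep working.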

#10Marc G. Fournier
scrappy@hub.org
In reply to: Tom Lane (#9)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Fri, 3 May 2002, Tom Lane wrote:

"Marc G. Fournier" <scrappy@hub.org> writes:

All I'm planning on doing is changing the appropriate shm_* functions iwth
pg_shm_* functions ... if !(libapr), all those pg_shm_* functions will
have in them is the original call we've always used ... there will even be
a --disable-libapr configure option so that if someone already has Apache2
installed, but doesn't wanna use libapr for PgSQL, they don't have to ...

Basically, all I'm looking at is allowing PgSQL to use a different library
for its shared memory calls then the standard one, nothing else ...

Oh. I guess my next question is how closely that Apache library
emulates the SysV shmem semantics. In particular, can you reliably
tell how many processes are attached to a shmem block? (Cf
SharedMemoryIsInUse() in storage/ipc/ipc.c) Without that feature we
have an interlock problem.

Will investigate this ... my immediate goal is to just get it so that an
alternate library can be used ... default behaviour will be to stick with
our current function calls ... to use libapr, you will/would have to use a
configure option for it (sorry, meant --enable above, not --disable) ...

The only '#ifdef's I'm planning on for this will be in a central shmem.*
file(s), so there isn't going to be a string of those all over the place
or anything stupid like that ...
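
A rough sketch of what such a central wrapper might look like, with the
only #ifdef confined to one file. Everything named here is an assumption
for illustration: the USE_LIBAPR symbol, the PGShmem handle type, and the
pg_shm_create()/pg_shm_destroy() names are hypothetical, and the default
branch uses a private SysV segment rather than a keyed one to stay short.
The libapr branch is not compiled unless USE_LIBAPR is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef USE_LIBAPR               /* hypothetical symbol set by --enable-libapr */
#include <apr_general.h>
#include <apr_pools.h>
#include <apr_shm.h>
#else
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#endif

/* Hypothetical handle; callers never look at the backend-specific fields. */
typedef struct PGShmem
{
#ifdef USE_LIBAPR
    apr_pool_t *pool;
    apr_shm_t  *seg;
#else
    int         shmid;
#endif
    void       *addr;           /* base address of the attached segment */
    size_t      size;
} PGShmem;

/* Create and attach a segment.  Returns 0 on success, -1 on failure. */
int
pg_shm_create(PGShmem *shm, size_t size)
{
    assert(shm != NULL && size > 0);
    memset(shm, 0, sizeof(*shm));
    shm->size = size;
#ifdef USE_LIBAPR
    if (apr_initialize() != APR_SUCCESS)
        return -1;
    if (apr_pool_create(&shm->pool, NULL) != APR_SUCCESS)
        return -1;
    /* A NULL filename asks APR for anonymous shared memory. */
    if (apr_shm_create(&shm->seg, size, NULL, shm->pool) != APR_SUCCESS)
        return -1;
    shm->addr = apr_shm_baseaddr_get(shm->seg);
#else
    /* Default: the same SysV calls as before, just behind the wrapper. */
    shm->shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (shm->shmid < 0)
        return -1;
    shm->addr = shmat(shm->shmid, NULL, 0);
    if (shm->addr == (void *) -1)
    {
        shmctl(shm->shmid, IPC_RMID, NULL);
        return -1;
    }
#endif
    return 0;
}

/* Detach and destroy the segment. */
void
pg_shm_destroy(PGShmem *shm)
{
#ifdef USE_LIBAPR
    apr_shm_destroy(shm->seg);
    apr_pool_destroy(shm->pool);
#else
    shmdt(shm->addr);
    shmctl(shm->shmid, IPC_RMID, NULL);
#endif
    shm->addr = NULL;
}
```

Note that this only swaps the library underneath; it does not answer the
attach-count question raised above.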

#11mlw
markw@mohawksoft.com
In reply to: Marc G. Fournier (#8)
Re: HEADS UP: Win32/OS2/BeOS native ports

Tom Lane wrote:

"Marc G. Fournier" <scrappy@hub.org> writes:

All I'm planning on doing is changing the appropriate shm_* functions iwth
pg_shm_* functions ... if !(libapr), all those pg_shm_* functions will
have in them is the original call we've always used ... there will even be
a --disable-libapr configure option so that if someone already has Apache2
installed, but doesn't wanna use libapr for PgSQL, they don't have to ...

Basically, all I'm looking at is allowing PgSQL to use a different library
for its shared memory calls then the standard one, nothing else ...

Oh. I guess my next question is how closely that Apache library
emulates the SysV shmem semantics. In particular, can you reliably
tell how many processes are attached to a shmem block? (Cf
SharedMemoryIsInUse() in storage/ipc/ipc.c) Without that feature we
have an interlock problem.

I am not familiar with the Apache code, but I see no reason why all the
features in SysV SHM should not be implementable in a Windows module. IMHO
that's what should be done.

#12mlw
markw@mohawksoft.com
In reply to: Marc G. Fournier (#10)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Marc G. Fournier" wrote:

On Fri, 3 May 2002, Tom Lane wrote:

"Marc G. Fournier" <scrappy@hub.org> writes:

All I'm planning on doing is changing the appropriate shm_* functions iwth
pg_shm_* functions ... if !(libapr), all those pg_shm_* functions will
have in them is the original call we've always used ... there will even be
a --disable-libapr configure option so that if someone already has Apache2
installed, but doesn't wanna use libapr for PgSQL, they don't have to ...

Basically, all I'm looking at is allowing PgSQL to use a different library
for its shared memory calls then the standard one, nothing else ...

Oh. I guess my next question is how closely that Apache library
emulates the SysV shmem semantics. In particular, can you reliably
tell how many processes are attached to a shmem block? (Cf
SharedMemoryIsInUse() in storage/ipc/ipc.c) Without that feature we
have an interlock problem.

Will investigate this ... my immediate goal is to just get it so that an
alternate library can be used ... default behaviour will be to stick with
our current function calls ... to use libapr, you will/would have to use a
configure option for it (sorry, meant --enable above, not --disable) ...

The only '#ifdef's I'm planning on for this will be in a central shmem.*
file(s), so there isn't going to be a string of those all over the place
or anything stupid like that ...

I think that you should create a verbatim implementation of the SysV shared
memory API in native Win32. It may have to be a pgsysvshm.dll or something like
it, but I think it is the best possible approach.

Let me look at it, I may be able to have something pretty quick.

#13mlw
markw@mohawksoft.com
In reply to: Marc G. Fournier (#10)
Re: HEADS UP: Win32/OS2/BeOS native ports

Tom Lane wrote:

mlw <markw@mohawksoft.com> writes:

I think that you should create a verbatim implementation of the SysV
shared memory API in native Win32. It may have to be a pgsysvshm.dll
or something like it, but I think it is the best possible approach.

Let me look at it, I may be able to have something pretty quick.

The notion of redesigning the internal API shouldn't be forgotten,
though. I'm not so dissatisfied with the shmem API (mainly because
it's only relevant at startup; once we've created and attached the
shmem segment, we're done worrying about it). But the SysV semaphore
API is really kind of ugly, and the ugliness doesn't buy anything except
porting difficulty. Moreover, putting a cleaner API layer there would
make it easier to experiment with cheaper semaphore primitives, such
as POSIX mutexes.

There was a thread last fall concerning redesigning that code --- I've
forgotten the guy's name, but IIRC he wanted to make a port to QNX6,
and the sema code was getting in the way. We put the work on hold
because we were getting close to 7.2 release (or thought we were,
anyway) but the project ought to be taken up again.

I will commit to writing a Windows version of whatever shm/semaphore/mutex
code you guys specify.

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: mlw (#12)
Re: HEADS UP: Win32/OS2/BeOS native ports

mlw <markw@mohawksoft.com> writes:

I think that you should create a verbatim implementation of the SysV
shared memory API in native Win32. It may have to be a pgsysvshm.dll
or something like it, but I think it is the best possible approach.

Let me look at it, I may be able to have something pretty quick.

The notion of redesigning the internal API shouldn't be forgotten,
though. I'm not so dissatisfied with the shmem API (mainly because
it's only relevant at startup; once we've created and attached the
shmem segment, we're done worrying about it). But the SysV semaphore
API is really kind of ugly, and the ugliness doesn't buy anything except
porting difficulty. Moreover, putting a cleaner API layer there would
make it easier to experiment with cheaper semaphore primitives, such
as POSIX mutexes.

There was a thread last fall concerning redesigning that code --- I've
forgotten the guy's name, but IIRC he wanted to make a port to QNX6,
and the sema code was getting in the way. We put the work on hold
because we were getting close to 7.2 release (or thought we were,
anyway) but the project ought to be taken up again.

regards, tom lane

#15mlw
markw@mohawksoft.com
In reply to: Marc G. Fournier (#10)
Re: HEADS UP: Win32/OS2/BeOS native ports

sysv shm/sem

I am writing a Win32 DLL implementation of:

int semget(key_t key, int nsems, int semflg);
int semctl(int semid, int semnum, int cmd, union semun arg);
int semop(int semid, struct sembuf * sops, unsigned nsops);
int shmctl(int shmid, int cmd, struct shmid_ds *buf);
int shmget(key_t key, int size, int shmflg);
void * shmat(int shmid, const void *shmaddr, int shmflg);
int shmdt(const void *shmaddr);

I will donate it to PostgreSQL.

UNIX permissions will be ignored, i.e. uid/gid will be 0
Do you see any need for the msgxxx calls?
Is the function ipc() ever used?

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: mlw (#15)
Re: HEADS UP: Win32/OS2/BeOS native ports

mlw <markw@mohawksoft.com> writes:

UNIX permissions will be ignored, i.e. uid/gid will be 0

Win32 has no security anyway, right? ;-)

Do you see any need for the msgxxx calls?
Is the function ipc() ever used?

Nope, and nope.

regards, tom lane

#17Igor Kovalenko
Igor.Kovalenko@motorola.com
In reply to: Marc G. Fournier (#10)
Re: HEADS UP: Win32/OS2/BeOS native ports

mlw <markw@mohawksoft.com> writes:

I think that you should create a verbatim implementation of the SysV
shared memory API in native Win32. It may have to be a pgsysvshm.dll
or something like it, but I think it is the best possible approach.

Let me look at it, I may be able to have something pretty quick.

The notion of redesigning the internal API shouldn't be forgotten,
though. I'm not so dissatisfied with the shmem API (mainly because
it's only relevant at startup; once we've created and attached the
shmem segment, we're done worrying about it). But the SysV semaphore
API is really kind of ugly, and the ugliness doesn't buy anything except
porting difficulty. Moreover, putting a cleaner API layer there would
make it easier to experiment with cheaper semaphore primitives, such
as POSIX mutexes.

There was a thread last fall concerning redesigning that code --- I've
forgotten the guy's name, but IIRC he wanted to make a port to QNX6,

That would be me.

and the sema code was getting in the way. We put the work on hold
because we were getting close to 7.2 release (or thought we were,
anyway) but the project ought to be taken up again.

Yes, I intend to give it another spin soon. I think it is a bad idea to
impose SysV ugliness on systems which have better solutions. The main problem
with SysV primitives is that they are 'sticky' (i.e., not cleaned up by the
system if the process dies/exits). So Postgres has to deal with issues like
discovering leftovers, finding unused IPC keys, etc. It is inelegant and
takes up a lot of code. POSIX primitives are anonymous and cleaned up
automatically. So you just say 'give me a semaphore' and you get it, nothing
gets in your way.

Performance of POSIX mutexes and semaphores (on platforms where they are
implemented properly) is also better than SysV semaphores. Unfortunately
some systems have rather lame POSIX support, for example semaphores and
mutexes can't be shared across processes on Linux. That's basically the
reason why people keep sticking to SysV.

What really needs to be done is a new abstraction layer which would cover the
SysV API, POSIX and whatever native APIs are better for BeOS/OS2/Win32. I
almost did it last time...

-- igor

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: mlw (#15)
Re: HEADS UP: Win32/OS2/BeOS native ports

mlw <markw@mohawksoft.com> writes:

I am writing a Win32 DLL implementation of :

int semget(key_t key, int nsems, int semflg);
int semctl(int semid, int semnum, int cmd, union semun arg);
int semop(int semid, struct sembuf * sops, unsigned nsops);

Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)

After looking over the uses of these functions, I believe that we could
easily develop a non-SysV-centric internal API. Here's a first cut:

1. Define a struct type PGSemaphore that has implementation-specific
contents (the generic code will never look inside it). Operations on
semaphores will take "PGSemaphore *" arguments. When implementing atop
SysV semaphores, PGSemaphore will contain two fields, the semaphore id
and semaphore number. In other cases the contents could be different.

2. All PGSemaphore structs will be physically stored in shared memory.
This doesn't matter for SysV support, where the id/number are constants
anyway; but it will allow implementations based on mutexes.

3. The operations needed are

* Reserve semaphores. This will be told the number of semaphores
needed. On SysV it will do the necessary semget()s, but on some
implementations it might be a no-op. This should also be prepared
to clean up after a failed postmaster, if it is possible for sema
resources to outlive the creating postmaster.

* Create semaphore. Given a pointer to an uninitialized PGSemaphore
struct, initialize it to a new semaphore with count 1. (On SysV this
would hand out the individual semas previously allocated by Reserve.)
Note that this is not responsible for allocating the memory occupied
by the PGSemaphore struct --- I envision the structs being part of
larger objects such as PROC structures.

* Release semaphores. Release all resources allocated by previous
Reserve and Create operations. This is called when shutting down
or when resetting shared memory after a backend crash.

* Reset semaphore. Reset an existing PGSemaphore to count zero.

* Lock semaphore. Identical to current IpcSemaphoreLock(), except
parameter is a PGSemaphore *. See code of that routine for detailed
semantics.

* Unlock semaphore. Identical to current IpcSemaphoreUnlock(), except
parameter is a PGSemaphore *.

* Conditional lock semaphore. Identical to current
IpcSemaphoreTryLock(), except parameter is a PGSemaphore *.

Reserve/create/release would all be called in the postmaster process,
so they could communicate via malloc'd private memory (eg, an array
of semaphore IDs would be needed in the SysV case). The remaining
operations would be invokable by any backend.

Comments?

I'd be willing to work on refactoring the existing SysV-based code
to meet this spec.

regards, tom lane
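
To make the proposal above concrete, here is a minimal sketch of the seven
operations implemented atop SysV semaphores. The function names, the
single-set bookkeeping, the return conventions, and the retry on EINTR are
all assumptions made for illustration; the proposal itself only fixes the
operations and the two-field struct, and defers the exact locking semantics
to the existing IpcSemaphoreLock() code.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Own union for semctl(); some platforms do not declare union semun. */
union pg_semun { int val; struct semid_ds *buf; unsigned short *array; };

/* SysV flavour of the opaque struct: semaphore set id plus index. */
typedef struct PGSemaphore
{
    int semId;
    int semNum;
} PGSemaphore;

/* Postmaster-private bookkeeping; one set only, to keep the sketch short. */
static int reservedSetId = -1;
static int reservedCount = 0;
static int nextFree = 0;

/* Reserve: allocate nsems semaphores up front.  0 on success, -1 on error. */
int
PGReserveSemaphores(int nsems)
{
    assert(nsems > 0);
    reservedSetId = semget(IPC_PRIVATE, nsems, IPC_CREAT | 0600);
    if (reservedSetId < 0)
        return -1;
    reservedCount = nsems;
    nextFree = 0;
    return 0;
}

/* Create: hand out the next reserved semaphore, initialized to count 1. */
int
PGSemaphoreCreate(PGSemaphore *sema)
{
    union pg_semun arg;

    if (nextFree >= reservedCount)
        return -1;
    sema->semId = reservedSetId;
    sema->semNum = nextFree++;
    arg.val = 1;
    return semctl(sema->semId, sema->semNum, SETVAL, arg);
}

/* Release: free everything obtained by Reserve and Create. */
void
PGReleaseSemaphores(void)
{
    if (reservedSetId >= 0)
        semctl(reservedSetId, 0, IPC_RMID);
    reservedSetId = -1;
    reservedCount = nextFree = 0;
}

/* Reset: force the count back to zero. */
int
PGSemaphoreReset(PGSemaphore *sema)
{
    union pg_semun arg;

    arg.val = 0;
    return semctl(sema->semId, sema->semNum, SETVAL, arg);
}

/* Shared helper: apply delta, retrying if interrupted by a signal. */
static int
sema_op(PGSemaphore *sema, short delta, short flags)
{
    struct sembuf op;

    op.sem_num = (unsigned short) sema->semNum;
    op.sem_op = delta;
    op.sem_flg = flags;
    for (;;)
    {
        if (semop(sema->semId, &op, 1) == 0)
            return 0;
        if (errno != EINTR)
            return -1;
    }
}

int PGSemaphoreLock(PGSemaphore *sema)   { return sema_op(sema, -1, 0); }
int PGSemaphoreUnlock(PGSemaphore *sema) { return sema_op(sema, 1, 0); }

/* Conditional lock: 1 if acquired, 0 if it would block, -1 on error. */
int
PGSemaphoreTryLock(PGSemaphore *sema)
{
    if (sema_op(sema, -1, IPC_NOWAIT) == 0)
        return 1;
    return (errno == EAGAIN) ? 0 : -1;
}
```

Because callers only ever pass a PGSemaphore pointer, a mutex-based or
Win32 implementation could change the struct contents without touching
the calling code, which is the point of the proposal.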

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Igor Kovalenko (#17)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:

What really need to be done is new abstraction layer which would cover SysV
API, POSIX and whatever native APIs are better for BeOS/OS2/Win32. I almost
did it last time...

Yes. I just sent off a proposal for a cleaner semaphore API --- please
comment on it.

My inclination is to stick with the SysV API for shared memory, however.
The "stickiness" is actually not a bad thing for us in the shared memory
case, because it allows a new postmaster to detect the situation where
old backends are still running: it can see that there is an old shmem
segment still present with attached processes. Without that, we have no
good defense against the scenario where an old postmaster dumped core
leaving backends still running. The backends are fine as long as they
are left to finish out their operations, or even killed with whatever
degree of prejudice the admin wants. But what we must *not* do is allow
a new postmaster to start while the old backends are still running;
that would mean two sets of backends running without contact with each
other, which would be fatal for data integrity. The SysV API lets us
detect that case, but I don't see any equally good way to do it if we
are using anonymous shared memory.

regards, tom lane

#20mlw
markw@mohawksoft.com
In reply to: Marc G. Fournier (#10)
Re: HEADS UP: Win32/OS2/BeOS native ports

Like I told Marc, I don't care. You spec out what you want and I'll write it
for Windows.

That being said, a SysV IPC interface for native Windows would be kind of cool
to have.

#21Igor Kovalenko
Igor.Kovalenko@motorola.com
In reply to: Marc G. Fournier (#10)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:

What really need to be done is new abstraction layer which would cover SysV
API, POSIX and whatever native APIs are better for BeOS/OS2/Win32. I almost
did it last time...

Yes. I just sent off a proposal for a cleaner semaphore API --- please
comment on it.

I will look. I remember from my last attempt that it actually did not
involve a lot of changes in your existing abstraction layer (which already
exists, just being SysV-centric). I believe only one function prototype had
to be changed... Your proposal sounds like more changes will be needed...

My inclination is to stick with the SysV API for shared memory, however.
The "stickiness" is actually not a bad thing for us in the shared memory
case, because it allows a new postmaster to detect the situation where
old backends are still running: it can see that there is an old shmem
segment still present with attached processes. Without that, we have no
good defense against the scenario where an old postmaster dumped core
leaving backends still running. The backends are fine as long as they
are left to finish out their operations, or even killed with whatever
degree of prejudice the admin wants. But what we must *not* do is allow
a new postmaster to start while the old backends are still running;
that would mean two sets of backends running without contact with each
other, which would be fatal for data integrity. The SysV API lets us
detect that case, but I don't see any equally good way to do it if we
are using anonymous shared memory.

It does not have to be anonymous. POSIX also defines the shm_open() API (same
arguments as open()), which creates a named object in whatever location
corresponds to shared memory storage on that platform (the object is then grown
to the needed size by ftruncate() and the fd is then passed to mmap()). The
object will exist in the name space and can be detected by subsequent calls to
shm_open() with the same name. It is not really different from doing open(), but
more portable (mmap() on regular files may not be supported).

I suggest we do an IPC abstraction which would cover shared memory as well as
semaphores, otherwise it will be only half of a solution: platforms without
the SysV API would still have to emulate SysV shared memory.

-- igor
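
The shm_open()/ftruncate()/mmap() sequence described above can be sketched roughly as follows. The function name demo_shm_create is hypothetical, and using O_EXCL to detect an existing segment is an assumption made for illustration, not something proposed in the thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create and map a named shared memory object, failing if the name is
 * already taken.  Returns the mapped address, or NULL with errno set
 * (EEXIST when an object with that name already exists).
 */
void *
demo_shm_create(const char *name, size_t size)
{
    void *addr;
    int   saved;
    int   fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);

    if (fd < 0)
        return NULL;

    /* A new object has size zero; grow it to the requested size. */
    if (ftruncate(fd, (off_t) size) < 0)
    {
        saved = errno;
        close(fd);
        shm_unlink(name);
        errno = saved;
        return NULL;
    }

    addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    saved = errno;
    close(fd);                  /* the mapping stays valid after close */
    if (addr == MAP_FAILED)
    {
        shm_unlink(name);
        errno = saved;
        return NULL;
    }
    return addr;
}
```

Note that a second creation attempt failing with EEXIST only shows that the name exists; by itself it says nothing about whether any process still has the object mapped.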

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Igor Kovalenko (#21)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:

It does not have to be anonymous. POSIX also defines the shm_open() API (same
arguments as open()), which creates a named object in whatever location
corresponds to shared memory storage on that platform (the object is then grown
to the needed size by ftruncate() and the fd is then passed to mmap()). The
object will exist in the name space and can be detected by subsequent calls to
shm_open() with the same name. It is not really different from doing open(), but
more portable (mmap() on regular files may not be supported).

Yes, but can you detect whether other processes have the same file open?

regards, tom lane
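
For comparison, the SysV facility being relied on here is the attach count that shmctl() reports for a segment. A minimal sketch, with a hypothetical helper name (demo_shm_attach_count) and no claim to match the actual postmaster code:

```c
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Return the number of processes currently attached to the SysV shared
 * memory segment 'shmid', or -1 if the segment cannot be examined.
 */
long
demo_shm_attach_count(int shmid)
{
    struct shmid_ds ds;

    if (shmctl(shmid, IPC_STAT, &ds) < 0)
        return -1;
    return (long) ds.shm_nattch;
}
```

A new postmaster that finds an old segment with a nonzero count can conclude that some process is still attached to it, which is the detection discussed in this thread.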

#23Matthew Kirkwood
matthew@hairy.beasts.org
In reply to: Tom Lane (#19)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Fri, 3 May 2002, Tom Lane wrote:

But what we must *not* do is allow a new postmaster to start while the
old backends are still running; that would mean two sets of backends
running without contact with each other, which would be fatal for data
integrity. The SysV API lets us detect that case, but I don't see any
equally good way to do it if we are using anonymous shared memory.

It's a hack (and has slight security implications), but you
could just allow the postgres backends to keep the listening
socket(s) open.

Matthew.

#24Joel Burton
joel@joelburton.com
In reply to: Tom Lane (#18)
Re: HEADS UP: Win32/OS2/BeOS native ports

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane
Sent: Friday, May 03, 2002 6:07 PM
To: mlw
Cc: Marc G. Fournier; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports

Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)

Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?

- J.

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joel Burton (#24)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Joel Burton" <joel@joelburton.com> writes:

Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)

Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?

Was the problem just with semas, or was shmem an issue too?

In any case, unless someone actually writes an alternative sema
implementation that will work on BSD, nothing will happen...

regards, tom lane

#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthew Kirkwood (#23)
Re: HEADS UP: Win32/OS2/BeOS native ports

Matthew Kirkwood <matthew@hairy.beasts.org> writes:

On Fri, 3 May 2002, Tom Lane wrote:

The SysV API lets us detect that case, but I don't see any
equally good way to do it if we are using anonymous shared memory.

It's a hack (and has slight security implications), but you
could just allow the postgres backends to keep the listening
socket(s) open.

Hmm. That might be workable, but it feels shaky to me. The problem
is that you are using a lock based on port number to interlock a data
directory --- and port number and data directory are independently
variable parameters. Consider
$ postmaster -D /my/dir &
-- dba thinks "oops, forgot to specify port"
$ kill -9 pm-pid # bad idea
$ postmaster -D /my/dir -p myport &
Any backends started by the first postmaster will not be noticed by
the second one, if the interlock is based on port number.

We could get around this, of course: record the port number in the data
directory lockfile, and test for existence of the old socket
independently of trying to create a new one. But it seems ugly.

regards, tom lane

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Igor Kovalenko (#21)
Re: HEADS UP: Win32/OS2/BeOS native ports

I have just committed changes to create a platform-independent internal
API for semaphores, along the lines discussed yesterday.

At this point, the Darwin (Mac OS X), BeOS, and QNX4 ports are probably
broken. I will fix the Darwin port (probably not till tomorrow though);
volunteers to clean up the BeOS and QNX4 ports are needed.

BTW, there is a quick hack attempt at a POSIX-semaphore-based
implementation in src/backend/port/posix_sema.c. I have not tested
this yet, but expect to do so as part of fixing the Darwin port.

regards, tom lane

#28Joel Burton
joel@joelburton.com
In reply to: Tom Lane (#25)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Joel Burton" <joel@joelburton.com> writes:

Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)

Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?

Was the problem just with semas, or was shmem an issue too?

Not sure -- doesn't get far enough for me to tell. initdb dies with:

creating template1 database in /usr/local/pgsql/data/base/1...
IpcSemaphoreCreate: semget(key=1, num=17, 03600) failed:
Function not implemented

In any case, unless someone actually writes an alternative sema
implementation that will work on BSD, nothing will happen...

Was hoping that the discussions about the APR might let this work under BSD
jails, assuming I can get the APR to compile.

(For others: apparently PG will work under BSD jails if you recompile the
BSD kernel w/some new settings, but my ISP for this project was unwilling to
do that. Search the mailing list for messages on how to do this.)

J.

#29Dann Corbit
DCorbit@connx.com
In reply to: Joel Burton (#28)
Re: HEADS UP: Win32/OS2/BeOS native ports

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Friday, May 03, 2002 3:07 PM
To: mlw
Cc: Marc G. Fournier; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports

mlw <markw@mohawksoft.com> writes:

I am writing a Win32 DLL implementation of :

int semget(key_t key, int nsems, int semflg);
int semctl(int semid, int semnum, int cmd, union semun arg);
int semop(int semid, struct sembuf * sops, unsigned nsops);

Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)

After looking over the uses of these functions, I believe that we could
easily develop a non-SysV-centric internal API. Here's a first cut:

1. Define a struct type PGSemaphore that has implementation-specific
contents (the generic code will never look inside it). Operations on
semaphores will take "PGSemaphore *" arguments. When implementing atop
SysV semaphores, PGSemaphore will contain two fields, the semaphore id
and semaphore number. In other cases the contents could be different.

2. All PGSemaphore structs will be physically stored in shared memory.
This doesn't matter for SysV support, where the id/number are constants
anyway; but it will allow implementations based on mutexes.

3. The operations needed are

* Reserve semaphores. This will be told the number of semaphores
needed. On SysV it will do the necessary semget()s, but on some
implementations it might be a no-op. This should also be prepared
to clean up after a failed postmaster, if it is possible for sema
resources to outlive the creating postmaster.

* Create semaphore. Given a pointer to an uninitialized PGSemaphore
struct, initialize it to a new semaphore with count 1. (On SysV this
would hand out the individual semas previously allocated by Reserve.)
Note that this is not responsible for allocating the memory occupied
by the PGSemaphore struct --- I envision the structs being part of
larger objects such as PROC structures.

* Release semaphores. Release all resources allocated by previous
Reserve and Create operations. This is called when shutting down
or when resetting shared memory after a backend crash.

* Reset semaphore. Reset an existing PGSemaphore to count zero.

* Lock semaphore. Identical to current IpcSemaphoreLock(), except
parameter is a PGSemaphore *. See code of that routine for detailed
semantics.

* Unlock semaphore. Identical to current IpcSemaphoreUnlock(), except
parameter is a PGSemaphore *.

* Conditional lock semaphore. Identical to current
IpcSemaphoreTryLock(), except parameter is a PGSemaphore *.

Reserve/create/release would all be called in the postmaster process,
so they could communicate via malloc'd private memory (eg, an array
of semaphore IDs would be needed in the SysV case). The remaining
operations would be invokable by any backend.

Comments?

I'd be willing to work on refactoring the existing SysV-based code
to meet this spec.

It's already been done. Here is a freely available C++ implementation
(licensing similar to PostgreSQL):
http://www.cs.wustl.edu/~schmidt/ACE.html

#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joel Burton (#28)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Joel Burton" <joel@joelburton.com> writes:

Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?

Was the problem just with semas, or was shmem an issue too?

Not sure -- doesn't get far enough for me to tell. initdb dies with:

creating template1 database in /usr/local/pgsql/data/base/1...
IpcSemaphoreCreate: semget(key=1, num=17, 03600) failed:
Function not implemented

We create shared memory before semaphores, so if you got this far then
the shmem code is probably working (at least minimally).

Do you have working sem_open or sem_init (ie, POSIX semaphores)?

regards, tom lane

#31Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Joel Burton (#24)
Re: HEADS UP: Win32/OS2/BeOS native ports

Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)

Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?

I have postgresql working quite happily in FreeBSD jails! (Just make sure
you go "sysctl jail.sysvipc_allowed=1").

Chris

#32Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Joel Burton (#28)
Re: HEADS UP: Win32/OS2/BeOS native ports

(For others: apparently PG will work under BSD jails if you recompile the
BSD kernel w/some new settings, but my ISP for this project was unwilling to
do that. Search the mailing list for messages on how to do this.)

Works fine. You don't need to recompile - just use the sysctl.

Chris

#33Christof Petig
christof@petig-baender.de
In reply to: Marc G. Fournier (#3)
Re: HEADS UP: Win32/OS2/BeOS native ports

Marc G. Fournier wrote:

hrmmmm ... do you have a working Windows development environment? I'm
running WinXP at home, but don't have any of the compilers or anything
yet, so all my work for the first part is going to be done under Unix ...

but someone that knows something about building makefiles for Windows, and
compiling under it, will definitely be a major asset ;)

I think if you are familiar with make and gcc (and perhaps autoconf),
MinGW and MSys are the development environment of choice on Windows. You
even get /bin/sh. But the generated program does not depend on any
custom library (like cygwin does). It's even possible to cross compile
from a Linux box (actually powerpc in my case).

Look at http://mingw.sourceforge.net (and there for msys).

Christof

#34Joel Burton
joel@joelburton.com
In reply to: Christopher Kings-Lynne (#31)
Re: HEADS UP: Win32/OS2/BeOS native ports

Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)

Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?

I have postgresql working quite happily in FreeBSD jails! (Just make sure
you go "sysctl jail.sysvipc_allowed=1").

Yep, Alastair D'Silva helpfully pointed this out a month or two ago, and for
many people, this would be a workable solution. Unfortunately, it appears
that you have to run this command outside the jail, which I don't have
access to.

I forwarded the suggestion to my ISP (imeme, a Zope provider), who said
that:

"This will allow you to run a single postgres in a single jail only one
user would have access to it. If you try to run more then one it will
try to use the same shared memory and crash."

And therefore they refused to make the change. (More annoyingly, they kept
trying to convince me that I should quit my whining and use MySQL since it's
"ACID compliant").

So, I'm holding out hope that since this ISP seems unenlightened, one day
PostgreSQL will simply run in BSD jails without a cooperating jailmaster,
and it sounded like using the APR _might_ make this possible. (All of my
other projects use PG; I'd sure love to get this one switched over!)

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant

#35Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Joel Burton (#34)
Re: HEADS UP: Win32/OS2/BeOS native ports

I forwarded the suggestion to my ISP (imeme, a Zope provider), who said
that:

"This will allow you to run a single postgres in a single jail only one
user would have access to it. If you try to run more then one it will
try to use the same shared memory and crash."

Not true. But I'll avoid digging up any more on that old issue...

Chris

#36Marc G. Fournier
scrappy@hub.org
In reply to: Joel Burton (#24)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Sat, 4 May 2002, Joel Burton wrote:

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane
Sent: Friday, May 03, 2002 6:07 PM
To: mlw
Cc: Marc G. Fournier; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports

Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)

Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?

There is no problem with SysV IPC in the jail, per se ... jails were just
not coded to delimit/segregate such IPC from other jails ... it's one of
those "caveat emptor" situations ... you can do it, but at your own
risk, as someone in another jail has the ability to 'attach' to your
segments ...

#37Joel Burton
joel@joelburton.com
In reply to: Christopher Kings-Lynne (#35)
Re: HEADS UP: Win32/OS2/BeOS native ports

-----Original Message-----
From: Christopher Kings-Lynne [mailto:chriskl@familyhealth.com.au]
Sent: Monday, May 06, 2002 7:36 AM
To: Joel Burton; Tom Lane; mlw
Cc: Marc G. Fournier; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports

I forwarded the suggestion to my ISP (imeme, a Zope provider), who said
that:

"This will allow you to run a single postgres in a single jail only one
user would have access to it. If you try to run more then one it will
try to use the same shared memory and crash."

Not true. But I'll avoid digging up any more on that old issue...

Oh, I'm sure it's not true. But sometimes things end up on the "nyah, nyah,
it's my server and I say so" level. Sigh.

So, I guess that's where it leaves me: waiting for some solution other than
ISP cluefulness. :-)

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant

#38Marc G. Fournier
scrappy@hub.org
In reply to: Tom Lane (#26)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Sat, 4 May 2002, Tom Lane wrote:

Matthew Kirkwood <matthew@hairy.beasts.org> writes:

On Fri, 3 May 2002, Tom Lane wrote:

The SysV API lets us detect that case, but I don't see any
equally good way to do it if we are using anonymous shared memory.

It's a hack (and has slight security implications), but you
could just allow the postgres backends to keep the listening
socket(s) open.

Hmm. That might be workable, but it feels shaky to me. The problem
is that you are using a lock based on port number to interlock a data
directory --- and port number and data directory are independently
variable parameters. Consider
$ postmaster -D /my/dir &
-- dba thinks "oops, forgot to specify port"
$ kill -9 pm-pid # bad idea
$ postmaster -D /my/dir -p myport &
Any backends started by the first postmaster will not be noticed by
the second one, if the interlock is based on port number.

We could get around this, of course: record the port number in the data
directory lockfile, and test for existence of the old socket
independently of trying to create a new one. But it seems ugly.

How about a second, data directory based socket simply named something
like '.inuse', that is not port dependent?

#39Marc G. Fournier
scrappy@hub.org
In reply to: Joel Burton (#28)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Sun, 5 May 2002, Joel Burton wrote:

"Joel Burton" <joel@joelburton.com> writes:

Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)

Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?

Was the problem just with semas, or was shmem an issue too?

Not sure -- doesn't get far enough for me to tell. initdb dies with:

creating template1 database in /usr/local/pgsql/data/base/1...
IpcSemaphoreCreate: semget(key=1, num=17, 03600) failed:
Function not implemented

Read the jail manpage:

jail.sysvipc_allowed
This MIB entry determines whether or not processes within a jail
have access to System V IPC primitives. In the current jail
implementation, System V primitives share a single namespace across the
host and jail environments, meaning that processes within a jail
would be able to communicate with (and potentially interfere with)
processes outside of the jail, and in other jails. As such, this
functionality is disabled by default, but can be enabled by setting
this MIB entry to 1.

#40Marc G. Fournier
scrappy@hub.org
In reply to: Joel Burton (#37)
Re: HEADS UP: Win32/OS2/BeOS native ports

Or changing ISPs to a place more enlightened ...

On Mon, 6 May 2002, Joel Burton wrote:

-----Original Message-----
From: Christopher Kings-Lynne [mailto:chriskl@familyhealth.com.au]
Sent: Monday, May 06, 2002 7:36 AM
To: Joel Burton; Tom Lane; mlw
Cc: Marc G. Fournier; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports

I forwarded the suggestion to my ISP (imeme, a Zope provider), who said
that:

"This will allow you to run a single postgres in a single jail only one
user would have access to it. If you try to run more then one it will
try to use the same shared memory and crash."

Not true. But I'll avoid digging up any more on that old issue...

Oh, I'm sure it's not true. But sometimes things end up on the "nyah, nyah,
it's my server and I say so" level. Sigh.

So, I guess that's where it leaves me: waiting for some solution other than
ISP cluefulness. :-)

- J.

Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant

#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marc G. Fournier (#38)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Marc G. Fournier" <scrappy@hub.org> writes:

We could get around this, of course: record the port number in the data
directory lockfile, and test for existence of the old socket
independently of trying to create a new one. But it seems ugly.

How about a second, data directory based socket simply named something
like '.inuse', that is not port dependent?

Hmm ... but how do you use that to tell if there are still backends
around?

regards, tom lane

#42Marc G. Fournier
scrappy@hub.org
In reply to: Tom Lane (#41)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Mon, 6 May 2002, Tom Lane wrote:

"Marc G. Fournier" <scrappy@hub.org> writes:

We could get around this, of course: record the port number in the data
directory lockfile, and test for existence of the old socket
independently of trying to create a new one. But it seems ugly.

How about a second, data directory based socket simply named something
like '.inuse', that is not port dependent?

Hmm ... but how do you use that to tell if there are still backends
around?

As a backend is started up, connect to that socket ... if socket is open
when trying to start a new frontend, fail as there are currently other
connections attached to it?

#43Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marc G. Fournier (#42)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Marc G. Fournier" <scrappy@hub.org> writes:

Hmm ... but how do you use that to tell if there are still backends
around?

As a backend is started up, connect to that socket ... if socket is open
when trying to start a new frontend, fail as there are currently other
connections attached to it?

But the backends would only have the socket open, they'd not be actively
listening to it. So how could you tell whether anyone had the socket
open or not?

ISTM we gave up on exactly that technique for the main postmaster's
socket; we now create a separate lockfile to protect the socket, and
don't rely on the socket itself to give us any interlocking help at all.
But the lockfile just contains the postmaster's PID, so it's no help
in detecting the case where the old postmaster has gone away but there
are still orphaned backends laying about.

I'm not entirely thrilled with the lockfile technique; it'd be nice to
find something better. (In particular, we've seen a couple cases now
where people had trouble with PG refusing to start after a system
reboot, because some other daemon process had been assigned the PID
that the postmaster had in its previous incarnation; so the lockfile
check code mistakenly thinks there's still an old postmaster.) But
so far, the only thing worse than lockfiles is everything else :-(

regards, tom lane
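
The lockfile check described above boils down to asking whether the recorded PID still names a process. A minimal sketch of that test, assuming the usual kill(pid, 0) probe and a hypothetical helper name (demo_pid_in_use); this is an illustration, not the actual lockfile code:

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Report whether some process currently has the given PID.
 * Returns 1 if one exists, 0 if none does, -1 on an unexpected error.
 */
int
demo_pid_in_use(pid_t pid)
{
    /* Signal 0 performs the existence and permission checks only. */
    if (kill(pid, 0) == 0)
        return 1;
    if (errno == EPERM)
        return 1;           /* exists, but belongs to another user */
    if (errno == ESRCH)
        return 0;           /* no such process */
    return -1;
}
```

The probe cannot tell a surviving postmaster apart from an unrelated daemon that was handed the same PID after a reboot, which is exactly the false positive mentioned above.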

#44Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#43)
Re: HEADS UP: Win32/OS2/BeOS native ports

I said:

But the backends would only have the socket open, they'd not be actively
listening to it. So how could you tell whether anyone had the socket
open or not?

Oh, I take that back, I see how you could do it: the postmaster opens
the socket *for writing*, but never actually writes. All its child
processes inherit that same open file descriptor and just keep it
around. Then, to tell if anyone's home, you open the socket *for
reading* and try to read in O_NONBLOCK mode. You get an EOF indication
if and only if no one has the socket open for writing; otherwise you
get an EAGAIN error.

That would work ... but is it more portable than depending on SysV
shmem connection counts? ISTR that some of the platforms we support
don't have Unix-style sockets at all.

regards, tom lane
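
A minimal sketch of the write-end trick described above, using a named pipe (FIFO). The helper names are hypothetical, and the brief open-for-reading step in demo_fifo_hold_open is an assumption added because opening a FIFO for writing otherwise blocks until a reader appears.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Postmaster side: create the FIFO if needed and return a descriptor
 * open for writing.  Children inherit it across fork() and never write.
 */
int
demo_fifo_hold_open(const char *path)
{
    int rfd, wfd;

    if (mkfifo(path, 0600) < 0 && errno != EEXIST)
        return -1;

    /* A nonblocking read-only open succeeds even with no writer. */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd < 0)
        return -1;

    /* With a reader present, this open returns immediately. */
    wfd = open(path, O_WRONLY);
    close(rfd);
    return wfd;
}

/*
 * Checker side: 1 if some process holds the FIFO open for writing,
 * 0 if nobody does, -1 on error.
 */
int
demo_fifo_has_writers(const char *path)
{
    char    c;
    ssize_t n;
    int     saved;
    int     fd = open(path, O_RDONLY | O_NONBLOCK);

    if (fd < 0)
        return -1;
    n = read(fd, &c, 1);
    saved = errno;
    close(fd);

    if (n == 0)
        return 0;           /* EOF: no writers remain */
    if (n > 0)
        return 1;           /* data arrived, so a writer exists */
    if (saved == EAGAIN || saved == EWOULDBLOCK)
        return 1;           /* empty pipe, but a writer has it open */
    errno = saved;
    return -1;
}
```

A starting postmaster would call demo_fifo_has_writers() on the FIFO in its data directory and refuse to start on a result of 1.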

#45Marc G. Fournier
scrappy@hub.org
In reply to: Tom Lane (#44)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Mon, 6 May 2002, Tom Lane wrote:

I said:

But the backends would only have the socket open, they'd not be actively
listening to it. So how could you tell whether anyone had the socket
open or not?

Oh, I take that back, I see how you could do it: the postmaster opens
the socket *for writing*, but never actually writes. All its child
processes inherit that same open file descriptor and just keep it
around. Then, to tell if anyone's home, you open the socket *for
reading* and try to read in O_NONBLOCK mode. You get an EOF indication
if and only if no one has the socket open for writing; otherwise you
get an EAGAIN error.

That would work ... but is it more portable than depending on SysV
shmem connection counts? ISTR that some of the platforms we support
don't have Unix-style sockets at all.

Wouldn't the same thing work with a simple file? Does it have to be a
UnixDomainSocket?

#46Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marc G. Fournier (#45)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Marc G. Fournier" <scrappy@hub.org> writes:

That would work ... but is it more portable than depending on SysV
shmem connection counts? ISTR that some of the platforms we support
don't have Unix-style sockets at all.

Wouldn't the same thing work with a simple file? Does it have to be a
UnixDomainSocket?

No, and yes. If it's not a pipe/fifo then you don't get the
EOF-only-when-no-possible-writers-remain behavior. TCP and UDP
sockets don't show this sort of behavior either. So AFAICS we
really need a named pipe, ie, socket.

We could maybe do something approximately similar with TCP connection
attempts (per the prior suggestion of letting backends hold the
postmaster's listen socket open; then see if you get "connection
refused" or a timeout from trying to connect) but I don't think it'd be
as trustworthy. Simple mistakes like overly aggressive ipchains filters
would confuse this kind of test.

regards, tom lane

#47Marc G. Fournier
scrappy@hub.org
In reply to: Tom Lane (#46)
Re: HEADS UP: Win32/OS2/BeOS native ports

Since our default behavior (at startup) is to have TCP sockets disabled,
how many OSs are there that don't support UD sockets? Enough to really be
worried about?

On Mon, 6 May 2002, Tom Lane wrote:

"Marc G. Fournier" <scrappy@hub.org> writes:

That would work ... but is it more portable than depending on SysV
shmem connection counts? ISTR that some of the platforms we support
don't have Unix-style sockets at all.

Wouldn't the same thing work with a simple file? Does it have to be a
UnixDomainSocket?

No, and yes. If it's not a pipe/fifo then you don't get the
EOF-only-when-no-possible-writers-remain behavior. TCP and UDP
sockets don't show this sort of behavior either. So AFAICS we
really need a named pipe, ie, socket.

We could maybe do something approximately similar with TCP connection
attempts (per the prior suggestion of letting backends hold the
postmaster's listen socket open; then see if you get "connection
refused" or a timeout from trying to connect) but I don't think it'd be
as trustworthy. Simple mistakes like overly aggressive ipchains filters
would confuse this kind of test.

regards, tom lane

#48Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marc G. Fournier (#47)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Marc G. Fournier" <scrappy@hub.org> writes:

Since our default behavior (at startup) is to have TCP sockets disabled,
how many OSs are there that don't support UD sockets?

A quick look in the sources shows that we #undef HAVE_UNIX_SOCKETS for
QNX, BeOS, and old cygwin versions ... which are exactly the platforms
that don't have SysV shmem support, so those are exactly the guys who
we're trying to fix the problem for.

I do like the idea of using a Unix socket this way where available,
though. It'd let us switch over the shmem code to using IPC_PRIVATE
shmem key, which'd simplify that code tremendously; and we could make
some progress against the dead-PID-in-lockfile problem.

Could we get away with saying that the Unix-socket-less platforms have
weaker protection against mistakenly restarting the postmaster? We
could have a plain-vanilla lockfile instead of a socket lockfile on
those platforms, which would not catch the dead-postmaster-live-backends
case, but it'd be better than nothing. And I am not convinced that the
shmem-connection-count check should be trusted on QNX or BeOS, anyway,
so I'm not sure that they actually have a functioning check now.

regards, tom lane

#49Jan Wieck
janwieck@yahoo.com
In reply to: Tom Lane (#44)
Re: HEADS UP: Win32/OS2/BeOS native ports

Tom Lane wrote:

I said:

But the backends would only have the socket open, they'd not be actively
listening to it. So how could you tell whether anyone had the socket
open or not?

Oh, I take that back, I see how you could do it: the postmaster opens
the socket *for writing*, but never actually writes. All its child
processes inherit that same open file descriptor and just keep it
around. Then, to tell if anyone's home, you open the socket *for
reading* and try to read in O_NONBLOCK mode. You get an EOF indication
if and only if no one has the socket open for writing; otherwise you
get an EAGAIN error.

That would work ... but is it more portable than depending on SysV
shmem connection counts? ISTR that some of the platforms we support
don't have Unix-style sockets at all.

I think what you describe is a named pipe, not a socket. The
underlying implementation might be a socketpair, but the
behaviour of named pipes is exactly that since Version 7 at
least. This worked under Minix already.

regards, tom lane

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#50Igor Kovalenko
Igor.Kovalenko@motorola.com
In reply to: Marc G. Fournier (#47)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Marc G. Fournier" <scrappy@hub.org> writes:

Since our default behavior (at startup) is to have TCP sockets disabled,
how many OSs are there that don't support UD sockets?

A quick look in the sources shows that we #undef HAVE_UNIX_SOCKETS for
QNX, BeOS, and old cygwin versions ... which are exactly the platforms
that don't have SysV shmem support, so those are exactly the guys who
we're trying to fix the problem for.

Next release of QNX (6.2) will add support for UDS, but they are still not
quite portable.

I do like the idea of using a Unix socket this way where available,
though. It'd let us switch over the shmem code to using IPC_PRIVATE
shmem key, which'd simplify that code tremendously; and we could make
some progress against the dead-PID-in-lockfile problem.

Could we get away with saying that the Unix-socket-less platforms have
weaker protection against mistakenly restarting the postmaster? We
could have a plain-vanilla lockfile instead of a socket lockfile on
those platforms, which would not catch the dead-postmaster-live-backends
case, but it'd be better than nothing. And I am not convinced that the
shmem-connection-count check should be trusted on QNX or BeOS, anyway,
so I'm not sure that they actually have a functioning check now.

Why can't we use named pipe (aka FIFO file) instead of UDS? I think that is
more portable... The socketpair() function also tends to be more portable
than whole UDS in general... It works on QNX4 even, but not sure about BeOS.

Another thought is, why can't we use bind() to the postmaster port to detect
other postmasters? I might be missing something, so pardon my ignorance. But
should not bind() to same port fail with EADDRINUSE unless SO_REUSEADDR is
set? I don't really know if it is set in postgres or not ...

-- igor

#51Tom Lane
tgl@sss.pgh.pa.us
In reply to: Igor Kovalenko (#50)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:

Could we get away with saying that the Unix-socket-less platforms have
weaker protection against mistakenly restarting the postmaster?

Why can't we use named pipe (aka FIFO file) instead of UDS?

That's exactly what I'm talking about.

Another thought is, why can't we use bind() to the postmaster port to detect
other postmasters?

Because port number and data directory are independent parameters. The
interlock on port number is not related to the interlock on data
directory.
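
For illustration, a minimal sketch of why a port-based interlock says nothing about the data directory. Python is used only for brevity, and try_bind_port is a hypothetical helper, not anything in the PostgreSQL sources. A second bind() to the same port is refused with EADDRINUSE, but a bind() to any other port succeeds no matter which directory that server would go on to use.

```python
import errno
import socket


def try_bind_port(port):
    """Bind and listen on 127.0.0.1:port.

    Returns the listening socket, or None if the port is already taken.
    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))  # SO_REUSEADDR is deliberately not set
        sock.listen(1)
        return sock
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return None
        raise


if __name__ == "__main__":
    first = try_bind_port(0)            # port 0: the kernel picks a free port
    port = first.getsockname()[1]
    print("same port refused:", try_bind_port(port) is None)
    other = try_bind_port(0)            # a different port is accepted
    print("other port accepted:", other is not None)
    first.close()
    other.close()
```

Nothing in the second successful bind() knows or cares which data directory the caller intends to open, which is the gap described above.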

regards, tom lane

#52Matthew Kirkwood
matthew@hairy.beasts.org
In reply to: Tom Lane (#43)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Mon, 6 May 2002, Tom Lane wrote:

As a backend is started up, connect to that socket ... if socket is open
when trying to start a new frontend, fail as there are currently other
connections attached to it?

But the backends would only have the socket open, they'd not be
actively listening to it. So how could you tell whether anyone
had the socket open or not?

It's easy. At startup, the postmaster (or standalone
backend) creates a Unix socket, binds it to the filename
and calls listen on it.

If another backend is running, it'll get EADDRINUSE from
the bind or listen.

Nobody actually needs to connect to the socket. Simple,
race-free, 10 lines of code.
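
A minimal sketch of the bind-and-listen interlock described above, in Python for brevity. The function name acquire_socket_interlock and the idea of passing an arbitrary path are assumptions made for the example. One caveat worth checking on a given platform: on the systems I know of, the socket file outlives the process that created it, so a stale file from a crashed process also makes bind() fail until it is removed.

```python
import errno
import socket


def acquire_socket_interlock(path):
    """Bind and listen on a Unix-domain socket at `path`.

    Returns the listening socket, or None if the name is already taken.
    Nobody ever needs to connect(); holding the bound name is the lock.
    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)     # fails with EADDRINUSE if `path` already exists
        sock.listen(1)
        return sock
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return None
        raise
```

A caller would keep the returned socket open for its whole lifetime and treat a None result as "someone else already owns this name".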

Matthew.

#53Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthew Kirkwood (#52)
Re: HEADS UP: Win32/OS2/BeOS native ports

Matthew Kirkwood <matthew@hairy.beasts.org> writes:

Nobody actually needs to connect to the socket. Simple,
race-free, 10 lines of code.

... and we already do it. But it protects the port number, not
the data directory.

regards, tom lane

#54Matthew Kirkwood
matthew@hairy.beasts.org
In reply to: Tom Lane (#53)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Tue, 7 May 2002, Tom Lane wrote:

Nobody actually needs to connect to the socket. Simple,
race-free, 10 lines of code.

... and we already do it. But it protects the port number, not
the data directory.

If I understood him correctly, Marc was suggesting a further
domain socket inside the data directory.

Matthew.

#55Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthew Kirkwood (#54)
Re: HEADS UP: Win32/OS2/BeOS native ports

Matthew Kirkwood <matthew@hairy.beasts.org> writes:

... and we already do it. But it protects the port number, not
the data directory.

If I understood him correctly, Marc was suggesting a further
domain socket inside the data directory.

Right, and that would work because we would reference it as
$PGDATA/.socket --- exact, one-to-one correspondence between data
directory and interlock file. A TCP socket isn't going to have any
such direct connection to the data directory.

We could try to make such a connection (eg, pick a free port number at
random, and record the number in a lockfile in $PGDATA). But that will
suffer from a bunch of failure modes, starting with the same one that's
been biting us for PID interlocking: after a system restart, someone
else may hold the port number that we chose at random last time.
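
The PID-interlock failure mode mentioned here can be sketched as follows. This is only an illustration in Python: lockfile_owner_alive is a hypothetical name, and the one-PID-per-file format is an assumption, not the actual postmaster.pid layout. The check can only ask "does some process with this PID exist?", so after a reboot an unrelated process that happens to reuse the number makes a stale lockfile look live.

```python
import os


def lockfile_owner_alive(lockfile):
    """Return True if the PID recorded in `lockfile` names a live process.

    Hypothetical helper; assumes the file holds a single decimal PID.
    """
    with open(lockfile) as f:
        pid = int(f.read().strip())
    try:
        os.kill(pid, 0)     # signal 0: existence check only, nothing is sent
    except ProcessLookupError:
        return False        # no such process: the lockfile is stale
    except PermissionError:
        return True         # a process exists but belongs to another user
    return True             # a process exists; it may or may not be ours
```

The last comment is the whole problem: a True result does not prove that the live process is the one that wrote the file.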

Basically, the reason that we want this interlock is because we are
going after five-nines kind of reliability. An interlock technology
that's not itself five-nines reliable isn't going to make things better.

regards, tom lane

#56Igor Kovalenko
Igor.Kovalenko@motorola.com
In reply to: Matthew Kirkwood (#54)
Re: HEADS UP: Win32/OS2/BeOS native ports

Just a friendly reminder that it should be named pipe rather than UDS ;)
-- igor

#57Matthew Kirkwood
matthew@hairy.beasts.org
In reply to: Igor Kovalenko (#56)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Tue, 7 May 2002, Igor Kovalenko wrote:

Just a friendly reminder that it should be named pipe rather than UDS
;)

Named pipes don't have the required syntax. Perhaps for
platforms which have neither Unix sockets nor SysV shm, something
like POSIX named semaphores is the way forward.

Matthew.

#58Igor Kovalenko
Igor.Kovalenko@motorola.com
In reply to: Matthew Kirkwood (#57)
Re: HEADS UP: Win32/OS2/BeOS native ports

Can you be more specific? What required syntax? I was talking about named
pipe vs UDS socket...

#59Tom Lane
tgl@sss.pgh.pa.us
In reply to: Igor Kovalenko (#58)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:

I was talking about named pipe vs UDS socket...

Aren't those the same thing? You get a socket file either way.

regards, tom lane

#60Igor Kovalenko
Igor.Kovalenko@motorola.com
In reply to: Matthew Kirkwood (#57)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:

I was talking about named pipe vs UDS socket...

Aren't those the same thing? You get a socket file either way.

On QNX a named pipe has type 'FIFO file', which has features similar to a
socket but is implemented differently; that is not the point, though. On
SysV derivatives both are implemented as 2 connected STREAMS heads.
On BSD they are both the same thing. Not sure about other systems. The UDS
API, however, was originally limited to BSD4.3 and only later started to
spread, whereas named pipes have been around longer and probably exist in
any Unix variant and probably in other types of systems.

-- igor

#61Matthew Kirkwood
matthew@hairy.beasts.org
In reply to: Igor Kovalenko (#58)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Wed, 8 May 2002, Igor Kovalenko wrote:

Can you be more specific? What required syntax? I was talking about
named pipe vs UDS socket...

Sorry, I meant semantics.

A pipe can have multiple readers and multiple writers. This is
no use for us.

A listening SOCK_STREAM Unix domain socket can have no readers or
writers, but only one listener (well, except that other processes
can inherit or be passed the socket). You have to connect() (and
the server must accept()) before read and write do anything. But
we have no use for that here. It's just an exclusive-only mutex
whose namespace is the filesystem.

It really is like a TCP socket, except that the address namespace
is the filesystem, and thus it's not available remotely.

Think of it as a TCP socket without the "which address and port
do I use, and how do I keep it secure" issues.
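
To make the contrast concrete, here is a small sketch (Python, with a hypothetical helper name) showing that a FIFO gives no exclusivity: any number of opens on the same name succeed, whereas a second bind() on a Unix socket name is refused.

```python
import os


def open_fifo_readers(path, count):
    """Create a FIFO at `path` and open it `count` times for reading.

    Returns the list of descriptors. Hypothetical helper for illustration.
    """
    os.mkfifo(path)
    # O_NONBLOCK lets open() return without waiting for a writer to appear.
    return [os.open(path, os.O_RDONLY | os.O_NONBLOCK) for _ in range(count)]
```

Every one of those open() calls succeeds, so the existence of an open FIFO tells a newcomer nothing about whether it is allowed to proceed.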

Matthew.

#62Igor Kovalenko
Igor.Kovalenko@motorola.com
In reply to: Matthew Kirkwood (#61)
Re: HEADS UP: Win32/OS2/BeOS native ports

Ahh... you want a named semaphore... There is such a thing in POSIX, but
names are only portable if they begin with "/" (which tells the OS to put
the semaphore wherever is appropriate). I believe that without a leading
slash they end up in the current directory, but we can't rely on that...
too bad. Glad UDS is getting supported on my platform, lol ;)

This will, however, leave QNX4 in the dust, if anyone cares. And most likely
BeOS, MP/X and half a dozen other platforms. Which makes me wonder whether it
would be better to come up with a platform-independent 'namespace sync'
mechanism. Can't we use fcntl()-based lock for that purpose? That's what
apache is doing apparently (as one of its variants).
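
A rough sketch of what an fcntl()-based interlock could look like, in Python for brevity. The name try_lock_file and the choice of lock file are hypothetical, and this is not a claim about how Apache does it. One well-known pitfall of POSIX record locks is that a process loses its lock as soon as it closes any descriptor for the locked file, which makes them easy to drop by accident.

```python
import errno
import fcntl
import os


def try_lock_file(path):
    """Take an exclusive, non-blocking fcntl() lock on `path`.

    Returns the open descriptor holding the lock, or None if another
    process already holds it. Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return fd
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None     # lock is held by some other process
        raise
```

The lock is per process, not per descriptor, so a second call from the same process would succeed; the check is only meaningful between different processes.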

-- igor

#63Tom Lane
tgl@sss.pgh.pa.us
In reply to: Igor Kovalenko (#62)
Re: HEADS UP: Win32/OS2/BeOS native ports

"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:

Can't we use fcntl()-based lock for that purpose?

I'm pretty sure that fcntl locking has an evil reputation as well.
(Didn't we use that up till a couple years ago, and give up on it?)

regards, tom lane

#64Jan Wieck
janwieck@yahoo.com
In reply to: Tom Lane (#59)
Re: HEADS UP: Win32/OS2/BeOS native ports

Tom Lane wrote:

"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:

I was talking about named pipe vs UDS socket...

Aren't those the same thing? You get a socket file either way.

No they are not. The former is a FIFO file, the latter a
socket. FIFOs are used via open(2), sockets via
connect(2). And as said before, FIFOs have been around since UNIX
Version 7 (at least; I wasn't around before that). So
there is a good chance that they are available on every
UNIX.
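
The difference is visible from stat(2) as well. A small sketch in Python (the helper names and file names are made up for the example) that creates one of each and reports the file type the kernel records:

```python
import os
import socket
import stat


def describe_node(path):
    """Return "fifo", "socket" or "other" according to the file's st_mode."""
    mode = os.stat(path).st_mode
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISSOCK(mode):
        return "socket"
    return "other"


def make_fifo_and_socket(directory):
    """Create a FIFO and a bound Unix-domain socket inside `directory`.

    Returns (fifo_path, socket_path, socket_object). Hypothetical helper.
    """
    fifo_path = os.path.join(directory, "demo.fifo")
    sock_path = os.path.join(directory, "demo.sock")
    os.mkfifo(fifo_path)                      # opened later with open(2)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(sock_path)                      # reached later with connect(2)
    return fifo_path, sock_path, sock
```

Calling describe_node on the two paths gives "fifo" for the first and "socket" for the second.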

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#65Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Igor Kovalenko (#21)
Re: HEADS UP: Win32/OS2/BeOS native ports

Igor Kovalenko wrote:

It does not have to be anonymous. POSIX also defines shm_open(same arguments
as open) API which will create named object in whatever location corresponds
to shared memory storage on that platform (object is then grown to needed
size by ftruncate() and the fd is then passed to mmap). The object will
exist in name space and can be detected by subsequent calls to shm_open()
with same name. It is not really different from doing open(), but more
portable (mmap() on regular files may not be supported).
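
As a rough illustration of the "named object that later calls can detect" idea, here is a sketch using Python's multiprocessing.shared_memory module, which as far as I know is built on shm_open() on POSIX systems. The helper name and the segment name are hypothetical.

```python
from multiprocessing import shared_memory


def create_named_segment(name, size):
    """Create a named shared-memory segment of at least `size` bytes.

    Returns the SharedMemory object, or None if a segment with that name
    already exists. Hypothetical helper for illustration only.
    """
    try:
        return shared_memory.SharedMemory(name=name, create=True, size=size)
    except FileExistsError:
        return None     # the name is already present in the shm namespace
```

The creator should eventually call close() and unlink() on the returned object; until unlink() happens, any later attempt to create the same name is refused.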

Actually, I think the best shared memory implementation would be
MAP_ANON | MAP_SHARED mmap(), which could be called from the postmaster
and passed to child processes.

While all our platforms have mmap(), many don't have MAP_ANON, but those
that do could use it. You need MAP_ANON to prevent the shared memory
from being written to a disk file.
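
A minimal sketch of the MAP_ANON | MAP_SHARED idea, in Python for brevity (share_with_child is a hypothetical name). The parent maps an anonymous shared region before fork(), the child writes into the inherited mapping, and the parent sees the bytes without any file being involved.

```python
import mmap
import os


def share_with_child(message):
    """Map anonymous shared memory, let a forked child write `message`
    into it, and return what the parent then reads back.

    Hypothetical helper for illustration only.
    """
    region = mmap.mmap(-1, mmap.PAGESIZE,
                       flags=mmap.MAP_SHARED | mmap.MAP_ANONYMOUS)
    pid = os.fork()
    if pid == 0:
        region[:len(message)] = message   # child writes to the shared pages
        os._exit(0)
    os.waitpid(pid, 0)                    # wait until the child has written
    data = bytes(region[:len(message)])
    region.close()
    return data
```

With MAP_PRIVATE in place of MAP_SHARED the child's write would stay in the child's own copy, so the sharing depends on the flag and on the mapping being created before the fork.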

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#66Bruce Momjian
pgman@candle.pha.pa.us
In reply to: mlw (#20)
Re: HEADS UP: Win32/OS2/BeOS native ports

mlw wrote:

Like I told Marc, I don't care. You spec out what you want and I'll write it
for Windows.

That being said, a SysV IPC interface for native Windows would be kind of cool
to have.

I am wondering why we don't just use the Cygwin shm/sem code in our
project, or maybe the Apache stuff; why bother reinventing the wheel.

#67coventry
coventry@one.net
In reply to: Bruce Momjian (#66)
Re: HEADS UP: Win32/OS2/BeOS native ports

I think it's already been determined that the cygwin option is too
low-performing.

However, the apache stuff could be quite useful - but if that effort
were to be undertaken, it would make more sense to move all versions of
the code to the apache runtime, for all platforms. Are there any other
runtime libraries out there that are cross platform, open/free and high
performance? I know the mozilla XPCOM libraries work quite nicely, but
are geared more towards multithreaded apps - and the COM-alike
infrastructure is something we wouldn't need.

~Jon

#68Marc G. Fournier
scrappy@hub.org
In reply to: Bruce Momjian (#66)
Re: HEADS UP: Win32/OS2/BeOS native ports

You might want to go to the archives and catch up on the whole thread and
its digressions :)

#69mlw
markw@mohawksoft.com
In reply to: Bruce Momjian (#66)
Re: HEADS UP: Win32/OS2/BeOS native ports

Bruce Momjian wrote:

mlw wrote:

Like I told Marc, I don't care. You spec out what you want and I'll write it
for Windows.

That being said, a SysV IPC interface for native Windows would be kind of cool
to have.

I am wondering why we don't just use the Cygwin shm/sem code in our
project, or maybe the Apache stuff; why bother reinventing the wheel.

I have not been participating on the list, I don't know why I'm still receiving
mail.

but! in the course of testing some code, I managed to gain some experience with
cygwin. I have seen fork() problems with a large number of processes.

For PostgreSQL to be as good on Windows as it is on UNIX, it has to be a native
program without cygwin. The shared memory and semaphore management should be
done with the postmaster process.

The apache stuff is OK, it is just as good as anything else. You may be able to
use critical sections in shared memory to implement a fast semaphore, but that
would take a bit of experimentation.

I think what Tom had in mind is to take out the SysV and various OS specific
APIs and replace them with a more generic one, behind which, you guys can tune
the implementation.

#70Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Marc G. Fournier (#68)
Re: HEADS UP: Win32/OS2/BeOS native ports

Yes, I am having trouble figuring out if I have seen the whole thread yet.

---------------------------------------------------------------------------

Marc G. Fournier wrote:

You might want to go to the archives and catch up on the whole thread and
its digressions :)

On Sun, 2 Jun 2002, Bruce Momjian wrote:

mlw wrote:

Like I told Marc, I don't care. You spec out what you want and I'll write it
for Windows.

That being said, a SysV IPC interface for native Windows would be kind of cool
to have.

I am wondering why we don't just use the Cygwin shm/sem code in our
project, or maybe the Apache stuff; why bother reinventing the wheel.

#71Jason Tishler
jason@tishler.net
In reply to: Bruce Momjian (#66)
Re: HEADS UP: Win32/OS2/BeOS native ports

Bruce,

On Sun, Jun 02, 2002 at 08:49:21PM -0400, Bruce Momjian wrote:

mlw wrote:

Like I told Marc, I don't care. You spec out what you want and I'll write it
for Windows.

That being said, a SysV IPC interface for native Windows would be kind of
cool to have.

I am wondering why we don't just use the Cygwin shm/sem code in our
project, or maybe the Apache stuff; why bother reinventing the wheel.

Are you referring to cygipc above? If so, then even one of the original
cygipc authors would discourage this:

http://sources.redhat.com/ml/cygwin-apps/2001-09/msg00017.html

Specifically, Ludovic Lange states the following:

I really think the solution would be to start again from scratch
another implementation, as was suggested. The way we did it was
quick and dirty, the goals weren't to have production systems
running on it but only to run prototypes. So the internal design
(if there is any) may not be adequate for the cygwin project.

However, Rob Collins has contributed a MinGW daemon to Cygwin to support
switching users, System V IPC, etc. So, this code base may be a more
suitable starting point to satisfy PostgreSQL's native Win32 System V
IPC needs.

Jason

#72Jason Tishler
jason@tishler.net
In reply to: mlw (#69)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Sun, Jun 02, 2002 at 09:33:57PM -0400, mlw wrote:

Bruce Momjian wrote:

mlw wrote:

Like I told Marc, I don't care. You spec out what you want and I'll write
it for Windows.

That being said, a SysV IPC interface for native Windows would be kind of
cool to have.

I am wondering why we don't just use the Cygwin shm/sem code in our
project, or maybe the Apache stuff; why bother reinventing the wheel.

but! in the course of testing some code, I managed to gain some experience
with cygwin. I have seen fork() problems with a large number of processes.

Since Cygwin's fork() is implemented with WaitForMultipleObjects(),
it has a limitation of only 63 children per parent. Also, there can
be DLL base address conflicts (causing Cygwin fork() to fail) that are
avoidable by rebasing the appropriate DLLs. AFAICT, Cygwin PostgreSQL is
currently *not* affected by this issue, whereas other Cygwin applications
such as Python and Apache are.

Jason

#73mlw
markw@mohawksoft.com
In reply to: Bruce Momjian (#66)
Re: HEADS UP: Win32/OS2/BeOS native ports

Jason Tishler wrote:

On Sun, Jun 02, 2002 at 09:33:57PM -0400, mlw wrote:

Bruce Momjian wrote:

mlw wrote:

Like I told Marc, I don't care. You spec out what you want and I'll write
it for Windows.

That being said, a SysV IPC interface for native Windows would be kind of
cool to have.

I am wondering why we don't just use the Cygwin shm/sem code in our
project, or maybe the Apache stuff; why bother reinventing the wheel.

but! in the course of testing some code, I managed to gain some experience
with cygwin. I have seen fork() problems with a large number of processes.

Since Cygwin's fork() is implemented with WaitForMultipleObjects(),
it has a limitation of only 63 children per parent. Also, there can
be DLL base address conflicts (causing Cygwin fork() to fail) that are
avoidable by rebasing the appropriate DLLs. AFAICT, Cygwin PostgreSQL is
currently *not* affected by this issue, whereas other Cygwin applications
such as Python and Apache are.

Why would not PostgreSQL be affected by this?

#74Jan Wieck
janwieck@yahoo.com
In reply to: Jason Tishler (#72)
Re: HEADS UP: Win32/OS2/BeOS native ports

Jason Tishler wrote:

On Sun, Jun 02, 2002 at 09:33:57PM -0400, mlw wrote:

Bruce Momjian wrote:

mlw wrote:

Like I told Marc, I don't care. You spec out what you want and I'll write
it for Windows.

That being said, a SysV IPC interface for native Windows would be kind of
cool to have.

I am wondering why we don't just use the Cygwin shm/sem code in our
project, or maybe the Apache stuff; why bother reinventing the wheel.

but! in the course of testing some code, I managed to gain some experience
with cygwin. I have seen fork() problems with a large number of processes.

Since Cygwin's fork() is implemented with WaitForMultipleObjects(),
it has a limitation of only 63 children per parent. Also, there can
be DLL base address conflicts (causing Cygwin fork() to fail) that are
avoidable by rebasing the appropriate DLLs. AFAICT, Cygwin PostgreSQL is
currently *not* affected by this issue, whereas other Cygwin applications
such as Python and Apache are.

Whatever technical problems there are, we can debate on and
on if it's worth working around them in PostgreSQL or fixing
them in CygWIN or whatever.

The main problem will remain: using PostgreSQL under
CygWIN requires some UNIX know-how. So a pure Windows
user/shop needs UNIX knowledge to run our "Windows port" of
PostgreSQL? Interesting definition of "port".

Jan

#75Robert Schrem
robert.schrem@WiredMinds.de
In reply to: Jason Tishler (#71)
Re: HEADS UP: Win32/OS2/BeOS native ports - the 'BEST OPEN SOURCE database backend'

Hi,

You may want to have a look at: http://www.garret.ru/~knizhnik/
You will find code there for "Fast synchronized access to shared
memory for Windows and for i86 Unix-es".

kind regards,

Robert

#76Robert Schrem
robert.schrem@WiredMinds.de
In reply to: Jason Tishler (#71)
GOODS - a sensational public domain database backend that deserves a SQL frontend

Hi,

Some of you might already know GOODS, programmed
almost entirely by Konstantin Knizhnik - if not, you should
really have a look at it right now (be warned: consuming this
extraordinary work might change your standards for the
required quality of a 'good programmer' forever. At least
this happened to me... ;):
http://www.garret.ru/~knizhnik/goods.html

Some core features of this backend (as they come to my mind):
-> full ACID transaction support
-> distributed storage management (->distributed transactions)
-> multiple readers/single writer (is this called MVCC within PostgreSQL?)
-> dual client side object cache
-> online backup (snapshot backup AND permanent backup)
-> nested transactions on object level
-> transaction isolation levels on object level
-> object level shared and exclusive locks
-> excellent C++ programming interface
-> WAL
-> garbage collection for no-longer-referenced database objects
-> fully thread safe client interface
-> JAVA client API
-> very high performance as a result of a lot of fine tuning
-> asynchronous event notification on object instance modification
-> extremely high code quality
-> a one person effort, hence a very clean design
-> the most relevant platforms are supported out of the box
-> complete build is done in less than a minute on my machine
-> it's documented
...

The licensing of this coding wonder: >>> PUBLIC DOMAIN <<<

I've been using GOODS for quite a while now in the context of my
development activities for a native XML database and have had
very promising experiences concerning the performance and
stability of GOODS. E.g.: The performance seems to be
better than sleepycat's berkeley db library - especially
with multiple simultaneous transactions...

Maybe the only obstacle to using this backend in postgres
from now on: it's written entirely in C++ ...

I'm wondering why there is no SQL frontend yet for this
excellent backend...

You may also want to look at a comparison chart of some
backends other than GOODS (some of them from the same
author!!! I'm wondering how he was able to code all this...):
http://www.garret.ru/~knizhnik/compare.html

kind regards,

Robert

#77Jason Tishler
jason@tishler.net
In reply to: mlw (#73)
Re: HEADS UP: Win32/OS2/BeOS native ports

On Mon, Jun 03, 2002 at 09:36:51AM -0400, mlw wrote:

Jason Tishler wrote:

On Sun, Jun 02, 2002 at 09:33:57PM -0400, mlw wrote:

Bruce Momjian wrote:

mlw wrote:

Like I told Marc, I don't care. You spec out what you want and I'll
write it for Windows.

That being said, a SysV IPC interface for native Windows would be
kind of cool to have.

I am wondering why we don't just use the Cygwin shm/sem code in our
project, or maybe the Apache stuff; why bother reinventing the wheel.

but! in the course of testing some code, I managed to gain some experience
with cygwin. I have seen fork() problems with a large number of processes.

Since Cygwin's fork() is implemented with WaitForMultipleObjects(),
it has a limitation of only 63 children per parent. Also, there can
be DLL base address conflicts (causing Cygwin fork() to fail) that are
avoidable by rebasing the appropriate DLLs. AFAICT, Cygwin PostgreSQL is
currently *not* affected by this issue, whereas other Cygwin applications
such as Python and Apache are.

Why would not PostgreSQL be affected by this?

Sorry, if I was unclear -- I should have used two paragraphs above and
maybe a few more words... :,)

Cygwin PostgreSQL *is* affected by the Cygwin 63 children per parent
fork limitation.

PostgreSQL *can* be affected by the Cygwin DLL base address conflict
fork issue, but in my experience (both personal and by monitoring the
Cygwin and pgsql-cygwin lists), no one has been affected yet. The DLL
base address conflict is a "probability" thing. The more DLLs loaded
the greater the chance of a conflict (and fork() failing). Since Cygwin
PostgreSQL loads only a few DLLs, this has not become an issue (yet).

Jason

#78Oleg Bartunov
oleg@sai.msu.su
In reply to: Robert Schrem (#76)
Re: GOODS - a sensational public domain database backend

Kostya is a well-qualified programmer. I know him and he is always open to
challenges. Some time ago, Teodor and I asked him about GiST support
in another of his databases (Gigabase). It was a sort of challenge (we wanted
to port our contrib/tsearch module) and he did it (using libgist).
We work with a gigabase database embedded into our application under
Windows (we had a lot of trouble with the performance of postgresql under
Cygwin :-) and are quite happy.

On Mon, 3 Jun 2002, Robert Schrem wrote:

Hi,

Some of you might already know GOODS, programmed
almost entirely by Konstantin Knizhnik - if not, you should
really have a look at it right now (be warned: consuming this
extraordinary work might change your standards for the
required quality of a 'good programmer' forever. At least
this happened to me... ;):
http://www.garret.ru/~knizhnik/goods.html

Some core features of this backend (as they come to my mind):
-> full ACID transaction support
-> distributed storage management (->distributed transactions)
-> multiple readers/single writer (is this called MVCC within PostgreSQL?)
-> dual client side object cache
-> online backup (snapshot backup AND permanent backup)
-> nested transactions on object level
-> transaction isolation levels on object level
-> object level shared and exclusive locks
-> excellent C++ programming interface
-> WAL
-> garbage collection for no-longer-referenced database objects
-> fully thread safe client interface
-> JAVA client API
-> very high performance as a result of a lot of fine tuning
-> asynchronous event notification on object instance modification
-> extremely high code quality
-> a one person effort, hence a very clean design
-> the most relevant platforms are supported out of the box
-> complete build is done in less than a minute on my machine
-> it's documented
...

The licensing of this coding wonder: >>> PUBLIC DOMAIN <<<

I've been using GOODS for quite a while now in the context of my
development activities for a native XML database and have had
very promising experiences concerning the performance and
stability of GOODS. E.g.: The performance seems to be
better than sleepycat's berkeley db library - especially
with multiple simultaneous transactions...

Maybe the only obstacle to using this backend in postgres
from now on: it's written entirely in C++ ...

I'm wondering why there is no SQL frontend yet for this
excellent backend...

You may also want to look at a comparison chart of some
backends other than GOODS (some of them from the same
author!!! I'm wondering how he was able to code all this...):
http://www.garret.ru/~knizhnik/compare.html

kind regards,

Robert

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#79mlw
markw@mohawksoft.com
In reply to: Bruce Momjian (#66)
Re: HEADS UP: Win32/OS2/BeOS native ports

Jason Tishler wrote:

On Mon, Jun 03, 2002 at 09:36:51AM -0400, mlw wrote:

Jason Tishler wrote:

On Sun, Jun 02, 2002 at 09:33:57PM -0400, mlw wrote:

Bruce Momjian wrote:

mlw wrote:

Like I told Marc, I don't care. You spec out what you want and I'll
write it for Windows.

That being said, a SysV IPC interface for native Windows would be
kind of cool to have.

I am wondering why we don't just use the Cygwin shm/sem code in our
project, or maybe the Apache stuff; why bother reinventing the wheel.

but! in the course of testing some code, I managed to gain some experience
with cygwin. I have seen fork() problems with a large number of processes.

Since Cygwin's fork() is implemented with WaitForMultipleObjects(),
it has a limitation of only 63 children per parent. Also, there can
be DLL base address conflicts (causing Cygwin fork() to fail) that are
avoidable by rebasing the appropriate DLLs. AFAICT, Cygwin PostgreSQL is
currently *not* affected by this issue whereas other Cygwin applications
such as Python and Apache are.

Why would not PostgreSQL be affected by this?

Sorry, if I was unclear -- I should have used two paragraphs above and
maybe a few more words... :,)

Cygwin PostgreSQL *is* affected by the Cygwin 63 children per parent
fork limitation.

PostgreSQL *can* be affected by the Cygwin DLL base address conflict
fork issue, but in my experience (both personal and by monitoring the
Cygwin and pgsql-cygwin lists), no one has been affected yet. The DLL
base address conflict is a "probability" thing. The more DLLs loaded
the greater the chance of a conflict (and fork() failing). Since Cygwin
PostgreSQL loads only a few DLLs, this has not become an issue (yet).

I'm not sure the DLL load address is a big issue for PostgreSQL, AFAIK no
optional DLLs will be loaded by Postmaster. So, with fork() it will be a simple
process. A PostgreSQL child will die upon completion, and never execute fork().

My concern would be the limit on the number of child processes allowed. 63 is
far below what would be considered a usable number in production, and as long
as that is an issue, I don't think anyone would take PostgreSQL seriously.

A Windows version of PostgreSQL must run within the confines of the Windows OS.
The reason, IMHO, that no one has found any serious bugs in the cygwin version,
is because no one is seriously using it. Anyone who *would* seriously use it,
knows better.

#80Igor Kovalenko
Igor.Kovalenko@motorola.com
In reply to: Bruce Momjian (#65)
Re: HEADS UP: Win32/OS2/BeOS native ports

That's what Apache does. Note, on most platforms MAP_ANON is equivalent to
mmap-ing /dev/zero. Solaris for example does not provide MAP_ANON but using

fd = open("/dev/zero", O_RDWR);
addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
close(fd);

works just fine.
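
A minimal sketch of the /dev/zero approach in C, assuming a POSIX system where /dev/zero can be opened read-write. The helper name shared_zero_map is hypothetical and is not taken from Apache or PostgreSQL; it only illustrates the open/mmap/close sequence.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from Apache or PostgreSQL): map `size` bytes of
 * shared, zero-filled memory backed by /dev/zero. Returns NULL on failure.
 */
static void *shared_zero_map(size_t size)
{
    assert(size > 0);

    int fd = open("/dev/zero", O_RDWR);
    if (fd < 0)
        return NULL;

    /* MAP_SHARED makes writes visible to any child forked after this call. */
    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The mapping stays valid after the descriptor is closed. */
    close(fd);

    return addr == MAP_FAILED ? NULL : addr;
}
```

A caller would release the region with munmap(addr, size) when done.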

----- Original Message -----
From: "Bruce Momjian" <pgman@candle.pha.pa.us>
To: "Igor Kovalenko" <Igor.Kovalenko@motorola.com>
Cc: "Tom Lane" <tgl@sss.pgh.pa.us>; "mlw" <markw@mohawksoft.com>; "Marc G.
Fournier" <scrappy@hub.org>; <pgsql-hackers@postgresql.org>
Sent: Sunday, June 02, 2002 7:47 PM
Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports

Igor Kovalenko wrote:

It does not have to be anonymous. POSIX also defines shm_open(same arguments
as open) API which will create named object in whatever location corresponds
to shared memory storage on that platform (object is then grown to needed
size by ftruncate() and the fd is then passed to mmap). The object will
exist in name space and can be detected by subsequent calls to shm_open()
with same name. It is not really different from doing open(), but more
portable (mmap() on regular files may not be supported).
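
The shm_open() / ftruncate() / mmap() sequence described above can be sketched roughly as follows. This assumes a POSIX system with shared memory objects available (older glibc versions need -lrt at link time). The helper name named_shm_map is hypothetical, and the fstat() check is an addition of this sketch so that an existing object is not resized again.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: create (or open) the named shared memory object
 * `name`, make sure it is at least `size` bytes, and map it.
 * Returns NULL on failure.
 */
static void *named_shm_map(const char *name, size_t size)
{
    assert(name != NULL && size > 0);

    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return NULL;

    /* Grow the object only if it is smaller than requested. */
    struct stat st;
    if (fstat(fd, &st) != 0 ||
        (st.st_size < (off_t) size && ftruncate(fd, (off_t) size) != 0)) {
        close(fd);
        return NULL;
    }

    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);

    return addr == MAP_FAILED ? NULL : addr;
}
```

A second call with the same name maps the same object, which is how an unrelated process could find it. The name stays in the name space until shm_unlink() is called.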

Actually, I think the best shared memory implementation would be
MAP_ANON | MAP_SHARED mmap(), which could be called from the postmaster
and passed to child processes.

While all our platforms have mmap(), many don't have MAP_ANON, but those
that do could use it. You need MAP_ANON to prevent the shared memory
from being written to a disk file.
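
A rough sketch of the MAP_ANON | MAP_SHARED idea, assuming a platform that defines MAP_ANON: the parent (standing in for the postmaster) maps the region before fork(), so the child inherits the same mapping without any file or name involved. The function name anon_shared_roundtrip is hypothetical and exists only for this illustration.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANON on glibc under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo: the parent maps one int of anonymous shared memory,
 * a forked child stores `value` into it, and the parent returns what it
 * reads back after the child exits. Returns -1 on any error.
 */
static int anon_shared_roundtrip(int value)
{
    assert(value >= 0); /* -1 is reserved for errors */

    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_ANON | MAP_SHARED, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        /* Child: the mapping was inherited across fork(). */
        *shared = value;
        _exit(0);
    }

    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        result = *shared;

    munmap(shared, sizeof *shared);
    return result;
}
```

Because the region has no name, a process that was not forked from the parent cannot attach to it; that is the trade-off against the shm_open() approach.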

--
Bruce Momjian                        |  http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610) 853-3000
+  If your life is a hard drive,     |  830 Blythe Avenue
+  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026