HEADS UP: Win32/OS2/BeOS native ports
Morning all ...
Just a heads up that over the next little while, I'm planning on
making a bunch of commits to get the code working natively in the
above environments ... my work will mostly focus on Win32
(since I have no OS2/BeOS installs), but a lot of the changes will
benefit the others as well ...
The initial changes will be to just wrapper all our shared memory
code, so that I can make use of Apache's libapr libraries *if* they are
installed ... if not, it will just fall back to "the current code" ...
"Marc G. Fournier" wrote:
Morning all ...
Just a heads up that over the next little while, I'm planning on
making a bunch of commits in order to work on making the code able to work
natively in the above environments ... my work will mostly focus on Win32
(since I have no OS2/BeOS installs), but alot of the changes will be such
that it will benefit the others as well ...
The initial changes will be to just wrapper all our shared memory
code, so that I can make use of Apache's libapr libraries *if* they are
installed ... if not, it will just fall back to "the current code" ...
If you want any assistance, drop me an email. I spent a long time (> decade)
doing Windows applications and drivers and know a good number of the cool
tricks.
On Fri, 3 May 2002, mlw wrote:
If you want any assistance, drop me an email. I spent a long time (> decade)
doing Windows applications and drivers and know a good number of the cool
tricks.
hrmmmm ... do you have a working Windows development environment? I'm
running WinXP at home, but don't have any of the compilers or anything
yet, so all my work for the first part is going to be done under Unix ...
but someone that knows something about building makefiles for Windows, and
compiling under it, will definitely be a major asset ;)
Will there really be a need for BeOS development with the sale of Be to
Palm? Is BeOS even still available? It might not be worth the time to
develop for BeOS until you see what Palm decides to do with the software.
On Fri, 3 May 2002, Travis Hoyt wrote:
Will there really be a need for a BeOS development with the sale of Be to
Palm? Is BeOS even still available? It might not be worth the time to
develop for BeOS until you see what Palm decides to do with the software.
Note that the changes I'm making are to make use of what is available
through the libapr API that the Apache group has developed ... so, as long
as they have the hooks in for BeOS, we will ... that doesn't mean PgSQL will
actually have makefiles for BeOS, or compile under it, unless someone
*with* BeOS steps forward, but a lot of the core functionality that has
held back native ports should work ...
"Marc G. Fournier" <scrappy@hub.org> writes:
The initial changes will be to just wrapper all our shared memory
code, so that I can make use of Apache's libapr libraries *if* they are
installed ... if not, it will just fall back to "the current code" ...
I think we should redesign the shared memory API (and even more so the
semaphore API), not just put a wrapper layer on it. A lot of the
internal API is unnecessarily dependent on SysV shmem/sem behavior.
Note however that there are some things you will break if you are not
very careful. We are depending on shmem/sem behavior to catch a number
of multiple-postmaster conflict situations. If there's not a more or
less SysV-ish kernel underneath us, those situations will have to be
rethought and some other interlock invented.
In short, I want to see a design review first, not a bunch of
off-the-cuff commits.
regards, tom lane
Hi Marc,
How about using Dev-C++?
It's a Windows IDE with a GCC backend, and has a nice rep (and a Linux
port):
http://sourceforge.net/projects/dev-cpp/
It's always in SF.net's "Top 10" most worked on projects too, with
roughly 7,000 downloads per day. It can generate MinGW code too.
:-)
Regards and best wishes,
Justin Clift
--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi
On Fri, 3 May 2002, Tom Lane wrote:
"Marc G. Fournier" <scrappy@hub.org> writes:
The initial changes will be to just wrapper all our shared memory
code, so that I can make use of Apache's libapr libraries *if* they are
installed ... if not, it will just fall back to "the current code" ...
I think we should redesign the shared memory API (and even more so the
semaphore API), not just put a wrapper layer on it. A lot of the
internal API is unnecessarily dependent on SysV shmem/sem behavior.
Note however that there are some things you will break if you are not
very careful. We are depending on shmem/sem behavior to catch a number
of multiple-postmaster conflict situations. If there's not a more or
less SysV-ish kernel underneath us, those situations will have to be
rethought and some other interlock invented.
In short, I want to see a design review first, not a bunch of
off-the-cuff commits.
All I'm planning on doing is replacing the appropriate shm_* functions with
pg_shm_* functions ... if !(libapr), all those pg_shm_* functions will
have in them is the original call we've always used ... there will even be
a --disable-libapr configure option so that if someone already has Apache2
installed, but doesn't wanna use libapr for PgSQL, they don't have to ...
Basically, all I'm looking at is allowing PgSQL to use a different library
for its shared memory calls than the standard one, nothing else ...
"Marc G. Fournier" <scrappy@hub.org> writes:
All I'm planning on doing is replacing the appropriate shm_* functions with
pg_shm_* functions ... if !(libapr), all those pg_shm_* functions will
have in them is the original call we've always used ... there will even be
a --disable-libapr configure option so that if someone already has Apache2
installed, but doesn't wanna use libapr for PgSQL, they don't have to ...
Basically, all I'm looking at is allowing PgSQL to use a different library
for its shared memory calls than the standard one, nothing else ...
Oh. I guess my next question is how closely that Apache library
emulates the SysV shmem semantics. In particular, can you reliably
tell how many processes are attached to a shmem block? (Cf
SharedMemoryIsInUse() in storage/ipc/ipc.c) Without that feature we
have an interlock problem.
regards, tom lane
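The attach-count check asked about above can be illustrated with plain SysV calls. This is only a minimal sketch, not the code in storage/ipc/ipc.c: the helper name segment_has_attachers() and the demo function are hypothetical, and treating every shmctl() failure as "not in use" is a simplifying assumption (a real interlock would have to decide what errors such as EACCES mean).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper: true if at least one process is attached to the
 * SysV segment shm_id. Any shmctl() failure is treated as "not in use",
 * which is a simplification made for this sketch.
 */
static bool
segment_has_attachers(int shm_id)
{
    struct shmid_ds info;

    assert(shm_id >= 0);
    if (shmctl(shm_id, IPC_STAT, &info) < 0)
        return false;
    return info.shm_nattch > 0;     /* kernel-maintained attach count */
}

/* Demo: the count is nonzero while attached and zero after detaching. */
static int
attach_count_demo(void)
{
    int   id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    void *addr;
    bool  while_attached;
    bool  after_detach;

    if (id < 0)
        return -1;
    addr = shmat(id, NULL, 0);
    if (addr == (void *) -1)
    {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }
    while_attached = segment_has_attachers(id);
    shmdt(addr);
    after_detach = segment_has_attachers(id);
    shmctl(id, IPC_RMID, NULL);     /* remove the segment again */
    return (while_attached && !after_detach) ? 0 : 1;
}
```

Any replacement shared memory library would need to offer an equivalent of shm_nattch for this kind of interlock to keep working.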
Will investigate this ... my immediate goal is to just get it so that an
alternate library can be used ... default behaviour will be to stick with
our current function calls ... to use libapr, you will/would have to use a
configure option for it (sorry, meant --enable above, not --disable) ...
The only '#ifdef's I'm planning on for this will be in a central shmem.*
file(s), so there isn't going to be a string of those all over the place
or anything stupid like that ...
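A rough sketch of what such a central wrapper file might look like follows. Everything in it is an assumption made for illustration: the names PgShmSegment, pg_shm_create() and pg_shm_destroy(), and the USE_LIBAPR symbol standing in for whatever the configure switch would define. Only the SysV fallback branch is filled in; the libapr branch is deliberately left out rather than guessed at.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical handle used by callers instead of raw SysV ids. */
typedef struct PgShmSegment
{
    int     id;         /* SysV segment id (fallback implementation) */
    void   *addr;       /* address the segment is attached at */
    size_t  size;
} PgShmSegment;

#ifdef USE_LIBAPR
/* A libapr-backed pg_shm_create()/pg_shm_destroy() would go here. */
#error "libapr branch is not part of this sketch"
#else

/* Create and attach a segment; returns 0 on success, -1 on failure. */
static int
pg_shm_create(PgShmSegment *seg, size_t size)
{
    assert(seg != NULL);
    seg->id = shmget(IPC_PRIVATE, size, IPC_CREAT | IPC_EXCL | 0600);
    if (seg->id < 0)
        return -1;
    seg->addr = shmat(seg->id, NULL, 0);
    if (seg->addr == (void *) -1)
    {
        shmctl(seg->id, IPC_RMID, NULL);
        return -1;
    }
    seg->size = size;
    return 0;
}

/* Detach and remove the segment; returns 0 on success, -1 on failure. */
static int
pg_shm_destroy(PgShmSegment *seg)
{
    assert(seg != NULL);
    if (shmdt(seg->addr) < 0)
        return -1;
    return shmctl(seg->id, IPC_RMID, NULL);
}

#endif   /* USE_LIBAPR */

/* Demo: callers only ever see the pg_shm_* names. */
static int
pg_shm_demo(void)
{
    PgShmSegment seg;
    int          ok;

    if (pg_shm_create(&seg, 4096) < 0)
        return -1;
    strcpy((char *) seg.addr, "hello");
    ok = (strcmp((char *) seg.addr, "hello") == 0);
    if (pg_shm_destroy(&seg) < 0)
        return -1;
    return ok ? 0 : 1;
}
```

Keeping the conditional compilation in one file means the rest of the tree calls pg_shm_* unconditionally.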
Tom Lane wrote:
Oh. I guess my next question is how closely that Apache library
emulates the SysV shmem semantics. In particular, can you reliably
tell how many processes are attached to a shmem block? (Cf
SharedMemoryIsInUse() in storage/ipc/ipc.c) Without that feature we
have an interlock problem.
I am not familiar with the Apache code, but I see no reason why all the
features in SysV SHM should not be implementable in a Windows module. IMHO
that's what should be done.
"Marc G. Fournier" wrote:
Will investigate this ... my immediate goal is to just get it so that an
alternate library can be used ... default behaviour will be to stick with
our current function calls ... to use libapr, you will/would have to use a
configure option for it (sorry, meant --enable above, not --disable) ...
The only '#ifdef's I'm planning on for this will be in a central shmem.*
file(s), so there isn't going to be a string of those all over the place
or anything stupid like that ...
I think that you should create a verbatim implementation of the SysV shared
memory API in native Win32. It may have to be a pgsysvshm.dll or something like
it, but I think it is the best possible approach.
Let me look at it, I may be able to have something pretty quick.
Tom Lane wrote:
mlw <markw@mohawksoft.com> writes:
I think that you should create a verbatim implementation of the SysV
shared memory API in native Win32. It may have to be a pgsysvshm.dll
or something like it, but I think it is the best possible approach.
Let me look at it, I may be able to have something pretty quick.
The notion of redesigning the internal API shouldn't be forgotten,
though. I'm not so dissatisfied with the shmem API (mainly because
it's only relevant at startup; once we've created and attached the
shmem segment, we're done worrying about it). But the SysV semaphore
API is really kind of ugly, and the ugliness doesn't buy anything except
porting difficulty. Moreover, putting a cleaner API layer there would
make it easier to experiment with cheaper semaphore primitives, such
as POSIX mutexes.
There was a thread last fall concerning redesigning that code --- I've
forgotten the guy's name, but IIRC he wanted to make a port to QNX6,
and the sema code was getting in the way. We put the work on hold
because we were getting close to 7.2 release (or thought we were,
anyway) but the project ought to be taken up again.
I will commit to writing a Windows version of whatever shm/semaphore/mutex
code you guys specify.
mlw <markw@mohawksoft.com> writes:
I think that you should create a verbatim implementation of the SysV
shared memory API in native Win32. It may have to be a pgsysvshm.dll
or something like it, but I think it is the best possible approach.
Let me look at it, I may be able to have something pretty quick.
The notion of redesigning the internal API shouldn't be forgotten,
though. I'm not so dissatisfied with the shmem API (mainly because
it's only relevant at startup; once we've created and attached the
shmem segment, we're done worrying about it). But the SysV semaphore
API is really kind of ugly, and the ugliness doesn't buy anything except
porting difficulty. Moreover, putting a cleaner API layer there would
make it easier to experiment with cheaper semaphore primitives, such
as POSIX mutexes.
There was a thread last fall concerning redesigning that code --- I've
forgotten the guy's name, but IIRC he wanted to make a port to QNX6,
and the sema code was getting in the way. We put the work on hold
because we were getting close to 7.2 release (or thought we were,
anyway) but the project ought to be taken up again.
regards, tom lane
sysv shm/sem
I am writing a Win32 DLL implementation of:
int semget(key_t key, int nsems, int semflg);
int semctl(int semid, int semnum, int cmd, union semun arg);
int semop(int semid, struct sembuf * sops, unsigned nsops);
int shmctl(int shmid, int cmd, struct shmid_ds *buf);
int shmget(key_t key, int size, int shmflg);
void * shmat(int shmid, const void *shmaddr, int shmflg);
int shmdt(const void *shmaddr);
I will donate it to PostgreSQL.
UNIX permissions will be ignored, i.e. uid/gid will be 0
Do you see any need for the msgxxx calls?
Is the function ipc() ever used?
mlw <markw@mohawksoft.com> writes:
UNIX permissions will be ignored, i.e. uid/gid will be 0
Win32 has no security anyway, right? ;-)
Do you see any need for the msgxxx calls?
Is the function ipc() ever used?
Nope, and nope.
regards, tom lane
mlw <markw@mohawksoft.com> writes:
I think that you should create a verbatim implementation of the SysV
shared memory API in native Win32. It may have to be a pgsysvshm.dll
or something like it, but I think it is the best possible approach.
Let me look at it, I may be able to have something pretty quick.
The notion of redesigning the internal API shouldn't be forgotten,
though. I'm not so dissatisfied with the shmem API (mainly because
it's only relevant at startup; once we've created and attached the
shmem segment, we're done worrying about it). But the SysV semaphore
API is really kind of ugly, and the ugliness doesn't buy anything except
porting difficulty. Moreover, putting a cleaner API layer there would
make it easier to experiment with cheaper semaphore primitives, such
as POSIX mutexes.
There was a thread last fall concerning redesigning that code --- I've
forgotten the guy's name, but IIRC he wanted to make a port to QNX6,
That would be me.
and the sema code was getting in the way. We put the work on hold
because we were getting close to 7.2 release (or thought we were,
anyway) but the project ought to be taken up again.
Yes, I intend to give it another spin soon. I think it is a bad idea to
impose SysV ugliness on systems which have better solutions. The main problem
with SysV primitives is that they are 'sticky' (i.e., the system does not
clean them up when the process dies or exits). So Postgres has to deal with
issues like discovering leftovers, finding unused IPC keys, etc. It is
inelegant and takes up a lot of code. POSIX primitives are anonymous and
cleaned up automatically. So you just say 'give me a semaphore' and you get
it, nothing gets in your way.
The performance of POSIX mutexes and semaphores (on platforms where they are
implemented properly) is also better than that of SysV semaphores.
Unfortunately some systems have rather lame POSIX support, for example
semaphores and mutexes can't be shared across processes on Linux. That's
basically the reason why people keep sticking to SysV.
What really needs to be done is a new abstraction layer which would cover the
SysV API, POSIX and whatever native APIs are better for BeOS/OS2/Win32. I
almost did it last time...
-- igor
mlw <markw@mohawksoft.com> writes:
I am writing a Win32 DLL implementation of :
int semget(key_t key, int nsems, int semflg);
int semctl(int semid, int semnum, int cmd, union semun arg);
int semop(int semid, struct sembuf * sops, unsigned nsops);
Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)
After looking over the uses of these functions, I believe that we could
easily develop a non-SysV-centric internal API. Here's a first cut:
1. Define a struct type PGSemaphore that has implementation-specific
contents (the generic code will never look inside it). Operations on
semaphores will take "PGSemaphore *" arguments. When implementing atop
SysV semaphores, PGSemaphore will contain two fields, the semaphore id
and semaphore number. In other cases the contents could be different.
2. All PGSemaphore structs will be physically stored in shared memory.
This doesn't matter for SysV support, where the id/number are constants
anyway; but it will allow implementations based on mutexes.
3. The operations needed are
* Reserve semaphores. This will be told the number of semaphores
needed. On SysV it will do the necessary semget()s, but on some
implementations it might be a no-op. This should also be prepared
to clean up after a failed postmaster, if it is possible for sema
resources to outlive the creating postmaster.
* Create semaphore. Given a pointer to an uninitialized PGSemaphore
struct, initialize it to a new semaphore with count 1. (On SysV this
would hand out the individual semas previously allocated by Reserve.)
Note that this is not responsible for allocating the memory occupied
by the PGSemaphore struct --- I envision the structs being part of
larger objects such as PROC structures.
* Release semaphores. Release all resources allocated by previous
Reserve and Create operations. This is called when shutting down
or when resetting shared memory after a backend crash.
* Reset semaphore. Reset an existing PGSemaphore to count zero.
* Lock semaphore. Identical to current IpcSemaphoreLock(), except
parameter is a PGSemaphore *. See code of that routine for detailed
semantics.
* Unlock semaphore. Identical to current IpcSemaphoreUnlock(), except
parameter is a PGSemaphore *.
* Conditional lock semaphore. Identical to current
IpcSemaphoreTryLock(), except parameter is a PGSemaphore *.
Reserve/create/release would all be called in the postmaster process,
so they could communicate via malloc'd private memory (eg, an array
of semaphore IDs would be needed in the SysV case). The remaining
operations would be invokable by any backend.
Comments?
I'd be willing to work on refactoring the existing SysV-based code
to meet this spec.
regards, tom lane
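The operation list above might map onto SysV semaphores roughly as follows. This is a sketch under stated assumptions, not the code that was later committed: the function names are invented from the operation names, only one semaphore set is reserved, and the demo keeps its PGSemaphore in local memory rather than in shared memory as the proposal requires. The union sketch_semun is declared here because not every <sys/sem.h> defines union semun.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/sem.h>

union sketch_semun { int val; struct semid_ds *buf; unsigned short *array; };

/* Opaque to generic code; on SysV it is just (set id, index in set). */
typedef struct PGSemaphore { int semId; int semNum; } PGSemaphore;

/* Postmaster-private bookkeeping (a single set keeps the sketch short). */
static int reservedSetId = -1;
static int reservedCount = 0;
static int nextFree = 0;

/* Reserve: allocate one SysV set holding n semaphores. */
static int
PGReserveSemaphores(int n)
{
    reservedSetId = semget(IPC_PRIVATE, n, IPC_CREAT | 0600);
    if (reservedSetId < 0)
        return -1;
    reservedCount = n;
    nextFree = 0;
    return 0;
}

/* Create: hand out the next reserved semaphore with count 1. */
static int
PGSemaphoreCreate(PGSemaphore *sema)
{
    union sketch_semun arg;

    assert(sema != NULL);
    if (nextFree >= reservedCount)
        return -1;
    sema->semId = reservedSetId;
    sema->semNum = nextFree++;
    arg.val = 1;
    return semctl(sema->semId, sema->semNum, SETVAL, arg);
}

/* Reset: force the count to zero. */
static int
PGSemaphoreReset(PGSemaphore *sema)
{
    union sketch_semun arg;

    arg.val = 0;
    return semctl(sema->semId, sema->semNum, SETVAL, arg);
}

/* Shared worker for lock/unlock/trylock; retries if interrupted. */
static int
sem_adjust(PGSemaphore *sema, short delta, short flags)
{
    struct sembuf op;

    op.sem_num = (unsigned short) sema->semNum;
    op.sem_op = delta;
    op.sem_flg = flags;
    while (semop(sema->semId, &op, 1) < 0)
    {
        if (errno != EINTR)
            return -1;
    }
    return 0;
}

static int  PGSemaphoreLock(PGSemaphore *s)   { return sem_adjust(s, -1, 0); }
static int  PGSemaphoreUnlock(PGSemaphore *s) { return sem_adjust(s, 1, 0); }
/* Conditional lock: true if acquired, false if it would have blocked. */
static bool PGSemaphoreTryLock(PGSemaphore *s)
{ return sem_adjust(s, -1, IPC_NOWAIT) == 0; }

/* Release: drop everything Reserve and Create allocated. */
static void
PGReleaseSemaphores(void)
{
    if (reservedSetId >= 0)
        semctl(reservedSetId, 0, IPC_RMID);
    reservedSetId = -1;
    reservedCount = nextFree = 0;
}

/* Demo: exercise each operation once. */
static int
pgsema_demo(void)
{
    PGSemaphore s;
    int         rc = 1;

    if (PGReserveSemaphores(4) < 0)
        return -1;
    if (PGSemaphoreCreate(&s) == 0          /* count 1 */
        && PGSemaphoreTryLock(&s)           /* 1 -> 0 */
        && !PGSemaphoreTryLock(&s)          /* would block */
        && PGSemaphoreUnlock(&s) == 0       /* 0 -> 1 */
        && PGSemaphoreLock(&s) == 0         /* 1 -> 0 */
        && PGSemaphoreReset(&s) == 0        /* stays 0 */
        && !PGSemaphoreTryLock(&s))
        rc = 0;
    PGReleaseSemaphores();
    return rc;
}
```

A mutex-based or Win32 implementation would keep the same function signatures and change only the struct contents and the bodies.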
"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:
What really need to be done is new abstraction layer which would cover SysV
API, POSIX and whatever native APIs are better for BeOS/OS2/Win32. I almost
did it last time...
Yes. I just sent off a proposal for a cleaner semaphore API --- please
comment on it.
My inclination is to stick with the SysV API for shared memory, however.
The "stickiness" is actually not a bad thing for us in the shared memory
case, because it allows a new postmaster to detect the situation where
old backends are still running: it can see that there is an old shmem
segment still present with attached processes. Without that, we have no
good defense against the scenario where an old postmaster dumped core
leaving backends still running. The backends are fine as long as they
are left to finish out their operations, or even killed with whatever
degree of prejudice the admin wants. But what we must *not* do is allow
a new postmaster to start while the old backends are still running;
that would mean two sets of backends running without contact with each
other, which would be fatal for data integrity. The SysV API lets us
detect that case, but I don't see any equally good way to do it if we
are using anonymous shared memory.
regards, tom lane
Like I told Marc, I don't care. You spec out what you want and I'll write it
for Windows.
That being said, a SysV IPC interface for native Windows would be kind of cool
to have.
"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:
What really need to be done is new abstraction layer which would cover SysV
API, POSIX and whatever native APIs are better for BeOS/OS2/Win32. I almost
did it last time...
Yes. I just sent off a proposal for a cleaner semaphore API --- please
comment on it.
I will look. I remember from my last attempt that it actually did not
involve a lot of changes in your existing abstraction layer (which already
exists, just being SysV-centric). I believe only one function prototype had
to be changed... Your proposal sounds like more changes will be needed...
My inclination is to stick with the SysV API for shared memory, however.
The "stickiness" is actually not a bad thing for us in the shared memory
case, because it allows a new postmaster to detect the situation where
old backends are still running: it can see that there is an old shmem
segment still present with attached processes. Without that, we have no
good defense against the scenario where an old postmaster dumped core
leaving backends still running. The backends are fine as long as they
are left to finish out their operations, or even killed with whatever
degree of prejudice the admin wants. But what we must *not* do is allow
a new postmaster to start while the old backends are still running;
that would mean two sets of backends running without contact with each
other, which would be fatal for data integrity. The SysV API lets us
detect that case, but I don't see any equally good way to do it if we
are using anonymous shared memory.
It does not have to be anonymous. POSIX also defines the shm_open() API (same
arguments as open()), which creates a named object in whatever location
corresponds to shared memory storage on that platform (the object is then
grown to the needed size by ftruncate() and the fd is passed to mmap()). The
object exists in a name space and can be detected by subsequent calls to
shm_open() with the same name. It is not really different from doing open(),
but more portable (mmap() on regular files may not be supported).
I suggest we do an IPC abstraction which would cover shared memory as well as
semaphores, otherwise it will be only half of a solution - platforms without
SysV API would still have to emulate SysV shared memory.
-- igor
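The shm_open() sequence described above can be sketched like this. The helper named_shm_create() and the object name used in the demo are made up for illustration. The sketch only shows that a second creation attempt under the same name is detected; it does not show how many processes have the object mapped.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a named POSIX shared memory object of the
 * given size and map it. Returns NULL on failure; errno is EEXIST when an
 * object with that name already exists.
 */
static void *
named_shm_create(const char *name, size_t size)
{
    int   fd;
    void *addr;

    assert(name != NULL && name[0] == '/');
    fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t) size) < 0)    /* grow to the needed size */
    {
        close(fd);
        shm_unlink(name);
        return NULL;
    }
    addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    if (addr == MAP_FAILED)
    {
        shm_unlink(name);
        return NULL;
    }
    return addr;
}

/* Demo: a second create under the same name fails with EEXIST. */
static int
named_shm_demo(void)
{
    char  name[64];
    void *first;
    void *second;
    int   saved_errno;

    snprintf(name, sizeof(name), "/pg_sketch_%ld", (long) getpid());
    first = named_shm_create(name, 4096);
    if (first == NULL)
        return -1;
    errno = 0;
    second = named_shm_create(name, 4096);
    saved_errno = errno;
    munmap(first, 4096);
    shm_unlink(name);           /* remove the name again */
    return (second == NULL && saved_errno == EEXIST) ? 0 : 1;
}
```

On some older C libraries shm_open() lives in librt, so the program may need to be linked with -lrt.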
"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:
It does not have to be anonymous. POSIX also defines shm_open(same arguments
as open) API which will create named object in whatever location corresponds
to shared memory storage on that platform (object is then grown to needed
size by ftruncate() and the fd is then passed to mmap). The object will
exist in name space and can be detected by subsequent calls to shm_open()
with same name. It is not really different from doing open(), but more
portable (mmap() on regular files may not be supported).
Yes, but can you detect whether other processes have the same file open?
regards, tom lane
On Fri, 3 May 2002, Tom Lane wrote:
But what we must *not* do is allow a new postmaster to start while the
old backends are still running; that would mean two sets of backends
running without contact with each other, which would be fatal for data
integrity. The SysV API lets us detect that case, but I don't see any
equally good way to do it if we are using anonymous shared memory.
It's a hack (and has slight security implications), but you
could just allow the postgres backends to keep the listening
socket(s) open.
Matthew.
-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane
Sent: Friday, May 03, 2002 6:07 PM
To: mlw
Cc: Marc G. Fournier; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports
Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)
Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?
- J.
"Joel Burton" <joel@joelburton.com> writes:
Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)
Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?
Was the problem just with semas, or was shmem an issue too?
In any case, unless someone actually writes an alternative sema
implementation that will work on BSD, nothing will happen...
regards, tom lane
Matthew Kirkwood <matthew@hairy.beasts.org> writes:
On Fri, 3 May 2002, Tom Lane wrote:
The SysV API lets us detect that case, but I don't see any
equally good way to do it if we are using anonymous shared memory.
It's a hack (and has slight security implications), but you
could just allow the postgres backends to keep the listening
socket(s) open.
Hmm. That might be workable, but it feels shaky to me. The problem
is that you are using a lock based on port number to interlock a data
directory --- and port number and data directory are independently
variable parameters. Consider
$ postmaster -D /my/dir &
-- dba thinks "oops, forgot to specify port"
$ kill -9 pm-pid # bad idea
$ postmaster -D /my/dir -p myport &
Any backends started by the first postmaster will not be noticed by
the second one, if the interlock is based on port number.
We could get around this, of course: record the port number in the data
directory lockfile, and test for existence of the old socket
independently of trying to create a new one. But it seems ugly.
regards, tom lane
I have just committed changes to create a platform-independent internal
API for semaphores, along the lines discussed yesterday.
At this point, the Darwin (Mac OS X), BeOS, and QNX4 ports are probably
broken. I will fix the Darwin port (probably not till tomorrow though);
volunteers to clean up the BeOS and QNX4 ports are needed.
BTW, there is a quick hack attempt at a POSIX-semaphore-based
implementation in src/backend/port/posix_sema.c. I have not tested
this yet, but expect to do so as part of fixing the Darwin port.
regards, tom lane
"Joel Burton" <joel@joelburton.com> writes:
Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)
Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?
Was the problem just with semas, or was shmem an issue too?
Not sure -- doesn't get far enough for me to tell. initdb dies with:
creating template1 database in /usr/local/pgsql/data/base/1...
IpcSemaphoreCreate: semget(key=1, num=17, 03600) failed:
Function not implemented
In any case, unless someone actually writes an alternative sema
implementation that will work on BSD, nothing will happen...
Was hoping that the discussions about the APR might let this work under BSD
jails, assuming I can get the APR to compile.
(For others: apparently PG will work under BSD jails if you recompile the
BSD kernel w/some new settings, but my ISP for this project was unwilling to
do that. Search the mailing list for messages on how to do this.)
J.
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Friday, May 03, 2002 3:07 PM
To: mlw
Cc: Marc G. Fournier; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports
mlw <markw@mohawksoft.com> writes:
I am writing a Win32 DLL implementation of :
int semget(key_t key, int nsems, int semflg);
int semctl(int semid, int semnum, int cmd, union semun arg);
int semop(int semid, struct sembuf * sops, unsigned nsops);
Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)
After looking over the uses of these functions, I believe that we could
easily develop a non-SysV-centric internal API. Here's a first cut:
1. Define a struct type PGSemaphore that has implementation-specific
contents (the generic code will never look inside it). Operations on
semaphores will take "PGSemaphore *" arguments. When implementing atop
SysV semaphores, PGSemaphore will contain two fields, the semaphore id
and semaphore number. In other cases the contents could be different.
2. All PGSemaphore structs will be physically stored in shared memory.
This doesn't matter for SysV support, where the id/number are constants
anyway; but it will allow implementations based on mutexes.
3. The operations needed are
* Reserve semaphores. This will be told the number of semaphores
needed. On SysV it will do the necessary semget()s, but on some
implementations it might be a no-op. This should also be prepared
to clean up after a failed postmaster, if it is possible for sema
resources to outlive the creating postmaster.
* Create semaphore. Given a pointer to an uninitialized PGSemaphore
struct, initialize it to a new semaphore with count 1. (On SysV this
would hand out the individual semas previously allocated by Reserve.)
Note that this is not responsible for allocating the memory occupied
by the PGSemaphore struct --- I envision the structs being part of
larger objects such as PROC structures.
* Release semaphores. Release all resources allocated by previous
Reserve and Create operations. This is called when shutting down
or when resetting shared memory after a backend crash.
* Reset semaphore. Reset an existing PGSemaphore to count zero.
* Lock semaphore. Identical to current IpcSemaphoreLock(), except
parameter is a PGSemaphore *. See code of that routine for detailed
semantics.
* Unlock semaphore. Identical to current IpcSemaphoreUnlock(), except
parameter is a PGSemaphore *.
* Conditional lock semaphore. Identical to current
IpcSemaphoreTryLock(), except parameter is a PGSemaphore *.
Reserve/create/release would all be called in the postmaster process,
so they could communicate via malloc'd private memory (eg, an array
of semaphore IDs would be needed in the SysV case). The remaining
operations would be invokable by any backend.
Comments?
I'd be willing to work on refactoring the existing SysV-based code
to meet this spec.
It's already been done. Here is a freely available C++ implementation
(licensing similar to PostgreSQL):
http://www.cs.wustl.edu/~schmidt/ACE.html
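For comparison, the internal API proposed in the quoted message could be sketched atop SysV semaphores along these lines. The function names are placeholders chosen for illustration (the proposal names only the struct type PGSemaphore), a single semaphore set is assumed for brevity, and cleanup after a failed postmaster is left out. It is a sketch under those assumptions, not the code that was committed.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Linux leaves the semctl() argument union to the caller; a private name
 * avoids clashing with systems whose headers already define union semun. */
union pg_semun { int val; struct semid_ds *buf; unsigned short *array; };

/* Opaque to generic code; intended to be stored in shared memory. */
typedef struct PGSemaphore { int semId; int semNum; } PGSemaphore;

/* Postmaster-private bookkeeping (assumption: one SysV set is enough). */
static int reservedId = -1, reservedCount = 0, nextFree = 0;

int PGReserveSemaphores(int count)
{
    reservedId = semget(IPC_PRIVATE, count, IPC_CREAT | 0600);
    if (reservedId < 0)
        return -1;
    reservedCount = count;
    nextFree = 0;
    return 0;
}

/* Hand out the next reserved semaphore, initialized to count 1. */
int PGSemaphoreCreate(PGSemaphore *sema)
{
    union pg_semun arg;

    if (nextFree >= reservedCount)
        return -1;
    sema->semId = reservedId;
    sema->semNum = nextFree++;
    arg.val = 1;
    return semctl(sema->semId, sema->semNum, SETVAL, arg);
}

/* Reset an existing semaphore to count zero. */
int PGSemaphoreReset(PGSemaphore *sema)
{
    union pg_semun arg;

    arg.val = 0;
    return semctl(sema->semId, sema->semNum, SETVAL, arg);
}

/* Apply one semop(), retrying if interrupted by a signal. */
static int sem_adjust(PGSemaphore *sema, short delta, short flags)
{
    struct sembuf op;
    int rc;

    op.sem_num = (unsigned short) sema->semNum;
    op.sem_op = delta;
    op.sem_flg = flags;
    do {
        rc = semop(sema->semId, &op, 1);
    } while (rc < 0 && errno == EINTR);
    return rc;
}

int PGSemaphoreLock(PGSemaphore *sema)   { return sem_adjust(sema, -1, 0); }
int PGSemaphoreUnlock(PGSemaphore *sema) { return sem_adjust(sema, 1, 0); }

/* Returns 1 if acquired, 0 if it would have blocked, -1 on error. */
int PGSemaphoreTryLock(PGSemaphore *sema)
{
    if (sem_adjust(sema, -1, IPC_NOWAIT) == 0)
        return 1;
    return (errno == EAGAIN) ? 0 : -1;
}

/* Release everything obtained by Reserve/Create. */
int PGReleaseSemaphores(void)
{
    int rc = (reservedId >= 0) ? semctl(reservedId, 0, IPC_RMID) : 0;

    reservedId = -1;
    reservedCount = 0;
    nextFree = 0;
    return rc;
}
```

Because callers only ever see "PGSemaphore *", a port without SysV IPC could swap in different struct contents without touching the generic code.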
"Joel Burton" <joel@joelburton.com> writes:
Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?
Was the problem just with semas, or was shmem an issue too?
Not sure -- doesn't get far enough for me to tell. initdb dies with:
creating template1 database in /usr/local/pgsql/data/base/1...
IpcSemaphoreCreate: semget(key=1, num=17, 03600) failed:
Function not implemented
We create shared memory before semaphores, so if you got this far then
the shmem code is probably working (at least minimally).
Do you have working sem_open or sem_init (ie, POSIX semaphores)?
regards, tom lane
Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)
Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?
I have postgresql working quite happily in FreeBSD jails! (Just make sure
you go "sysctl jail.sysvipc_allowed=1").
Chris
(For others: apparently PG will work under BSD jails if you recompile the
BSD kernel w/some new settings, but my ISP for this project was unwilling to
do that. Search the mailing list for messages on how to do this.)
Works fine. You don't need to recompile - just use the sysctl.
Chris
Marc G. Fournier wrote:
hrmmmm ... do you have a working Windows development environment? I'm
running WinXP at home, but don't have any of the compilers or anything
yet, so all my work for the first part is going to be done under Unix ...
but someone that knows something about building makefiles for Windows, and
compiling under it, will definitely be a major asset ;)
I think if you are familiar with make and gcc (and perhaps autoconf),
MinGW and MSys are the development environment of choice on Windows. You
even get /bin/sh. But the generated program does not depend on any
custom library (like cygwin does). It's even possible to cross compile
from a Linux box (actually powerpc in my case).
Look at http://mingw.sourceforge.net (and there for msys).
Christof
Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)
Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?
I have postgresql working quite happily in FreeBSD jails! (Just make sure
you go "sysctl jail.sysvipc_allowed=1").
Yep, Alastair D'Silva helpfully pointed this out a month or two ago, and for
many people, this would be a workable solution. Unfortunately, it appears
that you have to run this command outside the jail, which I don't have
access to.
I forwarded the suggestion to my ISP (imeme, a Zope provider), who said
that:
"This will allow you to run a single postgres in a single jail only one
user would have access to it. If you try to run more than one it will
try to use the same shared memory and crash."
And therefore they refused to make the change. (More annoyingly, they kept
trying to convince me that I should quit my whining and use MySQL since it's
"ACID compliant").
So, I'm holding out hope that since this ISP seems unenlightened, one day
PostgreSQL will simply run in BSD jails without a cooperating jailmaster,
and it sounded like using the APR _might_ make this possible. (All of my
other projects use PG; I'd sure love to get this one switched over!)
Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant
I forwarded the suggestion to my ISP (imeme, a Zope provider), who said
that:
"This will allow you to run a single postgres in a single jail only one
user would have access to it. If you try to run more than one it will
try to use the same shared memory and crash."
Not true. But I'll avoid digging up any more on that old issue...
Chris
On Sat, 4 May 2002, Joel Burton wrote:
-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane
Sent: Friday, May 03, 2002 6:07 PM
To: mlw
Cc: Marc G. Fournier; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports
Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)
Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?
There is no problem with SysV IPC in the jail, per se ... jails were just
not coded to delimit/segregate such IPC from other jails ... it's one of
those "caveat emptor" situations ... you can do it, but at your own
risk, as someone in another jail has the ability to 'attach' to your
segments ...
-----Original Message-----
From: Christopher Kings-Lynne [mailto:chriskl@familyhealth.com.au]
Sent: Monday, May 06, 2002 7:36 AM
To: Joel Burton; Tom Lane; mlw
Cc: Marc G. Fournier; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] HEADS UP: Win32/OS2/BeOS native ports
I forwarded the suggestion to my ISP (imeme, a Zope provider), who said
that:
"This will allow you to run a single postgres in a single jail only one
user would have access to it. If you try to run more than one it will
try to use the same shared memory and crash."
Not true. But I'll avoid digging up any more on that old issue...
Oh, I'm sure it's not true. But sometimes things end up on the "nyah, nyah,
it's my server and I say so" level. Sigh.
So, I guess that's where it leaves me: waiting for some solution other than
ISP cluefulness. :-)
- J.
Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton
Knowledge Management & Technology Consultant
On Sat, 4 May 2002, Tom Lane wrote:
Matthew Kirkwood <matthew@hairy.beasts.org> writes:
On Fri, 3 May 2002, Tom Lane wrote:
The SysV API lets us detect that case, but I don't see any
equally good way to do it if we are using anonymous shared memory.
It's a hack (and has slight security implications), but you
could just allow the postgres backends to keep the listening
socket(s) open.
Hmm. That might be workable, but it feels shaky to me. The problem
is that you are using a lock based on port number to interlock a data
directory --- and port number and data directory are independently
variable parameters. Consider
$ postmaster -D /my/dir &
-- dba thinks "oops, forgot to specify port"
$ kill -9 pm-pid # bad idea
$ postmaster -D /my/dir -p myport &
Any backends started by the first postmaster will not be noticed by
the second one, if the interlock is based on port number.
We could get around this, of course: record the port number in the data
directory lockfile, and test for existence of the old socket
independently of trying to create a new one. But it seems ugly.
How about a second, data directory based socket simply named something
like '.inuse', that is not port dependent?
On Sun, 5 May 2002, Joel Burton wrote:
"Joel Burton" <joel@joelburton.com> writes:
Rather than propagating the SysV semaphore API still further, why don't
we kill it now? (I'm willing to keep the shmem API, however.)
Would this have the benefit of allowing PostgreSQL to work properly in BSD
jails, since lack of really working SysV IPC was the problem there?
Was the problem just with semas, or was shmem an issue too?
Not sure -- doesn't get far enough for me to tell. initdb dies with:
creating template1 database in /usr/local/pgsql/data/base/1...
IpcSemaphoreCreate: semget(key=1, num=17, 03600) failed:
Function not implemented
Read the jail manpage:
jail.sysvipc_allowed
This MIB entry determines whether or not processes within a jail
have access to System V IPC primitives. In the current jail implementation,
System V primitives share a single namespace across the
host and jail environments, meaning that processes within a jail
would be able to communicate with (and potentially interfere with)
processes outside of the jail, and in other jails. As such, this
functionality is disabled by default, but can be enabled by setting
this MIB entry to 1.
Or changing ISPs to a place more enlightened ...
"Marc G. Fournier" <scrappy@hub.org> writes:
We could get around this, of course: record the port number in the data
directory lockfile, and test for existence of the old socket
independently of trying to create a new one. But it seems ugly.
How about a second, data directory based socket simply named something
like '.inuse', that is not port dependent?
Hmm ... but how do you use that to tell if there are still backends
around?
regards, tom lane
On Mon, 6 May 2002, Tom Lane wrote:
"Marc G. Fournier" <scrappy@hub.org> writes:
We could get around this, of course: record the port number in the data
directory lockfile, and test for existence of the old socket
independently of trying to create a new one. But it seems ugly.
How about a second, data directory based socket simply named something
like '.inuse', that is not port dependent?
Hmm ... but how do you use that to tell if there are still backends
around?
As a backend is started up, connect to that socket ... if socket is open
when trying to start a new frontend, fail as there are currently other
connections attached to it?
"Marc G. Fournier" <scrappy@hub.org> writes:
Hmm ... but how do you use that to tell if there are still backends
around?
As a backend is started up, connect to that socket ... if socket is open
when trying to start a new frontend, fail as there are currently other
connections attached to it?
But the backends would only have the socket open, they'd not be actively
listening to it. So how could you tell whether anyone had the socket
open or not?
ISTM we gave up on exactly that technique for the main postmaster's
socket; we now create a separate lockfile to protect the socket, and
don't rely on the socket itself to give us any interlocking help at all.
But the lockfile just contains the postmaster's PID, so it's no help
in detecting the case where the old postmaster has gone away but there
are still orphaned backends laying about.
I'm not entirely thrilled with the lockfile technique; it'd be nice to
find something better. (In particular, we've seen a couple cases now
where people had trouble with PG refusing to start after a system
reboot, because some other daemon process had been assigned the PID
that the postmaster had in its previous incarnation; so the lockfile
check code mistakenly thinks there's still an old postmaster.) But
so far, the only thing worse than lockfiles is everything else :-(
regards, tom lane
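The weakness of the PID lockfile check described above can be seen in a small C sketch. The helper names and the one-line file format are assumptions made for illustration; the point is only that kill(pid, 0) reports whether some process has that PID, not whether it is a postmaster.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

/* Record a PID in the lock file (hypothetical one-line format). */
int lockfile_write(const char *path, long pid)
{
    FILE *f = fopen(path, "w");

    if (f == NULL)
        return -1;
    fprintf(f, "%ld\n", pid);
    return (fclose(f) == 0) ? 0 : -1;
}

/*
 * Returns 1 if a process with the recorded PID exists, 0 if none does,
 * -1 if the file cannot be read.  A return of 1 does not prove that the
 * process is the old postmaster: after a reboot any daemon may own the PID.
 */
int lockfile_pid_alive(const char *path)
{
    FILE *f = fopen(path, "r");
    long pid = 0;
    int ok;

    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%ld", &pid) == 1 && pid > 0);
    fclose(f);
    if (!ok)
        return -1;

    /* Signal 0 performs the existence and permission checks only. */
    if (kill((pid_t) pid, 0) == 0)
        return 1;
    /* EPERM: the process exists but belongs to another user. */
    return (errno == EPERM) ? 1 : 0;
}
```

A lock file that happens to contain the PID of an unrelated long-running process (PID 1, say) is reported as live, which is the false positive people hit after a reboot.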
I said:
But the backends would only have the socket open, they'd not be actively
listening to it. So how could you tell whether anyone had the socket
open or not?
Oh, I take that back, I see how you could do it: the postmaster opens
the socket *for writing*, but never actually writes. All its child
processes inherit that same open file descriptor and just keep it
around. Then, to tell if anyone's home, you open the socket *for
reading* and try to read in O_NONBLOCK mode. You get an EOF indication
if and only if no one has the socket open for writing; otherwise you
get an EAGAIN error.
That would work ... but is it more portable than depending on SysV
shmem connection counts? ISTR that some of the platforms we support
don't have Unix-style sockets at all.
regards, tom lane
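A rough C sketch of the trick just described, assuming the object is a FIFO created with mkfifo() (the EOF-versus-EAGAIN behaviour is that of a pipe). The function names are hypothetical, and the temporary reader in fifo_hold_writer is a workaround for the fact that a non-blocking write-only open fails with ENXIO while no reader exists.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Postmaster side: obtain a write descriptor on the FIFO and keep it.
 * Children inherit the descriptor across fork(); nobody ever writes.
 */
int fifo_hold_writer(const char *path)
{
    int rfd, wfd;

    if (mkfifo(path, 0600) < 0 && errno != EEXIST)
        return -1;
    /* Open a temporary reader so the non-blocking writer open succeeds. */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd < 0)
        return -1;
    wfd = open(path, O_WRONLY | O_NONBLOCK);
    close(rfd);
    return wfd;
}

/*
 * Probe: returns 1 if some process still holds the FIFO open for writing,
 * 0 if nobody does (or the FIFO does not exist), -1 on unexpected results.
 */
int fifo_has_writer(const char *path)
{
    char c;
    ssize_t n;
    int fd, result;

    fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return (errno == ENOENT) ? 0 : -1;

    n = read(fd, &c, 1);
    if (n == 0)
        result = 0;             /* EOF: no writers remain */
    else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        result = 1;             /* empty pipe, but a writer exists */
    else
        result = -1;            /* data or another error: not expected */
    close(fd);
    return result;
}
```

The sketch ignores the race between two processes starting at the same moment, so it would still need to be combined with an ordinary lock file.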
On Mon, 6 May 2002, Tom Lane wrote:
I said:
But the backends would only have the socket open, they'd not be actively
listening to it. So how could you tell whether anyone had the socket
open or not?
Oh, I take that back, I see how you could do it: the postmaster opens
the socket *for writing*, but never actually writes. All its child
processes inherit that same open file descriptor and just keep it
around. Then, to tell if anyone's home, you open the socket *for
reading* and try to read in O_NONBLOCK mode. You get an EOF indication
if and only if no one has the socket open for writing; otherwise you
get an EAGAIN error.
That would work ... but is it more portable than depending on SysV
shmem connection counts? ISTR that some of the platforms we support
don't have Unix-style sockets at all.
Wouldn't the same thing work with a simple file? Does it have to be a
UnixDomainSocket?
"Marc G. Fournier" <scrappy@hub.org> writes:
That would work ... but is it more portable than depending on SysV
shmem connection counts? ISTR that some of the platforms we support
don't have Unix-style sockets at all.
Wouldn't the same thing work with a simple file? Does it have to be a
UnixDomainSocket?
No, and yes. If it's not a pipe/fifo then you don't get the
EOF-only-when-no-possible-writers-remain behavior. TCP and UDP
sockets don't show this sort of behavior either. So AFAICS we
really need a named pipe, ie, socket.
We could maybe do something approximately similar with TCP connection
attempts (per the prior suggestion of letting backends hold the
postmaster's listen socket open; then see if you get "connection
refused" or a timeout from trying to connect) but I don't think it'd be
as trustworthy. Simple mistakes like overly aggressive ipchains filters
would confuse this kind of test.
regards, tom lane
Since our default behavior (at startup) is to have TCP sockets disabled,
how many OSs are there that don't support UD sockets? Enough to really be
worried about?
"Marc G. Fournier" <scrappy@hub.org> writes:
Since our default behavior (at startup) is to have TCP sockets disabled,
how many OSs are there that don't support UD sockets?
A quick look in the sources shows that we #undef HAVE_UNIX_SOCKETS for
QNX, BeOS, and old cygwin versions ... which are exactly the platforms
that don't have SysV shmem support, so those are exactly the guys who
we're trying to fix the problem for.
I do like the idea of using a Unix socket this way where available,
though. It'd let us switch over the shmem code to using IPC_PRIVATE
shmem key, which'd simplify that code tremendously; and we could make
some progress against the dead-PID-in-lockfile problem.
Could we get away with saying that the Unix-socket-less platforms have
weaker protection against mistakenly restarting the postmaster? We
could have a plain-vanilla lockfile instead of a socket lockfile on
those platforms, which would not catch the dead-postmaster-live-backends
case, but it'd be better than nothing. And I am not convinced that the
shmem-connection-count check should be trusted on QNX or BeOS, anyway,
so I'm not sure that they actually have a functioning check now.
regards, tom lane
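For reference, the shmem-connection-count check mentioned above comes down to reading shm_nattch from shmctl(IPC_STAT). The sketch below assumes the new postmaster already knows the old segment's id (how it learns that id is outside the sketch), and the function name is made up for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Returns 1 if the old segment still has processes attached (orphaned
 * backends may be running), 0 if it is gone or nobody is attached.
 */
int old_segment_in_use(int old_shmid)
{
    struct shmid_ds ds;

    if (shmctl(old_shmid, IPC_STAT, &ds) < 0) {
        /* EINVAL/EIDRM: the segment no longer exists, so it is safe. */
        if (errno == EINVAL || errno == EIDRM)
            return 0;
        /* Anything else (e.g. EACCES): be conservative and refuse. */
        return 1;
    }
    return ds.shm_nattch > 0;
}
```

With an IPC_PRIVATE key there is nothing to look the segment up by after a crash, which is why that simplification depends on having some other interlock.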
Tom Lane wrote:
I said:
But the backends would only have the socket open, they'd not be actively
listening to it. So how could you tell whether anyone had the socket
open or not?
Oh, I take that back, I see how you could do it: the postmaster opens
the socket *for writing*, but never actually writes. All its child
processes inherit that same open file descriptor and just keep it
around. Then, to tell if anyone's home, you open the socket *for
reading* and try to read in O_NONBLOCK mode. You get an EOF indication
if and only if no one has the socket open for writing; otherwise you
get an EAGAIN error.
That would work ... but is it more portable than depending on SysV
shmem connection counts? ISTR that some of the platforms we support
don't have Unix-style sockets at all.
I think what you describe is a named pipe, not a socket. The
underlying implementation might be a socketpair, but the
behaviour of named pipes is exactly that since Version 7 at
least. This worked under Minix already.
regards, tom lane
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
"Marc G. Fournier" <scrappy@hub.org> writes:
Since our default behavior (at startup) is to have TCP sockets disabled,
how many OSs are there that don't support UD sockets?
A quick look in the sources shows that we #undef HAVE_UNIX_SOCKETS for
QNX, BeOS, and old cygwin versions ... which are exactly the platforms
that don't have SysV shmem support, so those are exactly the guys who
we're trying to fix the problem for.
Next release of QNX (6.2) will add support for UDS, but they are still not
quite portable.
I do like the idea of using a Unix socket this way where available,
though. It'd let us switch over the shmem code to using IPC_PRIVATE
shmem key, which'd simplify that code tremendously; and we could make
some progress against the dead-PID-in-lockfile problem.
Could we get away with saying that the Unix-socket-less platforms have
weaker protection against mistakenly restarting the postmaster? We
could have a plain-vanilla lockfile instead of a socket lockfile on
those platforms, which would not catch the dead-postmaster-live-backends
case, but it'd be better than nothing. And I am not convinced that the
shmem-connection-count check should be trusted on QNX or BeOS, anyway,
so I'm not sure that they actually have a functioning check now.
Why can't we use named pipe (aka FIFO file) instead of UDS? I think that is
more portable... The socketpair() function also tends to be more portable
than whole UDS in general... It works on QNX4 even, but not sure about BeOS.
Another thought is, why can't we use bind() to the postmaster port to detect
other postmasters? I might be missing something, so pardon my ignorance. But
should not bind() to same port fail with EADDRINUSE unless SO_REUSEADDR is
set? I don't really know if it is set in postgres or not ...
-- igor
"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:
Could we get away with saying that the Unix-socket-less platforms have
weaker protection against mistakenly restarting the postmaster?
Why can't we use named pipe (aka FIFO file) instead of UDS?
That's exactly what I'm talking about.
Another thought is, why can't we use bind() to the postmaster port to detect
other postmasters?
Because port number and data directory are independent parameters. The
interlock on port number is not related to the interlock on data
directory.
regards, tom lane
On Mon, 6 May 2002, Tom Lane wrote:
As a backend is started up, connect to that socket ... if socket is open
when trying to start a new frontend, fail as there are currently other
connections attached to it?
But the backends would only have the socket open, they'd not be
actively listening to it. So how could you tell whether anyone
had the socket open or not?
It's easy. At startup, the postmaster (or standalone
backend) creates a Unix socket, binds it to the filename
and calls listen on it.
If another backend is running, it'll get EADDRINUSE from
the bind or listen.
Nobody actually needs to connect to the socket. Simple,
race-free, 10 lines of code.
Matthew.
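The bind-and-listen interlock described above might look like this in C. The function name and return codes are assumptions for illustration. Note one caveat that the description glosses over: on most systems bind() also fails with EADDRINUSE when only a leftover socket file exists, so the error by itself does not prove that a live process holds the socket.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Try to claim the socket file at 'path'.
 * Returns the listening descriptor on success, -2 if the address is
 * already in use, -1 on any other error.
 */
int claim_socket_lock(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        int in_use = (errno == EADDRINUSE);

        close(fd);
        return in_use ? -2 : -1;
    }
    /* Nobody needs to connect; listening just marks the socket as live. */
    if (listen(fd, 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Deciding whether an existing file is stale would need a further step, such as attempting to connect() to it.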
Matthew Kirkwood <matthew@hairy.beasts.org> writes:
Nobody actually needs to connect to the socket. Simple,
race-free, 10 lines of code.
... and we already do it. But it protects the port number, not
the data directory.
regards, tom lane
On Tue, 7 May 2002, Tom Lane wrote:
Nobody actually needs to connect to the socket. Simple,
race-free, 10 lines of code.
... and we already do it. But it protects the port number, not
the data directory.
If I understood him correctly, Marc was suggesting a further
domain socket inside the data directory.
Matthew.
Matthew Kirkwood <matthew@hairy.beasts.org> writes:
... and we already do it. But it protects the port number, not
the data directory.
If I understood him correctly, Marc was suggesting a further
domain socket inside the data directory.
Right, and that would work because we would reference it as
$PGDATA/.socket --- exact, one-to-one correspondence between data
directory and interlock file. A TCP socket isn't going to have any
such direct connection to the data directory.
We could try to make such a connection (eg, pick a free port number at
random, and record the number in a lockfile in $PGDATA). But that will
suffer from a bunch of failure modes, starting with the same one that's
been biting us for PID interlocking: after a system restart, someone
else may hold the port number that we chose at random last time.
Basically, the reason that we want this interlock is because we are
going after five-nines kind of reliability. An interlock technology
that's not itself five-nines reliable isn't going to make things better.
regards, tom lane
Just a friendly reminder that it should be named pipe rather than UDS ;)
-- igor
On Tue, 7 May 2002, Igor Kovalenko wrote:
Just a friendly reminder that it should be named pipe rather than UDS
;)
Named pipes don't have the required syntax. Perhaps for
platforms which have neither SysV shm, something like
POSIX named semaphores are the way forward.
Matthew.
Can you be more specific? What required syntax? I was talking about named
pipe vs UDS socket...
"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:
I was talking about named pipe vs UDS socket...
Aren't those the same thing? You get a socket file either way.
regards, tom lane
"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:
I was talking about named pipe vs UDS socket...
Aren't those the same thing? You get a socket file either way.
On QNX a named pipe will have type 'FIFO file', which indeed has similar
features to a socket but is implemented differently; that is not the point, though. On
SysV derivatives they all will be implemented as 2 connected STREAMS heads.
On BSD they both will be same thing. Not sure about other systems. The UDS
API however was originally limited to BSD4.3 and only later started to
spread, whereas named pipes have been around longer and probably exist in
any Unix variant and probably other types of systems.
-- igor
On Wed, 8 May 2002, Igor Kovalenko wrote:
Can you be more specific? What required syntax? I was talking about
named pipe vs UDS socket...
Sorry, I meant semantics.
A pipe can have multiple readers and multiple writers. This is
no use for us.
A listening SOCK_STREAM Unix domain socket can have no readers or
writers, but only one listener (well, except that other processes
can inherit or be passed the socket). You have to connect() (and
the server must accept()) before read and write do anything. But
we have no use for that here. It's just an exclusive-only mutex
whose namespace is the filesystem.
It really is like a TCP socket, except that the address namespace
is the filesystem, and thus it's not available remotely.
Think of it as a TCP socket without the "which address and port
do I use, and how do I keep it secure" issues.
Matthew.
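The "exclusive-only mutex whose namespace is the filesystem" idea can be sketched roughly as follows. This is only an illustration, assuming a POSIX system with AF_UNIX support; the helper names uds_lock_acquire and uds_lock_release are hypothetical and do not come from the PostgreSQL source.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: take the "lock" named by a filesystem path by
 * binding a listening Unix domain socket to it. Returns the socket fd,
 * or -1 with errno set (EADDRINUSE if the name is already taken). */
int uds_lock_acquire(const char *path)
{
    struct sockaddr_un addr;
    int fd, saved;

    assert(path != NULL);
    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    /* bind() creates the socket file; it fails if the name exists. */
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    if (listen(fd, 1) < 0) {
        saved = errno;
        close(fd);
        unlink(path);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Hypothetical helper: drop the lock and remove the filesystem entry. */
void uds_lock_release(int fd, const char *path)
{
    close(fd);
    unlink(path);
}
```

One caveat with this sketch: a socket file left behind by a crashed holder still makes bind() fail, so a real implementation would presumably have to probe the existing socket (for instance with connect()) before deciding that the name is stale.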
Ahh... you want a named semaphore... There is such a thing in POSIX, but the
names are only portable if they begin with "/" (which tells the OS to put them
where appropriate). I believe without a leading slash they end up in the
current directory, but we can't rely on that... too bad. Glad UDS is getting
supported on my platform, lol ;)
This will however leave QNX4 in the dust, if anyone cares. And most likely
BeOS, MP/X and half a dozen other platforms. Which makes me wonder whether it
would be better to come up with a platform-independent 'namespace sync'
mechanism. Can't we use an fcntl()-based lock for that purpose? That's
apparently what Apache does (in one of its variants).
-- igor
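For reference, an fcntl()-based lock of the kind suggested above might look like the sketch below. It assumes POSIX advisory record locks; the helper name fcntl_lock_acquire is hypothetical, and this is not taken from Apache or PostgreSQL.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: take an exclusive advisory lock on the whole of
 * `path` without blocking. Returns the open fd (the lock is held until
 * the fd is closed or the process exits), or -1 with errno set
 * (EACCES or EAGAIN if another process already holds the lock). */
int fcntl_lock_acquire(const char *path)
{
    struct flock fl;
    int fd, saved;

    assert(path != NULL);
    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;      /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;             /* 0 means "through end of file" */

    if (fcntl(fd, F_SETLK, &fl) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

Two POSIX behaviours are worth keeping in mind with this approach: the lock belongs to the process, so it is not inherited by children across fork(), and closing any descriptor the process holds for the file releases it.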
"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:
Can't we use fcntl()-based lock for that purpose?
I'm pretty sure that fcntl locking has an evil reputation as well.
(Didn't we use that up till a couple years ago, and give up on it?)
regards, tom lane
Tom Lane wrote:
"Igor Kovalenko" <Igor.Kovalenko@motorola.com> writes:
I was talking about named pipe vs UDS socket...
Aren't those the same thing? You get a socket file either way.
No they are not. The former is a FIFO file, the latter a
socket. FIFOs can be used via open(2), sockets via
connect(2). And as said before, FIFOs have been there since UNIX
Version 7 (at least, I haven't been around before that). So
there is a good chance that these are available on every
UNIX.
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
Igor Kovalenko wrote:
It does not have to be anonymous. POSIX also defines shm_open(same arguments
as open) API which will create named object in whatever location corresponds
to shared memory storage on that platform (object is then grown to needed
size by ftruncate() and the fd is then passed to mmap). The object will
exist in name space and can be detected by subsequent calls to shm_open()
with same name. It is not really different from doing open(), but more
portable (mmap() on regular files may not be supported).
Actually, I think the best shared memory implementation would be
MAP_ANON | MAP_SHARED mmap(), which could be called from the postmaster
and passed to child processes.
While all our platforms have mmap(), many don't have MAP_ANON, but those
that do could use it. You need MAP_ANON to prevent the shared memory
from being written to a disk file.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
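The MAP_ANON | MAP_SHARED idea could be sketched as below: the parent creates the mapping and a forked child sees the same memory. This is a minimal illustration assuming a platform that provides MAP_ANON or MAP_ANONYMOUS; the function names are hypothetical and the parent and child only stand in for the postmaster and a backend.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON /* older BSD spelling */
#endif

/* Hypothetical helper: allocate `size` bytes of zero-filled memory that
 * stays shared with every child forked afterwards. NULL on failure. */
void *shared_region_create(size_t size)
{
    void *p;

    assert(size > 0);
    p = mmap(NULL, size, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Demonstration: the parent creates the region, a forked child writes
 * to it, and the parent reads the value back. Returns that value, or
 * -1 on error. */
int shared_region_demo(void)
{
    int *cell = shared_region_create(sizeof(int));
    pid_t pid;
    int status, value;

    if (cell == NULL)
        return -1;
    *cell = 0;

    pid = fork();
    if (pid < 0) {
        munmap(cell, sizeof(int));
        return -1;
    }
    if (pid == 0) {
        *cell = 42;            /* child writes into the shared page */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid) {
        munmap(cell, sizeof(int));
        return -1;
    }
    value = *cell;             /* parent observes the child's write */
    munmap(cell, sizeof(int));
    return value;
}
```

Note that this relies on fork() inheritance: an anonymous mapping has no name, so a process that is not a descendant of the creator has no way to attach to it.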
mlw wrote:
Like I told Marc, I don't care. You spec out what you want and I'll write it
for Windows.
That being said, a SysV IPC interface for native Windows would be kind of cool
to have.
I am wondering why we don't just use the Cygwin shm/sem code in our
project, or maybe the Apache stuff; why bother reinventing the wheel.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
I think it's already been determined that the cygwin option is too low
performing.
However, the apache stuff could be quite useful - but if that effort
were to be undertaken, it would make more sense to move all versions of the
code to the apache runtime, for all platforms. Are there any other runtime
libraries out there that are cross platform, open/free and high performance?
I know the mozilla XPCOM libraries work quite nicely, but are geared more
towards multithreaded apps - and the COM-alike infrastructure is something
we wouldn't need.
~Jon
You might want to go to the archives and catch up on the whole thread and
its digressions :)
Bruce Momjian wrote:
mlw wrote:
Like I told Marc, I don't care. You spec out what you want and I'll write it
for Windows.
That being said, a SysV IPC interface for native Windows would be kind of cool
to have.
I am wondering why we don't just use the Cygwin shm/sem code in our
project, or maybe the Apache stuff; why bother reinventing the wheel.
I have not been participating on the list, I don't know why I'm still receiving
mail.
but! in the course of testing some code, I managed to gain some experience with
cygwin. I have seen fork() problems with a large number of processes.
For PostgreSQL to be as good on Windows as it is on UNIX, it has to be a native
program without cygwin. The shared memory and semaphore management should be
done with the postmaster process.
The apache stuff is OK, it is just as good as anything else. You may be able to
use critical sections in shared memory to implement a fast semaphore, but that
would take a bit of experimentation.
I think what Tom had in mind is to take out the SysV and various OS specific
APIs and replace them with a more generic one, behind which, you guys can tune
the implementation.
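As a rough POSIX-side analogue of the "fast lock living in shared memory" idea, the sketch below puts a C11 atomic_flag spinlock inside an anonymous shared mapping and lets several forked processes increment a counter under it. This is not a Win32 critical section and not code from PostgreSQL or Apache; the struct and function names are hypothetical, and it assumes C11 atomics plus MAP_ANON/MAP_ANONYMOUS are available.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

struct shared_counter {
    atomic_flag lock;   /* the spinlock itself lives in shared memory */
    long value;         /* protected by `lock` */
};

static void spin_acquire(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;               /* busy-wait; real code would back off or sleep */
}

static void spin_release(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Fork `nprocs` children that each add 1 to the counter `iters` times.
 * Returns the final counter value, or -1 on error. */
long shared_counter_demo(int nprocs, int iters)
{
    struct shared_counter *sc;
    long result;
    int i, j, forked = 0, ok = 1;

    assert(nprocs > 0 && iters >= 0);
    sc = mmap(NULL, sizeof(*sc), PROT_READ | PROT_WRITE,
              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (sc == MAP_FAILED)
        return -1;
    atomic_flag_clear(&sc->lock);
    sc->value = 0;

    for (i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            ok = 0;
            break;
        }
        if (pid == 0) {
            for (j = 0; j < iters; j++) {
                spin_acquire(&sc->lock);
                sc->value++;
                spin_release(&sc->lock);
            }
            _exit(0);
        }
        forked++;
    }
    for (i = 0; i < forked; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) ||
            WEXITSTATUS(status) != 0)
            ok = 0;
    }
    result = ok ? sc->value : -1;
    munmap(sc, sizeof(*sc));
    return result;
}
```

A pure spinlock like this has an obvious weakness: if the holder dies while holding the lock, every other process spins forever, which is one reason a production design would likely pair it with a kernel-level wait primitive.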
Yes, I am having trouble figuring out if I have seen the whole thread yet.
---------------------------------------------------------------------------
Marc G. Fournier wrote:
You might want to go to the archives and catch up on the whole thread and
its digressions :)
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Bruce,
On Sun, Jun 02, 2002 at 08:49:21PM -0400, Bruce Momjian wrote:
mlw wrote:
Like I told Marc, I don't care. You spec out what you want and I'll write it
for Windows.
That being said, a SysV IPC interface for native Windows would be kind of
cool to have.
I am wondering why we don't just use the Cygwin shm/sem code in our
project, or maybe the Apache stuff; why bother reinventing the wheel.
Are you referring to cygipc above? If so, then even one of the original
cygipc authors would discourage this:
http://sources.redhat.com/ml/cygwin-apps/2001-09/msg00017.html
Specifically, Ludovic Lange states the following:
I really think the solution would be to start again from scratch
another implementation, as was suggested. The way we did it was
quick and dirty, the goals weren't to have production systems
running on it but only to run prototypes. So the internal design
(if there is any) may not be adequate for the cygwin project.
However, Rob Collins has contributed a MinGW daemon to Cygwin to support
switching users, System V IPC, etc. So, this code base may be a more
suitable starting point to satisfy PostgreSQL's native Win32 System V
IPC needs.
Jason
On Sun, Jun 02, 2002 at 09:33:57PM -0400, mlw wrote:
but! in the course of testing some code, I managed to gain some experience
with cygwin. I have seen fork() problems with a large number of processes.
Since Cygwin's fork() is implemented with WaitForMultipleObjects(),
it has a limitation of only 63 children per parent. Also, there can
be DLL base address conflicts (causing Cygwin fork() to fail) that are
avoidable by rebasing the appropriate DLLs. AFAICT, Cygwin PostgreSQL is
currently *not* affected by this issue whereas other Cygwin applications
such as Python and Apache are.
Jason
Jason Tishler wrote:
Since Cygwin's fork() is implemented with WaitForMultipleObjects(),
it has a limitation of only 63 children per parent. Also, there can
be DLL base address conflicts (causing Cygwin fork() to fail) that are
avoidable by rebasing the appropriate DLLs. AFAICT, Cygwin PostgreSQL is
currently *not* affected by this issue whereas other Cygwin applications
such as Python and Apache are.
Why would not PostgreSQL be affected by this?
Jason Tishler wrote:
Since Cygwin's fork() is implemented with WaitForMultipleObjects(),
it has a limitation of only 63 children per parent. Also, there can
be DLL base address conflicts (causing Cygwin fork() to fail) that are
avoidable by rebasing the appropriate DLLs. AFAICT, Cygwin PostgreSQL is
currently *not* affected by this issue whereas other Cygwin applications
such as Python and Apache are.
Whatever technical problems there are, we can debate on and
on if it's worth working around them in PostgreSQL or fixing
them in CygWIN or whatever.
The main problem will remain. That using PostgreSQL under
CygWIN requires some UNIX know how. So a pure Windows
user/shop needs UNIX knowledge to run our "Windows port" of
PostgreSQL? Interesting definition of "port".
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
Hi,
You may want to have a look at: http://www.garret.ru/~knizhnik/
You will find there code for "Fast synchronized access to shared
memory for Windows and for i86 Unix-es".
kind regards,
Robert
Hi,
Some of you might already know GOODS, programmed
almost entirely by Konstantin Knizhnik - if not you should
really have a look at it right now (be warned: consuming this
extraordinary work might change your expectations about the
required quality of a 'good programmer' forever. At least
this happened to me... ;):
http://www.garret.ru/~knizhnik/goods.html
Some core features of this backend (as they come to my mind):
-> full ACID transaction support
-> distributed storage management (->distributed transactions)
-> multiple reader/single writer (is this called MVCC within PostgreSQL?)
-> dual client side object cache
-> online backup (snapshot backup AND permanent backup)
-> nested transactions on object level
-> transaction isolation levels on object level
-> object level shared and exclusive locks
-> excellent C++ programming interface
-> WAL
-> garbage collection for no longer referenced database objects
-> fully thread safe client interface
-> JAVA client API
-> very high performance as a result of a lot of fine tuning
-> asynchronous event notification on object instance modification
-> extremely high code quality
-> a one person effort, hence a very clean design
-> the most relevant platforms are supported out of the box
-> complete build is done in less than a minute on my machine
-> it's documented
...
The licensing of this coding wonder: >>> PUBLIC DOMAIN <<<
I've been using GOODS for quite a while now in the context of my
development activities for a native XML database and have had
very promising experiences concerning the performance and
stability of GOODS. E.g.: The performance seems to be
better than sleepycat's berkeley db library - especially
with multiple simultaneous transactions...
Maybe the only restriction to use this backend in postgres
from now on: it's completely C++ ...
I'm wondering why there is no SQL frontend yet for this
excellent backend...
You may also want to look at a comparison chart of some
backends other than GOODS (some of them from the same
author!!! I'm wondering how he was able to code all this...):
http://www.garret.ru/~knizhnik/compare.html
kind regards,
Robert
On Mon, Jun 03, 2002 at 09:36:51AM -0400, mlw wrote:
Why would not PostgreSQL be affected by this?
Sorry, if I was unclear -- I should have used two paragraphs above and
maybe a few more words... :,)
Cygwin PostgreSQL *is* affected by the Cygwin 63 children per parent
fork limitation.
PostgreSQL *can* be affected by the Cygwin DLL base address conflict
fork issue, but in my experience (both personal and by monitoring the
Cygwin and pgsql-cygwin lists), no one has been affected yet. The DLL
base address conflict is a "probability" thing. The more DLLs loaded
the greater the chance of a conflict (and fork() failing). Since Cygwin
PostgreSQL loads only a few DLLs, this has not become an issue (yet).
Jason
Kostya is a well-qualified programmer. I know him and he is always open to
challenges. Some time ago, Teodor and I asked him about GiST support
in another database of his (Gigabase). It was a sort of challenge (we wanted
to port our contrib/tsearch module) and he did it (using libgist).
We work with the Gigabase database embedded into our application under
Windows (we had a lot of trouble with the performance of postgresql under
Cygwin :-) and are quite happy.
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Jason Tishler wrote:
Sorry, if I was unclear -- I should have used two paragraphs above and
maybe a few more words... :,)
Cygwin PostgreSQL *is* affected by the Cygwin 63 children per parent
fork limitation.
PostgreSQL *can* be affected by the Cygwin DLL base address conflict
fork issue, but in my experience (both personal and by monitoring the
Cygwin and pgsql-cygwin lists), no one has been affected yet. The DLL
base address conflict is a "probability" thing. The more DLLs loaded
the greater the chance of a conflict (and fork() failing). Since Cygwin
PostgreSQL loads only a few DLLs, this has not become an issue (yet).
I'm not sure the DLL load address is a big issue for PostgreSQL, AFAIK no
option DLLs will be loaded by Postmaster. So, with fork() it will be a simple
process. A PostgreSQL child will die upon completion, and never execute fork().
My concern would be the limit on the number of child processes allowed. 63 is
far below what would be considered a usable number in production, and as long
as that is an issue, I don't think anyone would take PostgreSQL seriously.
A Windows version of PostgreSQL must run within the confines of the Windows OS.
The reason, IMHO, that no one has found any serious bugs in the cygwin version,
is because no one is seriously using it. Anyone who *would* seriously use it,
knows better.
That's what Apache does. Note, on most platforms MAP_ANON is equivalent to
mmap-ing /dev/zero. Solaris for example does not provide MAP_ANON but using
fd = open("/dev/zero", O_RDWR);
addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
close(fd);
works just fine.
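Spelled out as a complete function, the /dev/zero variant might look like the sketch below. It assumes a system where /dev/zero exists and can be mapped MAP_SHARED; the helper name devzero_region_create is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: obtain `size` bytes of zero-filled shared memory
 * by mapping /dev/zero, for platforms that lack MAP_ANON.
 * Returns NULL on failure. */
void *devzero_region_create(size_t size)
{
    void *p;
    int fd;

    assert(size > 0);
    fd = open("/dev/zero", O_RDWR);
    if (fd < 0)
        return NULL;
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);   /* the mapping stays valid after the fd is closed */
    return p == MAP_FAILED ? NULL : p;
}
```

As with MAP_ANON, the region has no name of its own, so it is shared only with processes forked after the mapping is created.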