Dynamic Shared Memory stuff

Started by Heikki Linnakangasabout 12 years ago35 messages
#1Heikki Linnakangas
hlinnakangas@vmware.com

I'm trying to catch up on all of this dynamic shared memory stuff. A
bunch of random questions and complaints:

What kind of usage are we trying to cater with the dynamic shared
memory? How many allocations? What size will they have have typically,
minimum and maximum? I looked at the message queue implementation you
posted, but I wonder if that's the use case you're envisioning for this,
or if you have more things in mind.

* dsm_handle is defined in dsm_impl.h, but it's exposed in the function
signatures in dsm.h. ISTM it should be moved to dsm.h

* The DSM API contains functions for resizing the segment. That's not
exercised by the MQ or TOC facilities. Is that going to stay dead code,
or do you envision a user for it?

* dsm_impl_can_resize() incorrectly returns false for DSM_IMPL_MMAP. The
mmap() implementation can resize.

* This is an issue I've seen for some time with git master, while
working on various things. Sometimes, when I kill the server with
CTRL-C, I get this in the log:

^CLOG: received fast shutdown request
LOG: aborting any active transactions
FATAL: terminating connection due to administrator command
LOG: autovacuum launcher shutting down
LOG: shutting down
LOG: database system is shut down
LOG: could not remove shared memory segment "/PostgreSQL.1804289383":
Tiedostoa tai hakemistoa ei ole

(that means ENOENT)

And I just figured out why that happens: If you take a base backup of a
running system, the pg_dynshmem/state file is included in the backup. If
you now start up a standby from the backup on the same system, it will
"clean up" and reuse the dynshmem segment still used by the master
system. Now, when you shut down the master, you get that message in the
log. If the segment was actually used for something, the master would
naturally crash.

* As discussed in the "Something fishy happening on frogmouth" thread, I
don't like the fact that the dynamic shared memory segments will be
permanently leaked if you kill -9 postmaster and destroy the data directory.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#1)
Re: Dynamic Shared Memory stuff

On Wed, Nov 20, 2013 at 8:32 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

I'm trying to catch up on all of this dynamic shared memory stuff. A bunch
of random questions and complaints:

What kind of usage are we trying to cater with the dynamic shared memory?

Parallel sort, and then parallel other stuff. Eventually general
parallel query.

I have recently updated https://wiki.postgresql.org/wiki/Parallel_Sort
and you may find that interesting/helpful as a statement of intent.

How many allocations? What size will they have have typically, minimum and
maximum?

The facility is intended to be general, so the answer could vary
widely by application. The testing that I have done so far suggests
that for message-passing, relatively small queue sizes (a few kB,
perhaps 1 MB at the outside) should be sufficient. However,
applications such as parallel sort could require vast amounts of
shared memory. Consider a machine with 1TB of memory performing a
512GB internal sort. You're going to need 512GB of shared memory for
that.

I looked at the message queue implementation you posted, but I
wonder if that's the use case you're envisioning for this, or if you have
more things in mind.

I consider that to be the first application of dynamic shared memory
and expect it to be used for (1) returning errors from background
workers to the user backend and (2) funneling tuples from one portion
of a query tree that has been split off to run in a background worker
back to the user backend. However, I expect that many clients of the
dynamic shared memory system will want to roll their own data
structures. Parallel internal sort (or external sort) is obviously
one, and in addition to that we might have parallel construction of
in-memory hash tables for a hash join or hash agg, or, well, anything
else you'd like to parallelize. I expect that many of those case will
result in much larger allocations than what we need just for message
passing.

* dsm_handle is defined in dsm_impl.h, but it's exposed in the function
signatures in dsm.h. ISTM it should be moved to dsm.h

Well, dsm_impl.h is the low-level stuff, and dsm.h is intended as the
user API. Unfortunately, whichever file declares that will have to be
included by the other one, so the separation is not as clean as I
would like, but I thought it made more sense for the high-level stuff
to depend on the low-level stuff rather than the other way around.

* The DSM API contains functions for resizing the segment. That's not
exercised by the MQ or TOC facilities. Is that going to stay dead code, or
do you envision a user for it?

I dunno. It certainly seems like a useful thing to be able to do - if
we run out of memory, go get some more. It'd obviously be more useful
if we had a full-fledged allocator for dynamic shared memory, which is
something that I'd like to build but haven't built yet. However,
after discovering that it doesn't work either on Windows or with
System V shared memory, I'm less sanguine about the chances of finding
good uses for it. I haven't completely given up hope, but I don't
have anything concrete in mind at the moment. It'd be a little more
plausible if we adjusted things so that the mmap() implementation
works on Windows.

* dsm_impl_can_resize() incorrectly returns false for DSM_IMPL_MMAP. The
mmap() implementation can resize.

Oops, that's a bug.

* This is an issue I've seen for some time with git master, while working on
various things. Sometimes, when I kill the server with CTRL-C, I get this in
the log:

^CLOG: received fast shutdown request
LOG: aborting any active transactions
FATAL: terminating connection due to administrator command
LOG: autovacuum launcher shutting down
LOG: shutting down
LOG: database system is shut down
LOG: could not remove shared memory segment "/PostgreSQL.1804289383":
Tiedostoa tai hakemistoa ei ole

(that means ENOENT)

And I just figured out why that happens: If you take a base backup of a
running system, the pg_dynshmem/state file is included in the backup. If you
now start up a standby from the backup on the same system, it will "clean
up" and reuse the dynshmem segment still used by the master system. Now,
when you shut down the master, you get that message in the log. If the
segment was actually used for something, the master would naturally crash.

Ooh. Well, pg_basebackup can be fixed not to copy that, but there's
still going to be a problem with old-style base backups. We could try
to figure out some additional sanity check for the dsm code to use, to
determine whether or not it belongs to the same cluster, like storing
the port number or the system identifier or some other value in the
shared memory segment and then comparing it to verify whether we've
got the same one. Or perhaps we could store the PID of the creating
postmaster in there and check whether that PID is still alive,
although we might get confused if the PID has been recycled.

* As discussed in the "Something fishy happening on frogmouth" thread, I
don't like the fact that the dynamic shared memory segments will be
permanently leaked if you kill -9 postmaster and destroy the data directory.

Your test elicited different behavior for the dsm code vs. the main
shared memory segment because it involved running a new postmaster
with a different data directory but the same port number on the same
machine, and expecting that that new - and completely unrelated -
postmaster would clean up the resources left behind by the old,
now-destroyed cluster. I tend to view that as a defect in your test
case more than anything else, but as I suggested previously, we could
potentially change the code to use something like 1000000 + (port *
100) with a forward search for the control segment identifier, instead
of using a state file, mimicking the behavior of the main shared
memory segment. I'm not sure we ever reached consensus on whether
that was overall better than what we have now.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Jeremy Harris
jgh@wizmail.org
In reply to: Robert Haas (#2)
Re: Dynamic Shared Memory stuff

On 20/11/13 19:58, Robert Haas wrote:

Parallel sort, and then parallel other stuff. Eventually general
parallel query.

I have recently updated https://wiki.postgresql.org/wiki/Parallel_Sort
and you may find that interesting/helpful as a statement of intent.

I've been playing with an internal merge sort which may be of interest
in this area of work. While I've not worked on parallelizing the merge
stages it does not feel like an impossible task. More to the immediate
point, the input stage and readout stage can do useful work
overlapped with the data source and sink (respectively).

The current implementation uses the same amount of memory as the
quicksort one, and does approximately the same number of compares
(on random input). The overheads are higher than the quicksort
implementation (up to 50% higher cpu time, on a single key of
random integers).

Its performance shines on partially- or reverse-sorted input.

Transition to (the existing) external sort is implemented, as is
the limited random-access facility, and bounded output.

It supports unique-output. The planner interface is extended to
return an indication of dedup guarantee, and the Unique node uses
this to elide itself at planning time. Dedup operations also
cover external sorts.
--
Cheers,
Jeremy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Peter Geoghegan
pg@heroku.com
In reply to: Jeremy Harris (#3)
Re: Dynamic Shared Memory stuff

On Sat, Nov 23, 2013 at 4:21 PM, Jeremy Harris <jgh@wizmail.org> wrote:

Its performance shines on partially- or reverse-sorted input.

Search the archives for the work I did on timsort support a while
back. A patch was posted, that had some impressive results provided
you just considered the number of comparisons (and not TPS when
sorting text), but at the time my sense was that it didn't have broad
enough applicability for me to pursue further. That doesn't mean the
idea wasn't useful, and it certainly doesn't mean that my rough patch
couldn't be improved upon. For one thing, if there was a type that had
a comparator that was, say, an order of magnitude more expensive than
bttextcmp, it would definitely be a big win.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Robert Haas (#2)
Re: Dynamic Shared Memory stuff

On 11/20/2013 09:58 PM, Robert Haas wrote:

On Wed, Nov 20, 2013 at 8:32 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

How many allocations? What size will they have have typically, minimum and
maximum?

The facility is intended to be general, so the answer could vary
widely by application. The testing that I have done so far suggests
that for message-passing, relatively small queue sizes (a few kB,
perhaps 1 MB at the outside) should be sufficient. However,
applications such as parallel sort could require vast amounts of
shared memory. Consider a machine with 1TB of memory performing a
512GB internal sort. You're going to need 512GB of shared memory for
that.

Hmm. Those two use cases are quite different. For message-passing, you
want a lot of small queues, but for parallel sort, you want one huge
allocation. I wonder if we shouldn't even try a one-size-fits-all solution.

For message-passing, there isn't much need to even use dynamic shared
memory. You could just assign one fixed-sized, single-reader
multiple-writer queue for each backend.

For parallel sort, you'll want to utilize all the available memory and
all CPUs for one huge sort. So all you really need is a single huge
shared memory segment. If one process is already using that 512GB
segment to do a sort, you do *not* want to allocate a second 512GB
segment. You'll want to wait for the first operation to finish first. Or
maybe you'll want to have 3-4 somewhat smaller segments in use at the
same time, but not more than that.

* As discussed in the "Something fishy happening on frogmouth" thread, I
don't like the fact that the dynamic shared memory segments will be
permanently leaked if you kill -9 postmaster and destroy the data directory.

Your test elicited different behavior for the dsm code vs. the main
shared memory segment because it involved running a new postmaster
with a different data directory but the same port number on the same
machine, and expecting that that new - and completely unrelated -
postmaster would clean up the resources left behind by the old,
now-destroyed cluster. I tend to view that as a defect in your test
case more than anything else, but as I suggested previously, we could
potentially change the code to use something like 1000000 + (port *
100) with a forward search for the control segment identifier, instead
of using a state file, mimicking the behavior of the main shared
memory segment. I'm not sure we ever reached consensus on whether
that was overall better than what we have now.

I really think we need to do something about it. To use your earlier
example of parallel sort, it's not acceptable to permanently leak a 512
GB segment on a system with 1 TB of RAM.

One idea is to create the shared memory object with shm_open, and wait
until all the worker processes that need it have attached to it. Then,
shm_unlink() it, before using it for anything. That way the segment will
be automatically released once all the processes close() it, or die. In
particular, kill -9 will release it. (This is a variant of my earlier
idea to create a small number of anonymous shared memory file
descriptors in postmaster startup with shm_open(), and pass them down to
child processes with fork()). I think you could use that approach with
SysV shared memory as well, by destroying the segment with
sgmget(IPC_RMID) immediately after all processes have attached to it.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#5)
Re: Dynamic Shared Memory stuff

On Thu, Dec 5, 2013 at 11:12 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

Hmm. Those two use cases are quite different. For message-passing, you want
a lot of small queues, but for parallel sort, you want one huge allocation.
I wonder if we shouldn't even try a one-size-fits-all solution.

For message-passing, there isn't much need to even use dynamic shared
memory. You could just assign one fixed-sized, single-reader multiple-writer
queue for each backend.

True, although if the queue needs to 1MB, or even 128kB, that would
bloat the static shared-memory footprint over the server pretty
significantly. And I don't know that we know that a small queue will
be adequate in all cases. If you've got a worker backend feeding data
back to the user backend, the size of the queue limits how far ahead
of the user backend that worker can get. Big is good, because then
the user backend won't stall on read, but small is also good, in case
the query is cancelled or hits an error. It is far from obvious to me
that one-size-fits-all is the right solution.

For parallel sort, you'll want to utilize all the available memory and all
CPUs for one huge sort. So all you really need is a single huge shared
memory segment. If one process is already using that 512GB segment to do a
sort, you do *not* want to allocate a second 512GB segment. You'll want to
wait for the first operation to finish first. Or maybe you'll want to have
3-4 somewhat smaller segments in use at the same time, but not more than
that.

This is all true, but it has basically nothing to do with parallelism.
work_mem is a poor model, but I didn't invent it. Hopefully some day
someone will fix it, maybe even me, but that's a separate project.

I really think we need to do something about it. To use your earlier example
of parallel sort, it's not acceptable to permanently leak a 512 GB segment
on a system with 1 TB of RAM.

One idea is to create the shared memory object with shm_open, and wait until
all the worker processes that need it have attached to it. Then,
shm_unlink() it, before using it for anything. That way the segment will be
automatically released once all the processes close() it, or die. In
particular, kill -9 will release it. (This is a variant of my earlier idea
to create a small number of anonymous shared memory file descriptors in
postmaster startup with shm_open(), and pass them down to child processes
with fork()). I think you could use that approach with SysV shared memory as
well, by destroying the segment with sgmget(IPC_RMID) immediately after all
processes have attached to it.

That's a very interesting idea. I've been thinking that we needed to
preserve the property that new workers could attach to the shared
memory segment at any time, but that might not be necessary in all
case. We could introduce a new dsm operation that means "i promise no
one else needs to attach to this segment". Further attachments would
be disallowed by dsm.c regardless of the implementation in use, and
dsm_impl.c would also be given a chance to perform
implementation-specific operations, like shm_unlink and
shmctl(IPC_RMID). This new operation, when used, would help to reduce
the chance of leaks and perhaps catch other programming errors as
well.

What should we call it? dsm_finalize() is the first thing that comes
to mind, but I'm not sure I like that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Robert Haas (#6)
Re: Dynamic Shared Memory stuff

On 12/05/2013 09:34 PM, Robert Haas wrote:

On Thu, Dec 5, 2013 at 11:12 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

One idea is to create the shared memory object with shm_open, and wait until
all the worker processes that need it have attached to it. Then,
shm_unlink() it, before using it for anything. That way the segment will be
automatically released once all the processes close() it, or die. In
particular, kill -9 will release it. (This is a variant of my earlier idea
to create a small number of anonymous shared memory file descriptors in
postmaster startup with shm_open(), and pass them down to child processes
with fork()). I think you could use that approach with SysV shared memory as
well, by destroying the segment with sgmget(IPC_RMID) immediately after all
processes have attached to it.

That's a very interesting idea. I've been thinking that we needed to
preserve the property that new workers could attach to the shared
memory segment at any time, but that might not be necessary in all
case. We could introduce a new dsm operation that means "i promise no
one else needs to attach to this segment". Further attachments would
be disallowed by dsm.c regardless of the implementation in use, and
dsm_impl.c would also be given a chance to perform
implementation-specific operations, like shm_unlink and
shmctl(IPC_RMID). This new operation, when used, would help to reduce
the chance of leaks and perhaps catch other programming errors as
well.

What should we call it? dsm_finalize() is the first thing that comes
to mind, but I'm not sure I like that.

dsm_unlink() would mirror the underlying POSIX shm_unlink() call, and
would be familiar to anyone who understands how unlink() on a file works
on Unix.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#7)
Re: Dynamic Shared Memory stuff

On Thu, Dec 5, 2013 at 4:06 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

That's a very interesting idea. I've been thinking that we needed to
preserve the property that new workers could attach to the shared
memory segment at any time, but that might not be necessary in all
case. We could introduce a new dsm operation that means "i promise no
one else needs to attach to this segment". Further attachments would
be disallowed by dsm.c regardless of the implementation in use, and
dsm_impl.c would also be given a chance to perform
implementation-specific operations, like shm_unlink and
shmctl(IPC_RMID). This new operation, when used, would help to reduce
the chance of leaks and perhaps catch other programming errors as
well.

What should we call it? dsm_finalize() is the first thing that comes
to mind, but I'm not sure I like that.

dsm_unlink() would mirror the underlying POSIX shm_unlink() call, and would
be familiar to anyone who understands how unlink() on a file works on Unix.

OK, let me work on that once this CommitFest is done.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Noah Misch
noah@leadboat.com
In reply to: Heikki Linnakangas (#5)
Re: Dynamic Shared Memory stuff

On Thu, Dec 05, 2013 at 06:12:48PM +0200, Heikki Linnakangas wrote:

On 11/20/2013 09:58 PM, Robert Haas wrote:

On Wed, Nov 20, 2013 at 8:32 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

* As discussed in the "Something fishy happening on frogmouth" thread, I
don't like the fact that the dynamic shared memory segments will be
permanently leaked if you kill -9 postmaster and destroy the data directory.

Your test elicited different behavior for the dsm code vs. the main
shared memory segment because it involved running a new postmaster
with a different data directory but the same port number on the same
machine, and expecting that that new - and completely unrelated -
postmaster would clean up the resources left behind by the old,
now-destroyed cluster. I tend to view that as a defect in your test
case more than anything else, but as I suggested previously, we could
potentially change the code to use something like 1000000 + (port *
100) with a forward search for the control segment identifier, instead
of using a state file, mimicking the behavior of the main shared
memory segment. I'm not sure we ever reached consensus on whether
that was overall better than what we have now.

I really think we need to do something about it. To use your earlier
example of parallel sort, it's not acceptable to permanently leak a 512
GB segment on a system with 1 TB of RAM.

I don't. Erasing your data directory after an unclean shutdown voids any
expectations for a thorough, automatic release of system resources. Don't do
that. The next time some new use of a persistent resource violates your hope
for this scenario, there may be no remedy.

Having said that, I'm not opposed to storing the DSM control segment handle in
PGShmemHeader, which would enable DSM cleanup whenever we find cause to
reclaim a main sysv segment.

One idea is to create the shared memory object with shm_open, and wait
until all the worker processes that need it have attached to it. Then,
shm_unlink() it, before using it for anything. That way the segment will
be automatically released once all the processes close() it, or die. In
particular, kill -9 will release it. (This is a variant of my earlier
idea to create a small number of anonymous shared memory file
descriptors in postmaster startup with shm_open(), and pass them down to
child processes with fork()). I think you could use that approach with
SysV shared memory as well, by destroying the segment with
sgmget(IPC_RMID) immediately after all processes have attached to it.

That leaves a window in which we still leak the segment, and it is less
general: not every use of DSM is conducive to having all processes attach in a
short span of time.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Noah Misch (#9)
Re: Dynamic Shared Memory stuff

On 12/10/2013 07:27 PM, Noah Misch wrote:

On Thu, Dec 05, 2013 at 06:12:48PM +0200, Heikki Linnakangas wrote:

On 11/20/2013 09:58 PM, Robert Haas wrote:

On Wed, Nov 20, 2013 at 8:32 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

* As discussed in the "Something fishy happening on frogmouth" thread, I
don't like the fact that the dynamic shared memory segments will be
permanently leaked if you kill -9 postmaster and destroy the data directory.

Your test elicited different behavior for the dsm code vs. the main
shared memory segment because it involved running a new postmaster
with a different data directory but the same port number on the same
machine, and expecting that that new - and completely unrelated -
postmaster would clean up the resources left behind by the old,
now-destroyed cluster. I tend to view that as a defect in your test
case more than anything else, but as I suggested previously, we could
potentially change the code to use something like 1000000 + (port *
100) with a forward search for the control segment identifier, instead
of using a state file, mimicking the behavior of the main shared
memory segment. I'm not sure we ever reached consensus on whether
that was overall better than what we have now.

I really think we need to do something about it. To use your earlier
example of parallel sort, it's not acceptable to permanently leak a 512
GB segment on a system with 1 TB of RAM.

I don't. Erasing your data directory after an unclean shutdown voids any
expectations for a thorough, automatic release of system resources. Don't do
that. The next time some new use of a persistent resource violates your hope
for this scenario, there may be no remedy.

Well, the point of erasing the data directory is to release system
resources. I would normally expect "killall -9 <process>; rm -rf <data
dir>" to thorougly get rid of the running program and all the resources.
It's surprising enough that the regular shared memory segment is left
behind, but at least that one gets cleaned up when you start a new
server (on same port). Let's not add more cases like that, if we can
avoid it.

BTW, what if the data directory is seriously borked, and the server
won't start? Sure, don't do that, but it would be nice to have a way to
recover if you do anyway. (docs?)

One idea is to create the shared memory object with shm_open, and wait
until all the worker processes that need it have attached to it. Then,
shm_unlink() it, before using it for anything. That way the segment will
be automatically released once all the processes close() it, or die. In
particular, kill -9 will release it. (This is a variant of my earlier
idea to create a small number of anonymous shared memory file
descriptors in postmaster startup with shm_open(), and pass them down to
child processes with fork()). I think you could use that approach with
SysV shared memory as well, by destroying the segment with
sgmget(IPC_RMID) immediately after all processes have attached to it.

That leaves a window in which we still leak the segment,

A small window is better than a large one.

Another refinement is to wait for all the processes to attach before
setting the segment's size with ftruncate(). That way, when the window
is open for leaking the segment, it's still 0-sized so leaking it is not
a big deal.

and it is less
general: not every use of DSM is conducive to having all processes attach in a
short span of time.

Let's cross that bridge when we get there. AFAICS it fits all the use
cases discussed this far.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Noah Misch
noah@leadboat.com
In reply to: Heikki Linnakangas (#10)
Re: Dynamic Shared Memory stuff

On Tue, Dec 10, 2013 at 07:50:20PM +0200, Heikki Linnakangas wrote:

On 12/10/2013 07:27 PM, Noah Misch wrote:

On Thu, Dec 05, 2013 at 06:12:48PM +0200, Heikki Linnakangas wrote:

On Wed, Nov 20, 2013 at 8:32 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

* As discussed in the "Something fishy happening on frogmouth" thread, I
don't like the fact that the dynamic shared memory segments will be
permanently leaked if you kill -9 postmaster and destroy the data directory.

I really think we need to do something about it. To use your earlier
example of parallel sort, it's not acceptable to permanently leak a 512
GB segment on a system with 1 TB of RAM.

I don't. Erasing your data directory after an unclean shutdown voids any
expectations for a thorough, automatic release of system resources. Don't do
that. The next time some new use of a persistent resource violates your hope
for this scenario, there may be no remedy.

Well, the point of erasing the data directory is to release system
resources. I would normally expect "killall -9 <process>; rm -rf
<data dir>" to thorougly get rid of the running program and all the
resources. It's surprising enough that the regular shared memory
segment is left behind

Your expectation is misplaced. Processes and files are simply not the only
persistent system resources of interest.

but at least that one gets cleaned up when
you start a new server (on same port).

In the most-typical case, yes. In rare cases involving multiple postmasters
starting and stopping, the successor to the erased data directory will not
clean up the sysv segment.

Let's not add more cases like that, if we can avoid it.

Only if we can avoid it for a modicum of effort and feature compromise.
You're asking for PostgreSQL to reshape its use of persistent resources so you
can throw around "killall -9 postgres; rm -rf $PGDATA" without so much as a
memory leak. That use case, not PostgreSQL, has the defect here.

BTW, what if the data directory is seriously borked, and the server
won't start? Sure, don't do that, but it would be nice to have a way
to recover if you do anyway. (docs?)

If something is corrupting your data directory in an open-ended manner, you
have bigger problems than a memory leak until reboot. Recovering DSM happens
before we read the control file, so the damage would need to fall among a
short list of files for this to happen (bugs excluded). Nonetheless, I don't
object to documenting the varieties of system resources that PostgreSQL may
reserve and referencing the OS facilities for inspecting them.

Are you actually using PostgreSQL this way: frequent "killall -9 postgres; rm
-rf $PGDATA" after arbitrarily-bad $PGDATA corruption? Some automated fault
injection test rig, perhaps?

One idea is to create the shared memory object with shm_open, and wait
until all the worker processes that need it have attached to it. Then,
shm_unlink() it, before using it for anything. That way the segment will
be automatically released once all the processes close() it, or die. In
particular, kill -9 will release it. (This is a variant of my earlier
idea to create a small number of anonymous shared memory file
descriptors in postmaster startup with shm_open(), and pass them down to
child processes with fork()). I think you could use that approach with
SysV shared memory as well, by destroying the segment with
sgmget(IPC_RMID) immediately after all processes have attached to it.

That leaves a window in which we still leak the segment,

A small window is better than a large one.

Yes.

Another refinement is to wait for all the processes to attach before
setting the segment's size with ftruncate(). That way, when the
window is open for leaking the segment, it's still 0-sized so
leaking it is not a big deal.

and it is less
general: not every use of DSM is conducive to having all processes attach in a
short span of time.

Let's cross that bridge when we get there. AFAICS it fits all the
use cases discussed this far.

It does fit the use cases discussed thus far.

nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Andres Freund
andres@2ndquadrant.com
In reply to: Noah Misch (#11)
Re: Dynamic Shared Memory stuff

On 2013-12-10 18:12:53 -0500, Noah Misch wrote:

On Tue, Dec 10, 2013 at 07:50:20PM +0200, Heikki Linnakangas wrote:

On 12/10/2013 07:27 PM, Noah Misch wrote:

On Thu, Dec 05, 2013 at 06:12:48PM +0200, Heikki Linnakangas wrote:

Let's not add more cases like that, if we can avoid it.

Only if we can avoid it for a modicum of effort and feature compromise.
You're asking for PostgreSQL to reshape its use of persistent resources so you
can throw around "killall -9 postgres; rm -rf $PGDATA" without so much as a
memory leak. That use case, not PostgreSQL, has the defect here.

Empathically seconded.

Another refinement is to wait for all the processes to attach before setting
the segment's size with ftruncate(). That way, when the window is open for
leaking the segment, it's still 0-sized so leaking it is not a big deal.

and it is less
general: not every use of DSM is conducive to having all processes attach in a
short span of time.

Let's cross that bridge when we get there. AFAICS it fits all the
use cases discussed this far.

The primary use case I have for dsm, namely writing extensions that can
use shared memory without having to be listed in
shared_preload_libraries, certainly wouldn't work in any sensible way
with such a restriction.
And I don't think that's an insignificant usecase.

So I really fail to see what this would buy us.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#11)
Re: Dynamic Shared Memory stuff

Noah Misch <noah@leadboat.com> writes:

On Tue, Dec 10, 2013 at 07:50:20PM +0200, Heikki Linnakangas wrote:

Let's not add more cases like that, if we can avoid it.

Only if we can avoid it for a modicum of effort and feature compromise.
You're asking for PostgreSQL to reshape its use of persistent resources so you
can throw around "killall -9 postgres; rm -rf $PGDATA" without so much as a
memory leak. That use case, not PostgreSQL, has the defect here.

The larger point is that such a shutdown process has never in the history
of Postgres been successful at removing shared-memory (or semaphore)
resources. I do not feel a need to put a higher recoverability standard
onto the DSM code than what we've had for fifteen years with SysV shmem.

But, by the same token, it shouldn't be any *less* recoverable. In this
context that means that starting a new postmaster should recover the
resources, at least 90% of the time (100% if we still have the old
postmaster lockfile). I think the idea of keeping enough info in the
SysV segment to permit recovery of DSM resources is a good one. Then,
any case where the existing logic is able to free the old SysV segment
is guaranteed to also work for DSM.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#13)
Re: Dynamic Shared Memory stuff

On Tue, Dec 10, 2013 at 6:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Noah Misch <noah@leadboat.com> writes:

On Tue, Dec 10, 2013 at 07:50:20PM +0200, Heikki Linnakangas wrote:

Let's not add more cases like that, if we can avoid it.

Only if we can avoid it for a modicum of effort and feature compromise.
You're asking for PostgreSQL to reshape its use of persistent resources so you
can throw around "killall -9 postgres; rm -rf $PGDATA" without so much as a
memory leak. That use case, not PostgreSQL, has the defect here.

The larger point is that such a shutdown process has never in the history
of Postgres been successful at removing shared-memory (or semaphore)
resources. I do not feel a need to put a higher recoverability standard
onto the DSM code than what we've had for fifteen years with SysV shmem.

But, by the same token, it shouldn't be any *less* recoverable. In this
context that means that starting a new postmaster should recover the
resources, at least 90% of the time (100% if we still have the old
postmaster lockfile). I think the idea of keeping enough info in the
SysV segment to permit recovery of DSM resources is a good one. Then,
any case where the existing logic is able to free the old SysV segment
is guaranteed to also work for DSM.

So I'm taking a look at this. There doesn't appear to be anything
intrinsically intractable here, but it seems important to time the
order of operations so as to minimize the chances that things fail in
awkward places. The point where we remove the old shared memory
segment from the previous postmaster invocation is here:

/*
* The segment appears to be from a dead Postgres process, or from a
* previous cycle of life in this same process. Zap it, if possible.
* This probably shouldn't fail, but if it does, assume the segment
* belongs to someone else after all, and continue quietly.
*/
shmdt(memAddress);
if (shmctl(shmid, IPC_RMID, NULL) < 0)
continue;

My first thought was to remember the control segment ID from the
header just before the shmdt() there, and then later when the DSM
module is starting, do cleanup. But that doesn't seem like a good
idea, because then if startup fails after we remove the old shared
memory segment and before we get to the DSM initialization code, we'll
have lost the information on what control segment needs to be cleaned
up. A subsequent startup attempt won't see the old shm again, because
it's already gone. I'm fairly sure that this would be a net reduction
in reliability vs. the status quo.

So now what I'm thinking is that we ought to actually perform the DSM
cleanup before detaching the old segment and trying to remove it.
That shouldn't fail, but if it does, or if we get killed before
completing it, the next run will hopefully find the same old shm and
finish the cleanup. But that kind of flies in the face of the comment
above: if we perform DSM cleanup and then discover that the segment
wasn't ours after all, that means we just stomped all over somebody
else's stuff. That's not too good. But trying to remove the segment
first and then perform the cleanup creates a window where, if we get
killed, the next restart will have lost information about how to
finish cleaning up. So it seems that some kind of logic tweak is
needed here, but I'm not sure exactly what. As I see it, the options
are:

1. Make failure to remove the shared memory segment we thought was
ours an error. This will definitely show up any problems, but only
after we've burned down some other processes's dynamic shared memory
segments. The most likely outcome is creating failure-to-start
problems that don't exist today.

2. Make it a warning. We'll still burn down somebody else's DSMs, but
at least we'll still start ourselves. Sadly, "WARNING: You have just
done a bad thing. It's too late to fix it. Sorry!" is not very
appealing.

3. Remove the old shared memory segment first, then perform the
cleanup immediately afterwards. If we get killed before completing
the cleanup, we'll leak the un-cleaned-up stuff. Decide that's OK and
move on.

4. Adopt some sort of belt-and-suspenders approach, keeping the state
file we have now and backstopping it with this mechanism, so that we
really only need this to work when $PGDATA has been blown away and
recreated. This seems pretty inelegant, and I'm not sure who it'll
benefit other than those (few?) people who kill -9 the postmaster (or
it segfaults or otherwise dies without running the code to remove
shared memory segments) and then remove $PGDATA and then re-initdb and
then expect cleanup to find the old DSMs.

5. Give up on this approach. We could keep what we have now, or make
the DSM control segment land at a well-known address as we do for the
main segment.

What do people prefer?

The other thing I'm a bit squidgy about is injecting more code that
runs before we get the new shared memory segment up and running.
During that time, we don't have effective mutual exclusion, so it's
possible that someone else could be trying to do cleanup at the same
time we are. That should be OK as far as the DSM code itself is
concerned, but there may be other downsides to lengthening the period
of time that's required to get mutual exclusion in place that I should
be worrying about.

I thought about trying to handle this by destroying the old main
shared memory segment (memorizing the handle) and then creating the
new one and immediately inserting the old handle into it. Then, the
DSM cleanup can happen much later, and if we die in the meanwhile, the
next startup will find our new main shared memory segment but it'll
still be pointing to the old control segment, so cleanup should still
work. But there's still a window between the time when we destroy the
old main shared memory segment and the time when we create the new one
where we leak if we die unexpectedly. In particular, when creating
the new segment fails, we're stuck. I don't really see a way to
salvage this approach.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Noah Misch
noah@leadboat.com
In reply to: Robert Haas (#14)
Re: Dynamic Shared Memory stuff

On Wed, Dec 18, 2013 at 12:21:08PM -0500, Robert Haas wrote:

On Tue, Dec 10, 2013 at 6:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The larger point is that such a shutdown process has never in the history
of Postgres been successful at removing shared-memory (or semaphore)
resources. I do not feel a need to put a higher recoverability standard
onto the DSM code than what we've had for fifteen years with SysV shmem.

But, by the same token, it shouldn't be any *less* recoverable. In this
context that means that starting a new postmaster should recover the
resources, at least 90% of the time (100% if we still have the old
postmaster lockfile). I think the idea of keeping enough info in the
SysV segment to permit recovery of DSM resources is a good one. Then,
any case where the existing logic is able to free the old SysV segment
is guaranteed to also work for DSM.

So I'm taking a look at this. There doesn't appear to be anything
intrinsically intractable here, but it seems important to time the
order of operations so as to minimize the chances that things fail in
awkward places. The point where we remove the old shared memory
segment from the previous postmaster invocation is here:

/*
* The segment appears to be from a dead Postgres process, or from a
* previous cycle of life in this same process. Zap it, if possible.
* This probably shouldn't fail, but if it does, assume the segment
* belongs to someone else after all, and continue quietly.
*/
shmdt(memAddress);
if (shmctl(shmid, IPC_RMID, NULL) < 0)
continue;

My first thought was to remember the control segment ID from the
header just before the shmdt() there, and then later when the DSM
module is starting, do cleanup. But that doesn't seem like a good
idea, because then if startup fails after we remove the old shared
memory segment and before we get to the DSM initialization code, we'll
have lost the information on what control segment needs to be cleaned
up. A subsequent startup attempt won't see the old shm again, because
it's already gone. I'm fairly sure that this would be a net reduction
in reliability vs. the status quo.

So now what I'm thinking is that we ought to actually perform the DSM
cleanup before detaching the old segment and trying to remove it.
That shouldn't fail, but if it does, or if we get killed before
completing it, the next run will hopefully find the same old shm and
finish the cleanup. But that kind of flies in the face of the comment
above: if we perform DSM cleanup and then discover that the segment
wasn't ours after all, that means we just stomped all over somebody
else's stuff. That's not too good. But trying to remove the segment
first and then perform the cleanup creates a window where, if we get
killed, the next restart will have lost information about how to
finish cleaning up. So it seems that some kind of logic tweak is
needed here, but I'm not sure exactly what. As I see it, the options
are:

1. Make failure to remove the shared memory segment we thought was
ours an error. This will definitely show up any problems, but only
after we've burned down some other processes's dynamic shared memory
segments. The most likely outcome is creating failure-to-start
problems that don't exist today.

2. Make it a warning. We'll still burn down somebody else's DSMs, but
at least we'll still start ourselves. Sadly, "WARNING: You have just
done a bad thing. It's too late to fix it. Sorry!" is not very
appealing.

It has long been the responsibility of PGSharedMemoryCreate() to determine
that a segment is unimportant before calling IPC_RMID. The success or failure
of IPC_RMID is an unreliable guide to the correctness of that determination.
IPC_RMID will succeed on an important segment owned by the same UID, and it
will fail if some other process removed the segment after our shmat(). Such a
failure does not impugn our having requested DSM cleanup on the basis of the
PGShmemHeader we did read, so an apologetic WARNING would be wrong.

3. Remove the old shared memory segment first, then perform the
cleanup immediately afterwards. If we get killed before completing
the cleanup, we'll leak the un-cleaned-up stuff. Decide that's OK and
move on.

The period in which process exit would leak segments is narrow enough that
this seems fine. I see no net advantage over performing the cleanup before
shmdt(), though.

4. Adopt some sort of belt-and-suspenders approach, keeping the state
file we have now and backstopping it with this mechanism, so that we
really only need this to work when $PGDATA has been blown away and
recreated. This seems pretty inelegant, and I'm not sure who it'll
benefit other than those (few?) people who kill -9 the postmaster (or
it segfaults or otherwise dies without running the code to remove
shared memory segments) and then remove $PGDATA and then re-initdb and
then expect cleanup to find the old DSMs.

Independent of the choice we make here, I would keep the state file. POSIX
permits shm_open() segments to survive reboots, so the state file would come
into play after crashes on such systems. Also, folks who use "ipcrm" after a
"kill -9" of the postmaster would benefit from the state file. Nobody should
do that, but hey, catering to things nobody should do is what brought us here.

An option I had envisioned was to make PGSharedMemoryCreate() create a state
file based on the old sysv segment. Then the later dsm_postmaster_startup()
would proceed normally, and a restart after a failure between shmdt() and
dsm_postmaster_startup() would also do the cleanup. A wrinkle comes from the
fact that an existing state file could reference a different control segment;
this would happen if we picked a different sysv shm key compared to the last
postmaster start. In that case, we should presumably clean both control
segments. Performing the cleanup per #1/#2 achieves that.

5. Give up on this approach. We could keep what we have now, or make
the DSM control segment land at a well-known address as we do for the
main segment.

How would having the DSM control segment at a well-known address affect the
problem at hand? Did you mean a well-known dsm_handle?

What do people prefer?

I recommend performing cleanup on the control segment named in PGShmemHeader
just before shmdt() in PGSharedMemoryCreate(). No new ERROR or WARNING sites
are necessary. Have dsm_postmaster_startup() continue to perform a cleanup on
the control segment named in the state file.

The other thing I'm a bit squidgy about is injecting more code that
runs before we get the new shared memory segment up and running.
During that time, we don't have effective mutual exclusion, so it's
possible that someone else could be trying to do cleanup at the same
time we are. That should be OK as far as the DSM code itself is
concerned, but there may be other downsides to lengthening the period
of time that's required to get mutual exclusion in place that I should
be worrying about.

I see no general need to avoid small increases to that time period.

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Robert Haas
robertmhaas@gmail.com
In reply to: Noah Misch (#15)
Re: Dynamic Shared Memory stuff

On Tue, Jan 21, 2014 at 2:58 PM, Noah Misch <noah@leadboat.com> wrote:

What do people prefer?

I recommend performing cleanup on the control segment named in PGShmemHeader
just before shmdt() in PGSharedMemoryCreate(). No new ERROR or WARNING sites
are necessary. Have dsm_postmaster_startup() continue to perform a cleanup on
the control segment named in the state file.

I think I'm on board with the first two sentences of this, but after
Fujii Masao's email yesterday, I can't help thinking that what you
propose the third sentence is a bad idea. He cloned a master to
create a standby server on the same machine, and the standby startup
ate the master's dynamic shared memory. We could teach pg_basebackup
not to copy the state file, but that wouldn't help people who take
base backups using the file system copy method, which is a lot of
people.

5. Give up on this approach. We could keep what we have now, or make
the DSM control segment land at a well-known address as we do for the
main segment.

How would having the DSM control segment at a well-known address affect the
problem at hand? Did you mean a well-known dsm_handle?

Yeah. So the idea is that we'd always use a dsm_handle of 1000000 +
(100 * port) or something like that, and then search forward until we
find a dsm_handle that works. This is basically the same algorithm
we're using today for the main shared memory segment, but with a large
additive constant so that they don't collide with each other.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Noah Misch
noah@leadboat.com
In reply to: Robert Haas (#16)
Re: Dynamic Shared Memory stuff

On Wed, Jan 22, 2014 at 09:32:09AM -0500, Robert Haas wrote:

On Tue, Jan 21, 2014 at 2:58 PM, Noah Misch <noah@leadboat.com> wrote:

What do people prefer?

I recommend performing cleanup on the control segment named in PGShmemHeader
just before shmdt() in PGSharedMemoryCreate(). No new ERROR or WARNING sites
are necessary. Have dsm_postmaster_startup() continue to perform a cleanup on
the control segment named in the state file.

I think I'm on board with the first two sentences of this, but after
Fujii Masao's email yesterday, I can't help thinking that what you
propose the third sentence is a bad idea. He cloned a master to
create a standby server on the same machine, and the standby startup
ate the master's dynamic shared memory. We could teach pg_basebackup
not to copy the state file, but that wouldn't help people who take
base backups using the file system copy method, which is a lot of
people.

I agree we should not rely on folks learning to omit the state file from base
backups. Abandoning the state file is one way to resolve that, and the
reasons I outlined for preferring to keep it were not overriding concerns. We
could instead store a postmaster PID in dsm_control_header and only clean if
that PID is dead. We could make DSM startup aware of whether we're using a
backup label, but that would be awkward thanks to StartupXLOG() happening a
good bit later. Yeah, abandoning the state file is looking attractive.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Robert Haas
robertmhaas@gmail.com
In reply to: Noah Misch (#17)
1 attachment(s)
Re: Dynamic Shared Memory stuff

On Wed, Jan 22, 2014 at 10:17 AM, Noah Misch <noah@leadboat.com> wrote:

Yeah, abandoning the state file is looking attractive.

Here's a draft patch getting rid of the state file. This should
address concerns raised by Heikki and Fujii Masao and echoed by Tom
that dynamic shared memory behaves differently than the main shared
memory segment. The control segment ID is stored in the System V
shared memory block, so we're guaranteed that when using either System
V or POSIX shared memory we'll always latch onto the control segment
that matches up with the main shared memory segment we latched on to.
Cleanup for the file-mmap and Windows methods doesn't need to change,
because the former always just means clearing out $PGDATA/pg_dynshmem,
and the latter is done automatically by the operating system.

Comments?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

no-dsm-state-file-v1.patchtext/x-patch; charset=US-ASCII; name=no-dsm-state-file-v1.patchDownload
commit 6095f077a0d9a45e422087be1c7e8791c247a4e9
Author: Robert Haas <rhaas@postgresql.org>
Date:   Fri Apr 4 09:06:23 2014 -0400

    Get rid of the dynamic shared memory state file.
    
    Instead, store the control segment ID in the header for the main
    shared memory segment, and recover it from there when we restart after
    a crash.  This ensures that we never latch onto the main shared memory
    segment from one previous run and the dynamic shared memory control
    segment from a different one.  It also avoids problems if you take a
    base backup and start it up as hot standby on the same machine while
    the previous postmaster is still running.

diff --git a/src/backend/port/ipc_test.c b/src/backend/port/ipc_test.c
index 6259919..936c1a4 100644
--- a/src/backend/port/ipc_test.c
+++ b/src/backend/port/ipc_test.c
@@ -30,6 +30,7 @@
 #include <unistd.h>
 
 #include "miscadmin.h"
+#include "storage/dsm.h"
 #include "storage/ipc.h"
 #include "storage/pg_sema.h"
 #include "storage/pg_shmem.h"
@@ -214,12 +215,13 @@ int
 main(int argc, char **argv)
 {
 	MyStorage  *storage;
+	PGShmemHeader *shim;
 	int			cpid;
 
 	printf("Creating shared memory ... ");
 	fflush(stdout);
 
-	storage = (MyStorage *) PGSharedMemoryCreate(8192, false, 5433);
+	storage = (MyStorage *) PGSharedMemoryCreate(8192, false, 5433, &shim);
 
 	storage->flag = 1234;
 
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 51c1a2b..5e3850b 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -30,6 +30,7 @@
 
 #include "miscadmin.h"
 #include "portability/mem.h"
+#include "storage/dsm.h"
 #include "storage/ipc.h"
 #include "storage/pg_shmem.h"
 #include "utils/guc.h"
@@ -421,7 +422,8 @@ CreateAnonymousSegment(Size *size)
  * zero will be passed.
  */
 PGShmemHeader *
-PGSharedMemoryCreate(Size size, bool makePrivate, int port)
+PGSharedMemoryCreate(Size size, bool makePrivate, int port,
+					 PGShmemHeader **shim)
 {
 	IpcMemoryKey NextShmemSegID;
 	void	   *memAddress;
@@ -509,10 +511,13 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port)
 
 		/*
 		 * The segment appears to be from a dead Postgres process, or from a
-		 * previous cycle of life in this same process.  Zap it, if possible.
+		 * previous cycle of life in this same process.  Zap it, if possible,
+		 * and any associated dynamic shared memory segments, as well.
 		 * This probably shouldn't fail, but if it does, assume the segment
 		 * belongs to someone else after all, and continue quietly.
 		 */
+		if (hdr->dsm_control != 0)
+			dsm_cleanup_using_control_segment(hdr->dsm_control);
 		shmdt(memAddress);
 		if (shmctl(shmid, IPC_RMID, NULL) < 0)
 			continue;
@@ -539,6 +544,7 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port)
 	hdr = (PGShmemHeader *) memAddress;
 	hdr->creatorPID = getpid();
 	hdr->magic = PGShmemMagic;
+	hdr->dsm_control = 0;
 
 	/* Fill in the data directory ID info, too */
 	if (stat(DataDir, &statbuf) < 0)
@@ -554,6 +560,7 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port)
 	 */
 	hdr->totalsize = size;
 	hdr->freeoffset = MAXALIGN(sizeof(PGShmemHeader));
+	*shim = hdr;
 
 	/* Save info for possible future use */
 	UsedShmemSegAddr = memAddress;
@@ -608,6 +615,7 @@ PGSharedMemoryReAttach(void)
 	if (hdr != origUsedShmemSegAddr)
 		elog(FATAL, "reattaching to shared memory returned unexpected address (got %p, expected %p)",
 			 hdr, origUsedShmemSegAddr);
+	dsm_set_control_handle(((PGShmemHeader *) hdr)->dsm_control);
 
 	UsedShmemSegAddr = hdr;		/* probably redundant */
 }
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index dca371c..3a0ded4 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -117,7 +117,8 @@ PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2)
  *
  */
 PGShmemHeader *
-PGSharedMemoryCreate(Size size, bool makePrivate, int port)
+PGSharedMemoryCreate(Size size, bool makePrivate, int port,
+					 PGShmemHeader **shim)
 {
 	void	   *memAddress;
 	PGShmemHeader *hdr;
@@ -245,12 +246,14 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port)
 	 */
 	hdr->totalsize = size;
 	hdr->freeoffset = MAXALIGN(sizeof(PGShmemHeader));
+	hdr->dsm_control = 0;
 
 	/* Save info for possible future use */
 	UsedShmemSegAddr = memAddress;
 	UsedShmemSegSize = size;
 	UsedShmemSegID = hmap2;
 
+	*shim = NULL;
 	return hdr;
 }
 
@@ -289,6 +292,7 @@ PGSharedMemoryReAttach(void)
 			 hdr, origUsedShmemSegAddr);
 	if (hdr->magic != PGShmemMagic)
 		elog(FATAL, "reattaching to shared memory returned non-PostgreSQL memory");
+	dsm_set_control_handle(hdr->dsm_control);
 
 	UsedShmemSegAddr = hdr;		/* probably redundant */
 }
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index c967177..6c410f7 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -39,13 +39,11 @@
 #include "storage/dsm.h"
 #include "storage/ipc.h"
 #include "storage/lwlock.h"
+#include "storage/pg_shmem.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/resowner_private.h"
 
-#define PG_DYNSHMEM_STATE_FILE			PG_DYNSHMEM_DIR "/state"
-#define PG_DYNSHMEM_NEW_STATE_FILE		PG_DYNSHMEM_DIR "/state.new"
-#define PG_DYNSHMEM_STATE_BUFSIZ		512
 #define PG_DYNSHMEM_CONTROL_MAGIC		0x9a503d32
 
 /*
@@ -95,10 +93,7 @@ typedef struct dsm_control_header
 	dsm_control_item	item[FLEXIBLE_ARRAY_MEMBER];
 } dsm_control_header;
 
-static void dsm_cleanup_using_control_segment(void);
 static void dsm_cleanup_for_mmap(void);
-static bool dsm_read_state_file(dsm_handle *h);
-static void dsm_write_state_file(dsm_handle h);
 static void dsm_postmaster_shutdown(int code, Datum arg);
 static dsm_segment *dsm_create_descriptor(void);
 static bool dsm_control_segment_sane(dsm_control_header *control,
@@ -146,7 +141,7 @@ static void	*dsm_control_impl_private = NULL;
  * startup time.
  */
 void
-dsm_postmaster_startup(void)
+dsm_postmaster_startup(PGShmemHeader *shim)
 {
 	void	   *dsm_control_address = NULL;
 	uint32		maxitems;
@@ -159,26 +154,13 @@ dsm_postmaster_startup(void)
 		return;
 
 	/*
-	 * Check for, and remove, shared memory segments left behind by a dead
-	 * postmaster.  This isn't necessary on Windows, which always removes them
-	 * when the last reference is gone.
+	 * If we're using the mmap implementations, clean up any leftovers.
+	 * Cleanup isn't needed on Windows, and happens earlier in startup for
+	 * POSIX and System V shared memory, via a direct call to
+	 * dsm_cleanup_using_control_segment.
 	 */
-	switch (dynamic_shared_memory_type)
-	{
-		case DSM_IMPL_POSIX:
-		case DSM_IMPL_SYSV:
-			dsm_cleanup_using_control_segment();
-			break;
-		case DSM_IMPL_MMAP:
-			dsm_cleanup_for_mmap();
-			break;
-		case DSM_IMPL_WINDOWS:
-			/* Nothing to do. */
-			break;
-		default:
-			elog(ERROR, "unknown dynamic shared memory type: %d",
-				 dynamic_shared_memory_type);
-	}
+	if (dynamic_shared_memory_type == DSM_IMPL_MMAP)
+		dsm_cleanup_for_mmap();
 
 	/* Determine size for new control segment. */
 	maxitems = PG_DYNSHMEM_FIXED_SLOTS
@@ -187,23 +169,30 @@ dsm_postmaster_startup(void)
 		maxitems);
 	segsize = dsm_control_bytes_needed(maxitems);
 
-	/* Loop until we find an unused identifier for the new control segment. */
+	/*
+	 * Loop until we find an unused identifier for the new control segment.
+	 * We sometimes use 0 as a sentinel value indicating that no control
+	 * segment is known to exist, so avoid using that value for a real
+	 * control segment.
+	 */
 	for (;;)
 	{
 		Assert(dsm_control_address == NULL);
 		Assert(dsm_control_mapped_size == 0);
 		dsm_control_handle = random();
+		if (dsm_control_handle == 0)
+			continue;
 		if (dsm_impl_op(DSM_OP_CREATE, dsm_control_handle, segsize,
 						&dsm_control_impl_private, &dsm_control_address,
 						&dsm_control_mapped_size, ERROR))
 			break;
 	}
 	dsm_control = dsm_control_address;
-	on_shmem_exit(dsm_postmaster_shutdown, 0);
+	on_shmem_exit(dsm_postmaster_shutdown, PointerGetDatum(shim));
 	elog(DEBUG2,
 		 "created dynamic shared memory control segment %u (%zu bytes)",
 		 dsm_control_handle, segsize);
-	dsm_write_state_file(dsm_control_handle);
+	shim->dsm_control = dsm_control_handle;
 
 	/* Initialize control segment. */
 	dsm_control->magic = PG_DYNSHMEM_CONTROL_MAGIC;
@@ -216,8 +205,8 @@ dsm_postmaster_startup(void)
  * invocation still exists.  If so, remove the dynamic shared memory
  * segments to which it refers, and then the control segment itself.
  */
-static void
-dsm_cleanup_using_control_segment(void)
+void
+dsm_cleanup_using_control_segment(dsm_handle old_control_handle)
 {
 	void	   *mapped_address = NULL;
 	void	   *junk_mapped_address = NULL;
@@ -227,14 +216,10 @@ dsm_cleanup_using_control_segment(void)
 	Size		junk_mapped_size = 0;
 	uint32		nitems;
 	uint32		i;
-	dsm_handle	old_control_handle;
 	dsm_control_header *old_control;
 
-	/*
-	 * Read the state file.  If it doesn't exist or is empty, there's nothing
-	 * more to do.
-	 */
-	if (!dsm_read_state_file(&old_control_handle))
+	/* If dynamic shared memory is disabled, there's nothing to do. */
+	if (dynamic_shared_memory_type == DSM_IMPL_NONE)
 		return;
 
 	/*
@@ -347,111 +332,6 @@ dsm_cleanup_for_mmap(void)
 }
 
 /*
- * Read and parse the state file.
- *
- * If the state file is empty or the contents are garbled, it probably means
- * that the operating system rebooted before the data written by the previous
- * postmaster made it to disk.  In that case, we can just ignore it; any shared
- * memory from before the reboot should be gone anyway.
- */
-static bool
-dsm_read_state_file(dsm_handle *h)
-{
-	int			statefd;
-	char		statebuf[PG_DYNSHMEM_STATE_BUFSIZ];
-	int			nbytes = 0;
-	char	   *endptr,
-			   *s;
-	dsm_handle	handle;
-
-	/* Read the state file to get the ID of the old control segment. */
-	statefd = BasicOpenFile(PG_DYNSHMEM_STATE_FILE, O_RDONLY | PG_BINARY, 0);
-	if (statefd < 0)
-	{
-		if (errno == ENOENT)
-			return false;
-		ereport(ERROR,
-				(errcode_for_file_access(),
-				 errmsg("could not open file \"%s\": %m",
-					PG_DYNSHMEM_STATE_FILE)));
-	}
-	nbytes = read(statefd, statebuf, PG_DYNSHMEM_STATE_BUFSIZ - 1);
-	if (nbytes < 0)
-	{
-		close(statefd);
-		ereport(ERROR,
-				(errcode_for_file_access(),
-				 errmsg("could not read file \"%s\": %m",
-					PG_DYNSHMEM_STATE_FILE)));
-	}
-	/* make sure buffer is NUL terminated */
-	statebuf[nbytes] = '\0';
-	close(statefd);
-
-	/*
-	 * We expect to find the handle of the old control segment here,
-	 * on a line by itself.
-	 */
-	handle = strtoul(statebuf, &endptr, 10);
-	for (s = endptr; *s == ' ' || *s == '\t'; ++s)
-		;
-	if (*s != '\n' && *s != '\0')
-		return false;
-
-	/* Looks good. */
-	*h = handle;
-	return true;
-}
-
-/*
- * Write our control segment handle to the state file, so that if the
- * postmaster is killed without running it's on_shmem_exit hooks, the
- * next postmaster can clean things up after restart.
- */
-static void
-dsm_write_state_file(dsm_handle h)
-{
-	int			statefd;
-	char		statebuf[PG_DYNSHMEM_STATE_BUFSIZ];
-	int			nbytes;
-
-	/* Create or truncate the file. */
-	statefd = open(PG_DYNSHMEM_NEW_STATE_FILE,
-				   O_RDWR | O_CREAT | O_TRUNC | PG_BINARY, 0600);
-	if (statefd < 0)
-		ereport(ERROR,
-				(errcode_for_file_access(),
-				 errmsg("could not create file \"%s\": %m",
-					PG_DYNSHMEM_NEW_STATE_FILE)));
-
-	/* Write contents. */
-	snprintf(statebuf, PG_DYNSHMEM_STATE_BUFSIZ, "%u\n", dsm_control_handle);
-	nbytes = strlen(statebuf);
-	if (write(statefd, statebuf, nbytes) != nbytes)
-	{
-		if (errno == 0)
-			errno = ENOSPC;		/* if no error signalled, assume no space */
-		ereport(ERROR,
-				(errcode_for_file_access(),
-				 errmsg("could not write file \"%s\": %m",
-					PG_DYNSHMEM_NEW_STATE_FILE)));
-	}
-
-	/* Close file. */
-	close(statefd);
-
-	/*
-	 * Atomically rename file into place, so that no one ever sees a partially
-	 * written state file.
-	 */
-	if (rename(PG_DYNSHMEM_NEW_STATE_FILE, PG_DYNSHMEM_STATE_FILE) < 0)
-		ereport(ERROR,
-				(errcode_for_file_access(),
-				 errmsg("could not rename file \"%s\": %m",
-					PG_DYNSHMEM_NEW_STATE_FILE)));
-}
-
-/*
  * At shutdown time, we iterate over the control segment and remove all
  * remaining dynamic shared memory segments.  We avoid throwing errors here;
  * the postmaster is shutting down either way, and this is just non-critical
@@ -466,6 +346,7 @@ dsm_postmaster_shutdown(int code, Datum arg)
 	void	   *junk_mapped_address = NULL;
 	void	   *junk_impl_private = NULL;
 	Size		junk_mapped_size = 0;
+	PGShmemHeader *shim = (PGShmemHeader *) DatumGetPointer(arg);
 
 	/*
 	 * If some other backend exited uncleanly, it might have corrupted the
@@ -510,13 +391,7 @@ dsm_postmaster_shutdown(int code, Datum arg)
 				&dsm_control_impl_private, &dsm_control_address,
 				&dsm_control_mapped_size, LOG);
 	dsm_control = dsm_control_address;
-
-	/* And, finally, remove the state file. */
-	if (unlink(PG_DYNSHMEM_STATE_FILE) < 0)
-		ereport(LOG,
-				(errcode_for_file_access(),
-				 errmsg("could not unlink file \"%s\": %m",
-					PG_DYNSHMEM_STATE_FILE)));
+	shim->dsm_control = 0;
 }
 
 /*
@@ -536,25 +411,18 @@ dsm_backend_startup(void)
 
 #ifdef EXEC_BACKEND
 	{
-		dsm_handle	control_handle;
 		void	   *control_address = NULL;
 
-		/* Read the control segment information from the state file. */
-		if (!dsm_read_state_file(&control_handle))
-			ereport(ERROR,
-					(errcode(ERRCODE_INTERNAL_ERROR),
-					 errmsg("could not parse dynamic shared memory state file")));
-
 		/* Attach control segment. */
-		dsm_impl_op(DSM_OP_ATTACH, control_handle, 0,
+		Assert(dsm_control_handle != 0);
+		dsm_impl_op(DSM_OP_ATTACH, dsm_control_handle, 0,
 					&dsm_control_impl_private, &control_address,
 					&dsm_control_mapped_size, ERROR);
-		dsm_control_handle = control_handle;
 		dsm_control = control_address;
 		/* If control segment doesn't look sane, something is badly wrong. */
 		if (!dsm_control_segment_sane(dsm_control, dsm_control_mapped_size))
 		{
-			dsm_impl_op(DSM_OP_DETACH, control_handle, 0,
+			dsm_impl_op(DSM_OP_DETACH, dsm_control_handle, 0,
 						&dsm_control_impl_private, &control_address,
 						&dsm_control_mapped_size, WARNING);
 			ereport(FATAL,
@@ -567,6 +435,20 @@ dsm_backend_startup(void)
 	dsm_init_done = true;
 }
 
+#ifdef EXEC_BACKEND
+/*
+ * When running under EXEC_BACKEND, we get a callback here when the main
+ * shared memory segment is re-attached, so that we can record the control
+ * handle retrieved from it.
+ */
+void
+dsm_set_control_handle(dsm_handle h)
+{
+	Assert(dsm_control_handle == 0 && h != 0);
+	dsm_control_handle = h;
+}
+#endif
+
 /*
  * Create a new dynamic shared memory segment.
  */
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index c392d4f..4290d2d 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -90,6 +90,8 @@ RequestAddinShmemSpace(Size size)
 void
 CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
 {
+	PGShmemHeader *shim = NULL;
+
 	if (!IsUnderPostmaster)
 	{
 		PGShmemHeader *seghdr;
@@ -149,7 +151,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
 		/*
 		 * Create the shmem segment
 		 */
-		seghdr = PGSharedMemoryCreate(size, makePrivate, port);
+		seghdr = PGSharedMemoryCreate(size, makePrivate, port, &shim);
 
 		InitShmemAccess(seghdr);
 
@@ -254,7 +256,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
 
 	/* Initialize dynamic shared memory facilities. */
 	if (!IsUnderPostmaster)
-		dsm_postmaster_startup();
+		dsm_postmaster_startup(shim);
 
 	/*
 	 * Now give loadable modules a chance to set up their shmem allocations
diff --git a/src/include/storage/dsm.h b/src/include/storage/dsm.h
index e4669a1..272787a 100644
--- a/src/include/storage/dsm.h
+++ b/src/include/storage/dsm.h
@@ -18,10 +18,16 @@
 typedef struct dsm_segment dsm_segment;
 
 /* Startup and shutdown functions. */
-extern void dsm_postmaster_startup(void);
+struct PGShmemHeader;		/* avoid including pg_shmem.h */
+extern void dsm_cleanup_using_control_segment(dsm_handle old_control_handle);
+extern void dsm_postmaster_startup(struct PGShmemHeader *);
 extern void dsm_backend_shutdown(void);
 extern void dsm_detach_all(void);
 
+#ifdef EXEC_BACKEND
+extern void dsm_set_control_handle(dsm_handle h);
+#endif
+
 /* Functions that create, update, or remove mappings. */
 extern dsm_segment *dsm_create(Size size);
 extern dsm_segment *dsm_attach(dsm_handle h);
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index 0dc960b..ab28ebe 100644
--- a/src/include/storage/pg_shmem.h
+++ b/src/include/storage/pg_shmem.h
@@ -24,6 +24,8 @@
 #ifndef PG_SHMEM_H
 #define PG_SHMEM_H
 
+#include "storage/dsm_impl.h"
+
 typedef struct PGShmemHeader	/* standard header for all Postgres shmem */
 {
 	int32		magic;			/* magic # to identify Postgres segments */
@@ -31,6 +33,7 @@ typedef struct PGShmemHeader	/* standard header for all Postgres shmem */
 	pid_t		creatorPID;		/* PID of creating process */
 	Size		totalsize;		/* total size of segment */
 	Size		freeoffset;		/* offset to first free space */
+	dsm_handle	dsm_control;	/* ID of dynamic shared memory control seg */
 	void	   *index;			/* pointer to ShmemIndex table */
 #ifndef WIN32					/* Windows doesn't have useful inode#s */
 	dev_t		device;			/* device data directory is on */
@@ -61,7 +64,7 @@ extern void PGSharedMemoryReAttach(void);
 #endif
 
 extern PGShmemHeader *PGSharedMemoryCreate(Size size, bool makePrivate,
-					 int port);
+					 int port, PGShmemHeader **shim);
 extern bool PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2);
 extern void PGSharedMemoryDetach(void);
 
#19Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#18)
Re: Dynamic Shared Memory stuff

On Fri, Apr 4, 2014 at 10:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Jan 22, 2014 at 10:17 AM, Noah Misch <noah@leadboat.com> wrote:

Yeah, abandoning the state file is looking attractive.

Here's a draft patch getting rid of the state file. This should
address concerns raised by Heikki and Fujii Masao and echoed by Tom
that dynamic shared memory behaves differently than the main shared
memory segment. The control segment ID is stored in the System V
shared memory block, so we're guaranteed that when using either System
V or POSIX shared memory we'll always latch onto the control segment
that matches up with the main shared memory segment we latched on to.
Cleanup for the file-mmap and Windows methods doesn't need to change,
because the former always just means clearing out $PGDATA/pg_dynshmem,
and the latter is done automatically by the operating system.

Comments?

Apparently not. However, I'm fairly sure this is a step toward
addressing the complaints previously raised, even if there may be some
details people still want changed, so I've gone ahead and committed
it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#19)
1 attachment(s)
Re: Dynamic Shared Memory stuff

On Tue, Apr 8, 2014 at 9:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Apparently not. However, I'm fairly sure this is a step toward
addressing the complaints previously raised, even if there may be some
details people still want changed, so I've gone ahead and committed
it.

Few Observations:

1. One new warning has been introduced in code.
1>src\backend\port\win32_shmem.c(295): warning C4013:
'dsm_set_control_handle' undefined; assuming extern returning int
Attached patch fixes this warning.

2. On running test_shm_mq manually, the client side didn't get
any problem (the results are as expected). However I am seeing
below log on server:
LOG: registering background worker "test_shm_mq"
LOG: starting background worker process "test_shm_mq"
LOG: unrecognized win32 error code: 127
LOG: worker process: test_shm_mq (PID 2528) exited with exit code 1
LOG: unregistering background worker "test_shm_mq"

I think below message in log is not expected:
"LOG: unrecognized win32 error code: 127"

This has been started appearing after your commit related to DSM.
I have debugged this issue and found that it comes from below part
of code:
dsm_impl_windows
{
..
else
{
hmap = OpenFileMapping(FILE_MAP_WRITE | FILE_MAP_READ,
FALSE, /* do not inherit the name */
name); /* name of mapping object */
_dosmaperr(GetLastError());
}
}

Here even though handle returned by OpenFileMapping() is a valid handle,
but still GetLastError() return 127 (The specified procedure could not be
found.). Now the specs[1]http://msdn.microsoft.com/en-us/library/windows/desktop/aa366791(v=vs.85).aspx of API says that if handle is non-NULL, then
consider it success, so I am not sure whether we should bother about this
error or not. I have tried many ways (trying different parameters, search on
net) to change this and related API's, but the same problem exists. One
strange thing is if I just call the API twice (I know this is not
right, but just
to experiment for finding some info), second time this error doesn't
occur.
The only difference in latest changes is that now the OpenFileMapping()
is getting called by Child process rather than peer process (w.r.t process
that has called CreateFileMapping), don't know why this should make
such a difference.

On net whatever examples I have seen for such usage, they call
GetLastError() only if handle is invalid, we have called in above fashion
just to keep code little bit simple.

I am just not sure whether it is okay to rearrange the code and call
GetLastError() only if returned handle is Invalid (NULL) or try to look
for more info.

One question:
1. I have seen that initdb still creates pg_dynshmem, is it required
after your latest changes?

Let me know your opinion?

[1]: http://msdn.microsoft.com/en-us/library/windows/desktop/aa366791(v=vs.85).aspx
http://msdn.microsoft.com/en-us/library/windows/desktop/aa366791(v=vs.85).aspx

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

fix_dsm_win_warning.patchapplication/octet-stream; name=fix_dsm_win_warning.patchDownload
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index a537bb3..29ea6d8 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -15,6 +15,7 @@
 #include "miscadmin.h"
 #include "storage/ipc.h"
 #include "storage/pg_shmem.h"
+#include "storage/dsm.h"
 
 HANDLE		UsedShmemSegID = 0;
 void	   *UsedShmemSegAddr = NULL;
#21Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#20)
Re: Dynamic Shared Memory stuff

On Wed, Apr 9, 2014 at 7:41 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

Few Observations:

1. One new warning has been introduced in code.
1>src\backend\port\win32_shmem.c(295): warning C4013:
'dsm_set_control_handle' undefined; assuming extern returning int
Attached patch fixes this warning.

OK, committed, after moving the header to the correct position in
alphabetical order.

2. On running test_shm_mq manually, the client side didn't get
any problem (the results are as expected). However I am seeing
below log on server:
LOG: registering background worker "test_shm_mq"
LOG: starting background worker process "test_shm_mq"
LOG: unrecognized win32 error code: 127
LOG: worker process: test_shm_mq (PID 2528) exited with exit code 1
LOG: unregistering background worker "test_shm_mq"

I think below message in log is not expected:
"LOG: unrecognized win32 error code: 127"

This has been started appearing after your commit related to DSM.
I have debugged this issue and found that it comes from below part
of code:
dsm_impl_windows
{
..
else
{
hmap = OpenFileMapping(FILE_MAP_WRITE | FILE_MAP_READ,
FALSE, /* do not inherit the name */
name); /* name of mapping object */
_dosmaperr(GetLastError());
}
}

Here even though handle returned by OpenFileMapping() is a valid handle,
but still GetLastError() return 127 (The specified procedure could not be
found.). Now the specs[1] of API says that if handle is non-NULL, then
consider it success, so I am not sure whether we should bother about this
error or not. I have tried many ways (trying different parameters, search on
net) to change this and related API's, but the same problem exists. One
strange thing is if I just call the API twice (I know this is not
right, but just
to experiment for finding some info), second time this error doesn't
occur.
The only difference in latest changes is that now the OpenFileMapping()
is getting called by Child process rather than peer process (w.r.t process
that has called CreateFileMapping), don't know why this should make
such a difference.

On net whatever examples I have seen for such usage, they call
GetLastError() only if handle is invalid, we have called in above fashion
just to keep code little bit simple.

I am just not sure whether it is okay to rearrange the code and call
GetLastError() only if returned handle is Invalid (NULL) or try to look
for more info.

Well, I don't know either. You wrote the Windows portion of this
code, so you'll have to decide what's best. If the best practice in
this area is to only call GetLastError() if the handle isn't valid,
then that's probably what we should do, too. But I can't speak to
what the best practice is.

One question:
1. I have seen that initdb still creates pg_dynshmem, is it required
after your latest changes?

It's only used now if dynamic_shared_memory_type = mmap. I know
Andres was never a huge fan of the mmap implementation, so we could
rip that out and get rid of the directory, too, but I kind of liked
having it, and broke the tie in favor of myself.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#22Andres Freund
andres@2ndquadrant.com
In reply to: Robert Haas (#21)
Re: Dynamic Shared Memory stuff

On 2014-04-09 11:50:33 -0400, Robert Haas wrote:

One question:
1. I have seen that initdb still creates pg_dynshmem, is it required
after your latest changes?

It's only used now if dynamic_shared_memory_type = mmap. I know
Andres was never a huge fan of the mmap implementation, so we could
rip that out and get rid of the directory, too, but I kind of liked
having it, and broke the tie in favor of myself.

It's purely a toy thing. I think pretty much every dynshm user that
actually transfers data through it will be better off falling back to
single process style work, rather than parellizing it.

Anyway, it's there, it doesn't presently cause problems, and I don't
have to maintain it, so ...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#23Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#21)
1 attachment(s)
Re: Dynamic Shared Memory stuff

On Wed, Apr 9, 2014 at 9:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Apr 9, 2014 at 7:41 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

I am just not sure whether it is okay to rearrange the code and call
GetLastError() only if returned handle is Invalid (NULL) or try to look
for more info.

Well, I don't know either. You wrote the Windows portion of this
code, so you'll have to decide what's best. If the best practice in
this area is to only call GetLastError() if the handle isn't valid,
then that's probably what we should do, too. But I can't speak to
what the best practice is.

I have checked that other place in code also check handle to
decide if API has failed. Refer function PGSharedMemoryIsInUse().
So I think fix to call GetLastError() after checking handle is okay.
Attached patch fixes this issue. After patch, the server shows below
log which is exactly what is expected from test_shm_mq

LOG: registering background worker "test_shm_mq"
LOG: starting background worker process "test_shm_mq"
LOG: worker process: test_shm_mq (PID 4888) exited with exit code 1
LOG: unregistering background worker "test_shm_mq"
LOG: registering background worker "test_shm_mq"
LOG: starting background worker process "test_shm_mq"
LOG: worker process: test_shm_mq (PID 3128) exited with exit code 1
LOG: unregistering background worker "test_shm_mq"

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

fix_dsm_invalid_errcode_issue.patchapplication/octet-stream; name=fix_dsm_invalid_errcode_issue.patchDownload
diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c
index b1519c9..5424787 100644
--- a/src/backend/storage/ipc/dsm_impl.c
+++ b/src/backend/storage/ipc/dsm_impl.c
@@ -705,22 +705,29 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
 			CloseHandle(hmap);
 			return false;
 		}
+		if (!hmap)
+		{
+			ereport(elevel,
+					(errcode_for_dynamic_shared_memory(),
+					 errmsg("could not create shared memory segment \"%s\": %m",
+							name)));
+			return false;
+		}
 	}
 	else
 	{
 		hmap = OpenFileMapping(FILE_MAP_WRITE | FILE_MAP_READ,
 							   FALSE,		/* do not inherit the name */
 							   name);		/* name of mapping object */
-		_dosmaperr(GetLastError());
-	}
-
-	if (!hmap)
-	{
-		ereport(elevel,
-				(errcode_for_dynamic_shared_memory(),
-				 errmsg("could not open shared memory segment \"%s\": %m",
-					name)));
-		return false;
+		if (!hmap)
+		{
+			_dosmaperr(GetLastError());
+			ereport(elevel,
+					(errcode_for_dynamic_shared_memory(),
+					 errmsg("could not open shared memory segment \"%s\": %m",
+							name)));
+			return false;
+		}
 	}
 
 	/* Map it. */
#24Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#23)
Re: Dynamic Shared Memory stuff

On Sat, Apr 12, 2014 at 1:32 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Apr 9, 2014 at 9:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Apr 9, 2014 at 7:41 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

I am just not sure whether it is okay to rearrange the code and call
GetLastError() only if returned handle is Invalid (NULL) or try to look
for more info.

Well, I don't know either. You wrote the Windows portion of this
code, so you'll have to decide what's best. If the best practice in
this area is to only call GetLastError() if the handle isn't valid,
then that's probably what we should do, too. But I can't speak to
what the best practice is.

I have checked that other place in code also check handle to
decide if API has failed. Refer function PGSharedMemoryIsInUse().
So I think fix to call GetLastError() after checking handle is okay.
Attached patch fixes this issue. After patch, the server shows below
log which is exactly what is expected from test_shm_mq

In PostgreSQL code, hmap == NULL, rather than !hmap, is the preferred
way to test for a NULL pointer. I notice that the !hmap style is used
throughout this code, so I guess cleaning that up is a matter for a
separate commit.

For the create case, I'm wondering if we should put the block that
tests for !hmap *before* the _dosmaperr() and check for EEXIST. What
is your opinion?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#25Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#24)
Re: Dynamic Shared Memory stuff

On Mon, Apr 14, 2014 at 10:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Apr 12, 2014 at 1:32 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

I have checked that other place in code also check handle to
decide if API has failed. Refer function PGSharedMemoryIsInUse().
So I think fix to call GetLastError() after checking handle is okay.
Attached patch fixes this issue. After patch, the server shows below
log which is exactly what is expected from test_shm_mq

In PostgreSQL code, hmap == NULL, rather than !hmap, is the preferred
way to test for a NULL pointer. I notice that the !hmap style is used
throughout this code, so I guess cleaning that up is a matter for a
separate commit.

I think in that case we might want to cleanup some other similar usage
(PGSharedMemoryCreate) of !hmap.

For the create case, I'm wondering if we should put the block that
tests for !hmap *before* the _dosmaperr() and check for EEXIST. What
is your opinion?

Either way is okay, but I think the way you are suggesting is better as it
will make code consistent with other place (PGSharedMemoryCreate()).

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#26Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#25)
Re: Dynamic Shared Memory stuff

On Tue, Apr 15, 2014 at 12:33 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Apr 14, 2014 at 10:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Apr 12, 2014 at 1:32 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

I have checked that other place in code also check handle to
decide if API has failed. Refer function PGSharedMemoryIsInUse().
So I think fix to call GetLastError() after checking handle is okay.
Attached patch fixes this issue. After patch, the server shows below
log which is exactly what is expected from test_shm_mq

In PostgreSQL code, hmap == NULL, rather than !hmap, is the preferred
way to test for a NULL pointer. I notice that the !hmap style is used
throughout this code, so I guess cleaning that up is a matter for a
separate commit.

I think in that case we might want to cleanup some other similar usage
(PGSharedMemoryCreate) of !hmap.

Ah. Well, in that case maybe we should just leave it alone, since
it's been like that forever and nobody's cared until now.

For the create case, I'm wondering if we should put the block that
tests for !hmap *before* the _dosmaperr() and check for EEXIST. What
is your opinion?

Either way is okay, but I think the way you are suggesting is better as it
will make code consistent with other place (PGSharedMemoryCreate()).

OK, can you prepare a patch?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#27Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#26)
1 attachment(s)
Re: Dynamic Shared Memory stuff

On Wed, Apr 16, 2014 at 3:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 15, 2014 at 12:33 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Apr 14, 2014 at 10:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:

For the create case, I'm wondering if we should put the block that
tests for !hmap *before* the _dosmaperr() and check for EEXIST. What
is your opinion?

Either way is okay, but I think the way you are suggesting is better as it
will make code consistent with other place (PGSharedMemoryCreate()).

OK, can you prepare a patch?

Please find attached patch to address this issue.
One minor point to note is that now we have to call GetLastError() twice,
once inside error path and once to check EEXIST, but I think that is okay
as existing code in PGSharedMemoryCreate() does it that way.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

fix_dsm_invalid_errcode_issue-v2.patchapplication/octet-stream; name=fix_dsm_invalid_errcode_issue-v2.patchDownload
diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c
index b1519c9..fa253f0 100644
--- a/src/backend/storage/ipc/dsm_impl.c
+++ b/src/backend/storage/ipc/dsm_impl.c
@@ -693,6 +693,15 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
 								 size_high, 	/* Upper 32 bits of size */
 								 size_low,		/* Lower 32 bits of size */
 								 name);
+		if (!hmap)
+		{
+			_dosmaperr(GetLastError());
+			ereport(elevel,
+					(errcode_for_dynamic_shared_memory(),
+					 errmsg("could not create shared memory segment \"%s\": %m",
+							name)));
+			return false;
+		}
 		_dosmaperr(GetLastError());
 		if (errno == EEXIST)
 		{
@@ -711,16 +720,15 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
 		hmap = OpenFileMapping(FILE_MAP_WRITE | FILE_MAP_READ,
 							   FALSE,		/* do not inherit the name */
 							   name);		/* name of mapping object */
-		_dosmaperr(GetLastError());
-	}
-
-	if (!hmap)
-	{
-		ereport(elevel,
-				(errcode_for_dynamic_shared_memory(),
-				 errmsg("could not open shared memory segment \"%s\": %m",
-					name)));
-		return false;
+		if (!hmap)
+		{
+			_dosmaperr(GetLastError());
+			ereport(elevel,
+					(errcode_for_dynamic_shared_memory(),
+					 errmsg("could not open shared memory segment \"%s\": %m",
+							name)));
+			return false;
+		}
 	}
 
 	/* Map it. */
#28Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#27)
Re: Dynamic Shared Memory stuff

On Tue, Apr 15, 2014 at 10:46 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Apr 16, 2014 at 3:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 15, 2014 at 12:33 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Apr 14, 2014 at 10:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:

For the create case, I'm wondering if we should put the block that
tests for !hmap *before* the _dosmaperr() and check for EEXIST. What
is your opinion?

Either way is okay, but I think the way you are suggesting is better as it
will make code consistent with other place (PGSharedMemoryCreate()).

OK, can you prepare a patch?

Please find attached patch to address this issue.
One minor point to note is that now we have to call GetLastError() twice,
once inside error path and once to check EEXIST, but I think that is okay
as existing code in PGSharedMemoryCreate() does it that way.

OK. I committed this blindly, but I don't have a Windows dev
environment, so please keep an eye on the Windows buildfarm members
and provide follow-on patches if any of them get unhappy about this.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#29Craig Ringer
craig@2ndquadrant.com
In reply to: Robert Haas (#28)
Re: Buildfarm "master-next" branch? (was: Dynamic Shared Memory stuff)

On 04/17/2014 12:08 AM, Robert Haas wrote:

On Tue, Apr 15, 2014 at 10:46 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Apr 16, 2014 at 3:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 15, 2014 at 12:33 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Apr 14, 2014 at 10:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:

For the create case, I'm wondering if we should put the block that
tests for !hmap *before* the _dosmaperr() and check for EEXIST. What
is your opinion?

Either way is okay, but I think the way you are suggesting is better as it
will make code consistent with other place (PGSharedMemoryCreate()).

OK, can you prepare a patch?

Please find attached patch to address this issue.
One minor point to note is that now we have to call GetLastError() twice,
once inside error path and once to check EEXIST, but I think that is okay
as existing code in PGSharedMemoryCreate() does it that way.

OK. I committed this blindly, but I don't have a Windows dev
environment, so please keep an eye on the Windows buildfarm members
and provide follow-on patches if any of them get unhappy about this.

Given that we're doing this a fair bit, is it reasonable to define a
"master-next" branch in git and have the buildfarm (or at least the
Windows members) build that?

Permit master-next to be rebased and reset.

That way it's possible to fire stuff off and see what happens on the
buildfarm without introducing broken commits unnecessarily.

Thoughts?

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#30Robert Haas
robertmhaas@gmail.com
In reply to: Craig Ringer (#29)
Re: Buildfarm "master-next" branch? (was: Dynamic Shared Memory stuff)

On Wed, Apr 16, 2014 at 8:24 PM, Craig Ringer <craig@2ndquadrant.com> wrote:

On 04/17/2014 12:08 AM, Robert Haas wrote:

On Tue, Apr 15, 2014 at 10:46 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Apr 16, 2014 at 3:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 15, 2014 at 12:33 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Apr 14, 2014 at 10:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:

For the create case, I'm wondering if we should put the block that
tests for !hmap *before* the _dosmaperr() and check for EEXIST. What
is your opinion?

Either way is okay, but I think the way you are suggesting is better as it
will make code consistent with other place (PGSharedMemoryCreate()).

OK, can you prepare a patch?

Please find attached patch to address this issue.
One minor point to note is that now we have to call GetLastError() twice,
once inside error path and once to check EEXIST, but I think that is okay
as existing code in PGSharedMemoryCreate() does it that way.

OK. I committed this blindly, but I don't have a Windows dev
environment, so please keep an eye on the Windows buildfarm members
and provide follow-on patches if any of them get unhappy about this.

Given that we're doing this a fair bit, is it reasonable to define a
"master-next" branch in git and have the buildfarm (or at least the
Windows members) build that?

Permit master-next to be rebased and reset.

That way it's possible to fire stuff off and see what happens on the
buildfarm without introducing broken commits unnecessarily.

Thoughts?

In this particular case, I have a lot of confidence that Amit tested
this on his own machine before sending in the patch; and moreover, he
wrote that code in the first place. So it's no worse than it would
have been if that change had been in the originally committed version,
which I didn't personally test on Windows, either, but which has
nevertheless mostly passed buildfarm testing. Arguably, if I'm going
to be hacking on platform-dependent things very often, I should get my
own Windows build environment set up so that I can test it myself, but
it hasn't quite been worth it to me thus far, and Amit has proven to
be pretty reliable in terms of getting things right.

In terms of improving the buildfarm infrastructure, the thing I would
most like to have is more frequent runs. It would be great if pushing
a commit to the master repository triggered an immediate build on
every buildfarm animal so that you could see all of the failures in a
short period of time. But that would require more resources for the
buildfarm machines, which are provided on a strictly volunteer basis,
so it's hard to see how to arrange that.

But the ability to easily spin up temporary branches for testing would
also be great. Unfortunately, I suspect that only a minority of the
buildfarm owners would choose to participate, which would make it less
useful, but if we could solve that problem I'd be all in favor of it.
I'm not volunteering to do the work, though.

Honestly, I don't think we have a huge problem here today. Yeah, the
buildfarm turns pretty colors on a fairly regular basis, but those
issues are also generally fixed very quickly. With the unfortunate
exception of the seemingly never-ending stream multixact-related bugs,
a user who took a snapshot of our master branch at a randomly selected
point during the 9.4 development cycle would likely have gotten code
reliable enough to be run in production.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#31Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#30)
Re: Buildfarm "master-next" branch?

On 04/17/2014 09:17 AM, Robert Haas wrote:

In terms of improving the buildfarm infrastructure, the thing I would
most like to have is more frequent runs. It would be great if pushing
a commit to the master repository triggered an immediate build on
every buildfarm animal so that you could see all of the failures in a
short period of time. But that would require more resources for the
buildfarm machines, which are provided on a strictly volunteer basis,
so it's hard to see how to arrange that.

Some buildfarm owners run at pretty high frequency - I know there are
cron jobs on some running every 15 minutes. My own Linux and FBSD
machines run every hour. Windows builds take longer - depending on other
use of resources they can run a couple of hours per branch. Also my two
Windows machines doing buildfarm work are running a total of 5 animals,
so the runs are staggered - on Windows 8 the two animals each run every
3 hours. Note that each run potentially builds all the branches, if
there has been some backported change, and the windows animals are set
up so that if animal A on the same machine is running when animal B's
run time comes around animal B skips it scheduled run. So sometimes you
do have to wait a bit. If someone were to providfe me with a bunch of
nice fast Windows VMs I would set them up with one animal a piece with
frequent runs and we might get a lot better coverage. But I am tapped
out as far as the resources I can provide go.

But the ability to easily spin up temporary branches for testing would
also be great. Unfortunately, I suspect that only a minority of the
buildfarm owners would choose to participate, which would make it less
useful, but if we could solve that problem I'd be all in favor of it.
I'm not volunteering to do the work, though.

The buildfarm's original purpose was to give early warning of
platform-specific problems of code we had *already* decided on. Now
projects morph, so we might decide to do something like this. But we'd
need to think long and hard about it. Postgres has not historically used
short-lived branches. I don't much like Craig's idea of a long-lived
testing branch that we're going to do commits and reverts on. If we're
going to do something like this it would be much better to make some
provision for short-lived topic branches. e.g. say we allowed branches
with names like testme-platformname-featurename, (platformname here
could be a magic "all", or a comma-separated list of names such as
linux, freebsd, windows). Wnen testing is done, we could merge the
branch if the testing worked out OK, or drop it if the testing proved to
be a failure.

There would be some work to make the buildfarm client suitable for this.
And we'd probably need a "testing dashboard" so as to keep the main
dashboard page free of test branch results.

Of course, all this would be done in my copious spare time *cough*. I'm
not sure this would be the best use of it.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#31)
Re: Buildfarm "master-next" branch?

Andrew Dunstan <andrew@dunslane.net> writes:

On 04/17/2014 09:17 AM, Robert Haas wrote:

In terms of improving the buildfarm infrastructure, the thing I would
most like to have is more frequent runs.

IMO the best single thing that could happen for the buildfarm is if
we had more critters (at least twice as many) running a wider variety of
platforms, compilers, and configuration options than there are today.
More frequent runs would come out of that automatically.

... But that would require more resources for the
buildfarm machines, which are provided on a strictly volunteer basis,
so it's hard to see how to arrange that.

I don't think we've tried hard lately to get people to sign up. Maybe
we should ask the -advocacy crew to do something.

But the ability to easily spin up temporary branches for testing would
also be great. Unfortunately, I suspect that only a minority of the
buildfarm owners would choose to participate, which would make it less
useful, but if we could solve that problem I'd be all in favor of it.

... Of course, all this would be done in my copious spare time *cough*. I'm
not sure this would be the best use of it.

I agree that this would not be worth the effort needed to make it happen.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#33Craig Ringer
craig@2ndquadrant.com
In reply to: Tom Lane (#32)
Re: Buildfarm "master-next" branch?

On 04/17/2014 10:38 PM, Tom Lane wrote:

IMO the best single thing that could happen for the buildfarm is if
we had more critters (at least twice as many) running a wider variety of
platforms, compilers, and configuration options than there are today.
More frequent runs would come out of that automatically.

I'll be bringing up a new Windows buildfarm member once I've got a
current project knocked off. It's a pretty fast dedicated Windows Server
2012 box with a wide range of SDKs on it that can do 32-bit and 64-bit
builds.

Should help a little.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#34Jim Nasby
jim@nasby.net
In reply to: Tom Lane (#32)
Re: Buildfarm "master-next" branch?

On 4/17/14, 9:38 AM, Tom Lane wrote:

But the ability to easily spin up temporary branches for testing would

also be great. Unfortunately, I suspect that only a minority of the
buildfarm owners would choose to participate, which would make it less
useful, but if we could solve that problem I'd be all in favor of it.

... Of course, all this would be done in my copious spare time*cough*. I'm
not sure this would be the best use of it.

I agree that this would not be worth the effort needed to make it happen.

There's also a sizeable security risk there, of someone putting something malicious in a branch and then triggering a run from that branch. I suppose that could be overcome if this was purposefully limited to the main git repo that only our core committers had access to, but we'd need to be careful.
--
Jim C. Nasby, Data Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#35Magnus Hagander
magnus@hagander.net
In reply to: Jim Nasby (#34)
Re: Buildfarm "master-next" branch?

On Tue, Apr 29, 2014 at 9:11 PM, Jim Nasby <jim@nasby.net> wrote:

On 4/17/14, 9:38 AM, Tom Lane wrote:

But the ability to easily spin up temporary branches for testing would

also be great. Unfortunately, I suspect that only a minority of the
buildfarm owners would choose to participate, which would make it less
useful, but if we could solve that problem I'd be all in favor of it.

... Of course, all this would be done in my copious spare time*cough*.

I'm

not sure this would be the best use of it.

I agree that this would not be worth the effort needed to make it happen.

There's also a sizeable security risk there, of someone putting something
malicious in a branch and then triggering a run from that branch. I suppose
that could be overcome if this was purposefully limited to the main git
repo that only our core committers had access to, but we'd need to be
careful.

I would suggest a separate repo to keep the main one "clean", but other
than that, yes, it would have to be limited to the same committers as the
rest I think.

It's reasonably easy to set up build environments in containers/jais on
many Unix boxes where that would actually not be a problem (just blow the
whole jail away once the build is complete), but one of the main platforms
that people would want to use this on I bet is Windows, which has no such
facilities AFAIK.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/