Applications that leak connections

Started by Paul Tillotson · about 21 years ago · 20 messages · general
#1Paul Tillotson
pntil@shentel.net

Does anyone have any useful info about dealing with applications which
"leak" connections, eventually causing "connection limit exceeded for
non-superusers"?

Obviously fixing the app is the best choice, but I have a feeling that
this is going to recur again and again, since the app is poorly
maintained but still under active development (the code base is too
large for the developer assigned to it). The developer is likely to
cause connection leaks accidentally in the future.

This problem is also very easy to cause because doing this:

c = sqlconnect("database")
c = sqlconnect("database")

Causes TWO connections to be made, and the first connection made is now
(as far as I can see) impossible to close without closing the
application. The language (Visual Foxpro) does not garbage collect
connections and so any connection whose handle is lost still takes up
one of the slots.
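To make the failure mode concrete outside FoxPro, here is a minimal Python sketch; the Connection class and sqlconnect() below are hypothetical stand-ins for the real driver, not actual APIs:

```python
class Connection:
    """Stand-in for a server-side connection slot."""
    open_count = 0  # how many connections the "server" currently sees

    def __init__(self):
        Connection.open_count += 1
        self.closed = False

    def close(self):
        if not self.closed:
            self.closed = True
            Connection.open_count -= 1

def sqlconnect(dsn):
    """Stand-in for VFP's SQLCONNECT(): every call opens a fresh connection."""
    return Connection()

# The leaky pattern: rebinding c orphans the first handle, and with no
# garbage collection of connections the orphan holds its slot forever.
c = sqlconnect("database")
c = sqlconnect("database")
assert Connection.open_count == 2   # two slots used, one now unreachable

# The safe pattern: close the current handle before reconnecting.
c.close()
c = sqlconnect("database")
assert Connection.open_count == 2   # no further growth; only the old orphan remains
```

Run repeatedly, the leaky pattern grows without bound until the server's connection limit is hit; the close-before-rebind pattern stays flat.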

On the server side, I can detect this condition using ps -A or lsof, but
I have not thought of a good solution for when it is noticed. One
possibility is just to shut down the database and restart it when this
happens. I did some research into whether it is possible to break a
specific TCP connection (this is a Linux 2.6 kernel; platform-specific
tools/commands are OK). For example, could I use hping to break
the connections?

Does anyone know a safe way to shutdown just one backend (if it is idle)?

Another possibility that I can think of is per-IP-address connection
limits. How hard would that be?
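PostgreSQL itself (as of this thread) has no per-IP limit knob, but the bookkeeping such a limit needs is small. A hedged sketch of the admission check a connection pooler or proxy could apply (the class and method names are invented for illustration):

```python
from collections import Counter

class AdmissionControl:
    """Hypothetical per-IP-address connection limiter for a pooler/proxy."""

    def __init__(self, per_ip_limit):
        self.per_ip_limit = per_ip_limit
        self.active = Counter()   # ip -> number of live connections

    def try_connect(self, ip):
        """Admit the connection unless this address already holds its quota."""
        if self.active[ip] >= self.per_ip_limit:
            return False
        self.active[ip] += 1
        return True

    def disconnect(self, ip):
        """Release one slot for this address when its connection closes."""
        if self.active[ip] > 0:
            self.active[ip] -= 1

ac = AdmissionControl(per_ip_limit=2)
assert ac.try_connect("10.0.0.5")
assert ac.try_connect("10.0.0.5")
assert not ac.try_connect("10.0.0.5")   # third connection from same host refused
assert ac.try_connect("10.0.0.9")       # other clients are unaffected
```

A leaky client then exhausts only its own quota instead of the server-wide limit.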

Paul Tillotson

#2Jonel Rienton
jonel@road14.com
In reply to: Paul Tillotson (#1)
Re: Applications that leak connections

You are perfectly right: fixing the source of the problem is the best way to
fix this. I always make it a habit to close my connections as soon as I'm
done with them. If FoxPro won't do it for you, you have to manage your own
resources, which is always good practice for unmanaged applications.

I suggest you go through your code for every line that opens a connection and
close each one where it's no longer needed.

Good luck.

Regards,



--
Jonel Rienton
http://blogs.road14.com
Software Developer, *nix Advocate

#3Neil Conway
neilc@samurai.com
In reply to: Paul Tillotson (#1)
Re: Applications that leak connections

Paul Tillotson wrote:

Does anyone know a safe way to shutdown just one backend

Sending it a SIGTERM via kill(1) should be safe.

-Neil
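For illustration, the effect of kill(1) on exactly one process can be sketched in Python; here a sleeping child process stands in for the idle backend (finding the real backend's PID, e.g. with ps or pg_stat_activity, is left to the admin):

```python
import os
import signal
import subprocess

# Spawn a long-lived child to play the role of an idle backend.
child = subprocess.Popen(["sleep", "60"])

# Equivalent of running `kill -TERM <pid>` from the shell: only this
# one process receives the signal; nothing else is touched.
os.kill(child.pid, signal.SIGTERM)
child.wait()

# subprocess reports death-by-signal as a negative return code.
assert child.returncode == -signal.SIGTERM
```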

#4Thomas F.O'Connell
tfo@sitening.com
In reply to: Neil Conway (#3)
Safely Killing Backends (Was: Applications that leak connections)

Is there any stronger medicine available (for instance, when a
backend won't respond to SIGTERM) that has no unfortunate side effects?
I just ran into this situation the other day (and made the unfortunate
discovery that SIGABRT is as bad as SIGKILL as far as the postmaster is
concerned).

-tfo

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005


#5Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Thomas F.O'Connell (#4)
Re: Safely Killing Backends (Was: Applications that leak connections)

On Fri, Feb 04, 2005 at 01:44:10PM -0600, Thomas F.O'Connell wrote:

Is there any stronger medicine that's available (for instance, when a
backend won't respond to SIGTERM) and has no unfortunate side effects?
I just ran into this situation the other day (and made the unfortunate
discovery that SIGABRT is as bad as SIGKILL as far as a postmaster is
concerned).

As soon as a backend dies an unnatural death, postmaster will rightly
consider that it may have corrupted the shared state. In turn
postmaster will kill all its children mercilessly so they don't spread
the disease.

Even SIGTERM can have bad consequences if it arrives at the wrong time.
(That's why a function to close a remote connection was rejected.)

So, short answer: no.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Virtue is the just mean between two defects" (Aristotle)

#6Thomas F.O'Connell
tfo@sitening.com
In reply to: Alvaro Herrera (#5)
Re: Safely Killing Backends (Was: Applications that leak connections)

Which brings up a follow-up question: is it documented anywhere exactly
what goes on in recovery mode? If so, I've not found it.

When I've experienced this, it has seemed quicker just to stop and
restart postgres than to let recovery mode complete. Is that unsafe?

-tfo



#7Jim Wilson
jimw@kelcomaine.com
In reply to: Thomas F.O'Connell (#6)
Re: Safely Killing Backends (Was: Applications that leak connections)


This could be better than it is now, however. Management would be easier
if there were a way to trigger a series of behaviors on a given signal to a
child:

The child (1) cancels and rolls back any transaction it has open, and
(2) enters a mode where it attempts to communicate with the client
and, failing that, does an orderly connection close.

I would never go back to them, but I can say that the Sybase SQL Studio
servers were much easier to manage in this regard. If you are not very
careful about how you handle orphaned connections in Postgres you will
likely lose data... not "maybe" like a long shot, but "likely". I suppose
if your data is fairly static (e.g. a website CMS) then this would not
happen often, but anything with a lot of transactions will see it.

The best protection is to do extensive testing with any application you
use or develop, but it's not possible for everyone to do a sufficient
amount of testing to avoid some of these issues.

If I were submitting patches for Postgres I'd push a little harder; this
problem would be at the top of my list of things to fix.

Best regards,

Jim Wilson

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Wilson (#7)
Re: Safely Killing Backends (Was: Applications that leak connections)

Jim Wilson <jimw@kelcomaine.com> writes:

If you are not very careful about how you handle orphaned connections
in Postgres you will likely lose data....not "maybe" like a long
shot...but "likely".

[ raised eyebrow ... ] Say again? I don't know of any reason why a
lost connection would cause loss of (successfully committed) transactions.
Not even if a DBA with an itchy "kill -9" trigger finger is in charge of
cleaning up the lost connections. Please describe the scenarios you've
had problems with.

regards, tom lane

#9Jim Wilson
jimw@kelcomaine.com
In reply to: Tom Lane (#8)
Re: Safely Killing Backends (Was: Applications that leak connections)


Well, a couple of things... One is that I'm talking about 7.3.x. We'll be
moving our servers up to 7.4.x before the spring, but that's where
these observations have been made, and maybe there are certain issues at
that level. It is sometimes difficult to track transaction-related issues
down anyway, but I can say that in testing and earlier deployment we
saw some things that did not look good, and in practice we are very
careful dealing with lost connection issues. No kill -9 trigger
fingers.

Rather than getting into the raised eyebrow thing ;-), I'd suggest
checking your "qualifiers". Consider that with Postgres, if killing a
single connection brings the whole server down, you will lose _all_
uncommitted data. If you did not, then I would call that a bug. The
weakness is not in the data integrity (directly), it is in the integrity
of the server processes and their manageability.

Best regards,

Jim

#10Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Jim Wilson (#9)
Re: Safely Killing Backends (Was: Applications that leak connections)

On Fri, Feb 04, 2005 at 05:01:43PM -0500, Jim Wilson wrote:

Rather than getting into the raised eyebrow thing ;-), I'd suggest
checking your "qualifiers". Consider that with Postgres, if killing a
single connection brings the whole server down, you will lose _all_
uncommitted data. If you did not, then I would call that a bug. The
weakness is not in the data integrity (directly), it is in the
integrity of the server processes and their manageability.

Are you saying that your applications regularly leave uncommitted
transactions open for long periods of time? That sounds like bugs in your
applications to me.

Maybe I didn't get the part about lost connections. Do you mean that
your applications lose connectivity to the server, and thus the
transactions they were working on are lost? If that's the case, then
it certainly sounds dangerous to commit whatever was there; what if the
transaction was incomplete? Of course, if you can't commit it, the only
way to proceed is to roll it back.

What's with the backslashes anyway?

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
And God said: "Let there be Satan, so that people don't blame me for everything."
"And let there be lawyers, so that people don't blame Satan for everything."

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Wilson (#9)
Re: Safely Killing Backends (Was: Applications that leak connections)

Jim Wilson <jimw@kelcomaine.com> writes:

Rather than getting into the raised eyebrow thing ;-), I'd suggest
checking your "qualifiers". Consider that with Postgres, if killing a
single connection brings the whole server down, you will lose _all_
uncommitted data. If you did not, then I would call that a bug.

I would too. So what's your complaint exactly? That kill -9'ing one
backend takes out all your current uncommitted transactions and not just
the one? I don't regard that as data loss: uncommitted is uncommitted.

regards, tom lane

#12Paul Tillotson
pntil@shentel.net
In reply to: Neil Conway (#3)
Re: Applications that leak connections

Correct me if I am wrong, but doesn't the postmaster notice that
something killed a backend and cause all the other ones to roll back?

Paul Tillotson


#13Jim Wilson
jimw@kelcomaine.com
In reply to: Tom Lane (#11)
Re: Safely Killing Backends (Was: Applications that leak connections)

Tom Lane wrote:

I would too. So what's your complaint exactly? That kill -9'ing one
backend takes out all your current uncommitted transactions and not just
the one? I don't regard that as data loss: uncommitted is uncommitted.

regards, tom lane

We'd like to be able to take out a connection without risking the
postmaster going down and thus losing all uncommitted data.

Unfortunately the whole world isn't encapsulated in Postgres
transactions, or we would never have to worry about uncommitted data.
Just because your tables are in sync, and your applications are designed
in a way that coordinates with real-world events and/or multiple-entity
transactions (e.g. electronic payments) as well as possible, doesn't mean
that data integrity isn't put at risk by the loss of uncommitted
information.

This "problem", which is always there for the application designer, is
exacerbated when the whole server goes down because of an admin who, as
delicately as possible, is just trying to remove an orphaned connection
that's blocking transactions. Would I stop using Postgres if this was
never changed? No way. Do I think it could be better than it is?
Absolutely. For our purposes, it would be more important than some
things, like further query optimization, that would be at the top of
other users' lists.

Best regards,

Jim

#14Jim Wilson
jimw@kelcomaine.com
In reply to: Jim Wilson (#13)
Re: Safely Killing Backends (Was: Applications that leak connections)

Alvaro Herrera wrote:

Are you saying that your applications regularly leave uncommitted
transactions for long periods of time? That sounds like bugs in your
applications to me.

I never said that.

<snip>

What's with the backslashes anyway?

Well, I'm beta testing an email client. Good question! :-)

Best,

Jim Wilson

#15Marco Colombo
pgsql@esiway.net
In reply to: Jim Wilson (#13)
Re: Safely Killing Backends (Was: Applications that leak connections)

On Fri, 4 Feb 2005, Jim Wilson wrote:

We'd like to be able to take out a connection without risking the
postmaster going down and thus losing all uncommitted data.

Unfortunately the whole world isn't encapsulated in Postgres
transactions, or we would never have to worry about uncommitted data.
Just because your tables are in sync, and your applications are designed
in a way that coordinates with real-world events and/or multiple-entity
transactions (e.g. electronic payments) as well as possible, doesn't mean
that data integrity isn't put at risk by the loss of uncommitted
information.

I'm sorry, but I cannot make any sense out of the last sentence.
By definition, data integrity refers only to committed data. Uncommitted
data is not data at all, for either party involved. The server may
receive a rollback at any time (which is not a delete!) and the client
may get an error from the server, even at commit time (the last step).
The network may go down in the middle of it. _Anything_ may happen.
If your application is not able to handle a failure _before_ commit time,
you should fix it.

This "problem", which is always there for the application designer, is
exacerbated when the whole server goes down because of an admin who, as
delicately as possible, is just trying to remove an orphaned connection
that's blocking transactions. Would I stop using Postgres if this was
never changed? No way. Do I think it could be better than it is?
Absolutely. For our purposes, it would be more important than some
things, like further query optimization, that would be at the top of
other users' lists.

I think that the most important thing, for your purpose as an "application
designer", is to fix your application. There's no such thing as
"uncommitted data": it's not an intermediate state, it's not some
kind of "less important" data. Let T be the commit time; before T
there's _nothing_ and after T there's _everything_ (all of it).
That's the A in ACID.

Your application should handle failures in the middle of a transaction,
connection failures included, in a graceful but correct way.

I see your point (being able to safely shut a connection down on the
server side), but it's at the _bottom_ of any list.

.TM.
--
____/ ____/ /
/ / / Marco Colombo
___/ ___ / / Technical Manager
/ / / ESI s.r.l.
_____/ _____/ _/ Colombo@ESI.it

#16Jim Wilson
jimw@kelcomaine.com
In reply to: Marco Colombo (#15)
Re: Safely Killing Backends (Was: Applications that leak connections)

Your application should handle failures in the middle of a transaction,
connection failures included, in a graceful but correct way.

It does very well, until the next bug is discovered.

I see your point (being able to safely shut a connection down on the
server side), but it's at the _bottom_ of any list.

That's unfortunate. I've tried to explain my position off list to Marco,
but it really isn't worth debating. FWIW, I think this thread was started
by someone with application issues. The fact is, such things happen.

Unfortunately Marco chooses to speak for "any list", and I'll just
repeat that I find this instability issue the most significant drawback
for Postgres installations. This doesn't mean that there aren't other
areas of priority for other users. And no, I do not want to debate the
meaning of the word "instability". :-)

Best regards,

Jim Wilson

#17Marco Colombo
pgsql@esiway.net
In reply to: Jim Wilson (#16)
Re: Safely Killing Backends (Was: Applications that leak connections)


As I wrote in private mail, authenticated clients have many means to
perform a DoS attack (whether intentionally or not). Most cases can be
handled only with a server restart. To put it simply, PostgreSQL is not
designed to handle hostile clients well.

IMHO, a friendly environment (client behaviour) is a safe assumption
for an RDBMS. It's not its job to paper over application bugs.

Anyway, I agree on ending this thread. I recognize we have different
meanings for "instability" and "data loss".

.TM.

#18Martijn van Oosterhout
kleptog@svana.org
In reply to: Jim Wilson (#16)
Re: Safely Killing Backends (Was: Applications that leak connections)

On Tue, Feb 08, 2005 at 07:31:13AM -0500, Jim Wilson wrote:

That's unfortunate. I've tried to explain my position off list to
Marco, but it really isn't worth debating. FWIW, I think this thread
was started by someone with application issues. The fact is, such
things happen.

Well, I read the thread on pg-hackers [1] about this being a bad idea
currently, and the issue seems to be:

1. The SIGTERM is the same as a FATAL error and this code path has not
been very well tested. Are locks, etc all correctly removed? The only
cases that *are* well tested are cases where these things don't matter.

In other words, it will probably work fine, but it's not so well tested
that the pg hackers are willing to bless a backend function
implementing it.

2. If the backend is so stuck that SIGTERM isn't working, then I guess
that's a bug but not enough examples have been collected to work out
the problem. In this case you probably can't exit without considering
the shared memory corrupt.

3. In theory it would be nice to have a "cancel then exit" signal, but
we're clean out of signal numbers.

4. It appears the original person had a problem with not tracking used
resources properly in a language that neither garbage-collects nor
reference-counts. If you know you only ever want to open one connection
you can solve this problem by creating an open_connection function
which checks a global variable to see if a connection has already been
opened and returns the same one if it has.
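Point 4 can be sketched in a few lines; this is an illustrative Python version of such a guard (connect() below is a placeholder for the real driver call, not an actual API):

```python
_connection = None  # module-level slot: the "global variable" holding the one handle

def connect(dsn):
    """Placeholder for the real driver's connect call."""
    return object()

def open_connection(dsn):
    """Open the connection on first use; hand back the same one afterwards."""
    global _connection
    if _connection is None:
        _connection = connect(dsn)
    return _connection

a = open_connection("database")
b = open_connection("database")
assert a is b   # repeated calls can never leak a second connection
```

Because every caller goes through the one opener, losing a handle variable no longer strands a server slot.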

Unfortunately Marco chooses to speak for "any list", and I'll just
repeat that I find this instability issue the most significant
drawback for Postgres installations. This doesn't mean that there
aren't other areas of priority for other users. And no, I do not
want to debate the meaning of the word "instability". :-)

I guess it appears on the list of anybody who regularly deals with this
problem. That list appears to be mutually exclusive with anyone who can
fix it...

I wonder how one would test the SIGTERM path anyway... To quote Tom
Lane on the chances of corruption [2]:

Not only wouldn't I give you those odds today, but I don't think we
could ever get to the point of saying that session kill is that
reliable, at least not from our ordinary methods of field testing.
It'd require significant focused code review and testing to acquire
such confidence, and continuing effort to make sure we didn't break
it again in the future.

If we had infinite manpower I'd be happy to delegate a developer or
three to stay on top of this particular issue. But we don't :-(

I don't know if PostgreSQL has ever had the concept of bounties for
stuff. It's an interesting idea...

[1]: http://archives.postgresql.org/pgsql-patches/2004-07/msg00457.php
[2]: http://archives.postgresql.org/pgsql-patches/2004-07/msg00480.php

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

#19Vivek Khera
khera@yertle.int.kciLink.com
In reply to: Paul Tillotson (#1)
Re: Safely Killing Backends

"TFO" == Thomas F O'Connell <tfo@sitening.com> writes:

TFO> Which brings up a follow-up question: is it documented anywhere
TFO> exactly what goes on in recovery mode? If so, I've not found it.

TFO> When I've experienced this, it has seemed quicker just to stop and
TFO> restart postgres than to let recovery mode complete. Is that unsafe?

The recovery has to happen at some point. What it is doing is
bringing your DB to a known valid state based on the committed
transactions, what's written to disk, and what's written to the
pg_xlog files. A vacuum is probably in good order after this happens.

Effectively, it does the moral equivalent of unplugging the power cord
and restarting itself, without the bother of needing to reboot the
whole machine :-)

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D. Khera Communications, Inc.
Internet: khera@kciLink.com Rockville, MD +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/

#20Thomas F.O'Connell
tfo@sitening.com
In reply to: Vivek Khera (#19)
Re: Safely Killing Backends

My point/question is: when I've seen this before and recovery mode is
taking on the order of minutes, I find that doing:

pg_ctl stop -m i
pg_ctl start

is faster (on the order of seconds) than letting postgres finish
recovery mode. So I wonder:

1. Is this safe from a data integrity point of view?
2. Why is it faster?

Maybe the difference in time I've experienced is partially a result of
the number of connections that come in (via PHP) during recovery mode,
so recovering takes longer because the database is still in multi-user
mode and receiving connections even if not fulfilling requests?

-tfo

