idle_in_transaction_timeout

Started by Vik Fearingalmost 12 years ago102 messageshackers
Jump to latest
#1Vik Fearing
vik@postgresfriends.org

This patch implements a timeout for broken clients that idle in transaction.

This is a TODO item.

When the timeout occurs, the backend commits suicide with a FATAL
ereport. I thought about just aborting the transaction to free the
locks but decided that the client is clearly broken so might as well
free up the connection as well.

The same behavior can be achieved by an external script that monitors
pg_stat_activity but by having it in core we get much finer tuning (it
is session settable) and also traces of it directly in the PostgreSQL
logs so it can be graphed by things like pgbadger.

Unfortunately, no notification is sent to the client because there's no
real way to do that right now.

--
Vik

Attachments:

iitt.v1.patchtext/x-diff; name=iitt.v1.patchDownload+67-14
#2Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Vik Fearing (#1)
Re: idle_in_transaction_timeout

At 2014-06-03 15:06:11 +0200, vik.fearing@dalibo.com wrote:

This patch implements a timeout for broken clients that idle in
transaction.

I think this is a nice feature, but I suggest that (at the very least)
the GUC should be named "idle_transaction_timeout".

+        <para>
+         Terminate any session that is idle in transaction for longer than the specified
+         number of seconds.  This not only allows any locks to be released, but also allows
+         the connection slot to be reused.  However, aborted idle in transaction sessions
+         are not affected.  A value of zero (the default) turns this off.
+        </para>

I suggest:

Terminate any session with an open transaction that has been idle
for longer than the specified duration in seconds. This allows any
locks to be released and the connection slot to be reused.

It's not clear to me what "However, aborted idle in transaction sessions
are not affected" means.

The default value of 0 means that such sessions will not be
terminated.

-- Abhijit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Vik Fearing
vik@postgresfriends.org
In reply to: Abhijit Menon-Sen (#2)
Re: idle_in_transaction_timeout

On 06/03/2014 03:30 PM, Abhijit Menon-Sen wrote:

At 2014-06-03 15:06:11 +0200, vik.fearing@dalibo.com wrote:

This patch implements a timeout for broken clients that idle in
transaction.

I think this is a nice feature, but I suggest that (at the very least)
the GUC should be named "idle_transaction_timeout".

I prefer the way I have it, but not enough to put up a fight if other
people like your version better.

+        <para>
+         Terminate any session that is idle in transaction for longer than the specified
+         number of seconds.  This not only allows any locks to be released, but also allows
+         the connection slot to be reused.  However, aborted idle in transaction sessions
+         are not affected.  A value of zero (the default) turns this off.
+        </para>

I suggest:

Terminate any session with an open transaction that has been idle
for longer than the specified duration in seconds. This allows any
locks to be released and the connection slot to be reused.

It's not clear to me what "However, aborted idle in transaction sessions
are not affected" means.

The default value of 0 means that such sessions will not be
terminated.

How about this?

Terminate any session with an open transaction that has been idle
for longer than the specified duration in seconds. This allows any
locks to be released and the connection slot to be reused.

Sessions in the state "idle in transaction (aborted)" occupy a
connection slot but because they hold no locks, they are not
considered by this parameter.

The default value of 0 means that such sessions will not be
terminated.
--
Vik

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: Vik Fearing (#3)
Re: idle_in_transaction_timeout

Vik Fearing wrote

On 06/03/2014 03:30 PM, Abhijit Menon-Sen wrote:

At 2014-06-03 15:06:11 +0200,

vik.fearing@

wrote:

This patch implements a timeout for broken clients that idle in
transaction.

I think this is a nice feature, but I suggest that (at the very least)
the GUC should be named "idle_transaction_timeout".

I prefer the way I have it, but not enough to put up a fight if other
people like your version better.

+

<para>

+         Terminate any session that is idle in transaction for longer
than the specified
+         number of seconds.  This not only allows any locks to be
released, but also allows
+         the connection slot to be reused.  However, aborted idle in
transaction sessions
+         are not affected.  A value of zero (the default) turns this
off.
+        

</para>

I suggest:

Terminate any session with an open transaction that has been idle
for longer than the specified duration in seconds. This allows any
locks to be released and the connection slot to be reused.

It's not clear to me what "However, aborted idle in transaction sessions
are not affected" means.

The default value of 0 means that such sessions will not be
terminated.

How about this?

Terminate any session with an open transaction that has been idle
for longer than the specified duration in seconds. This allows any
locks to be released and the connection slot to be reused.

Sessions in the state "idle in transaction (aborted)" occupy a
connection slot but because they hold no locks, they are not
considered by this parameter.

The default value of 0 means that such sessions will not be
terminated.
--
Vik

I do not see any reason for an aborted idle in transaction session to be
treated differently.

Given the similarity to "statement_timeout" using similar wording for both
would be advised.

*Statement Timeout:*
"Abort any statement that takes more than the specified number of
milliseconds, starting from the time the command arrives at the server from
the client. If log_min_error_statement is set to ERROR or lower, the
statement that timed out will also be logged. A value of zero (the default)
turns this off.

Setting statement_timeout in postgresql.conf is not recommended because it
would affect all sessions."

*Idle Transaction Timeout:*

Disconnect any session that has been "idle in transaction" (including
aborted) for more than the specified number of milliseconds, starting from
{however such is determined}.

A value of zero (the default) turns this off.

Typical usage would be to set this to a small positive number in
postgresql.conf and require sessions that expect long-running, idle,
transactions to set it back to zero or some reasonable higher value.

While seconds is the better unit of measure I don't think that justifies
making this different from statement_timeout. In either case users can
specify their own units.

Since we are killing the entire session, not just the transaction, a better
label would:

idle_transaction_session_timeout

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/idle-in-transaction-timeout-tp5805859p5805904.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Josh Berkus
josh@agliodbs.com
In reply to: Vik Fearing (#1)
Re: idle_in_transaction_timeout

Vik,

When the timeout occurs, the backend commits suicide with a FATAL
ereport. I thought about just aborting the transaction to free the
locks but decided that the client is clearly broken so might as well
free up the connection as well.

Out of curiosity, how much harder would it have been just to abort the
transaction? I think breaking the connection is probably the right
behavior, but before folks start arguing it out, I wanted to know if
aborting the transaction is even a reasonable thing to do.

Argument in favor of breaking the connection: most of the time, IITs are
caused by poor error-handling or garbage-collection code on the client
side, which has already abandoned the application session but hadn't let
go of the database handle. Since the application is never going to
reuse the handle, terminating it is the right thing to do.

Argument in favor of just aborting the transaction: client connection
management software may not be able to deal cleanly with the broken
connection and may halt operation completely, especially if the client
finds out the connection is gone when they try to start their next
transaction on that connection.

My $0.019999999999998: terminating the connection is the preferred behavior.

The same behavior can be achieved by an external script that monitors
pg_stat_activity but by having it in core we get much finer tuning (it
is session settable) and also traces of it directly in the PostgreSQL
logs so it can be graphed by things like pgbadger.

Unfortunately, no notification is sent to the client because there's no
real way to do that right now.

Well, if you abort the connection, presumably the client finds out when
they try to send a query ...

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#5)
Re: idle_in_transaction_timeout

Josh Berkus <josh@agliodbs.com> writes:

Out of curiosity, how much harder would it have been just to abort the
transaction? I think breaking the connection is probably the right
behavior, but before folks start arguing it out, I wanted to know if
aborting the transaction is even a reasonable thing to do.

FWIW, I think aborting the transaction is probably better, especially
if the patch is designed to do nothing to already-aborted transactions.
If the client is still there, it will see the abort as a failure in its
next query, which is less likely to confuse it completely than a
connection loss. (I think, anyway.)

The argument that we might want to close the connection to free up
connection slots doesn't seem to me to hold water as long as we allow
a client that *isn't* inside a transaction to sit on an idle connection
forever. Perhaps there is room for a second timeout that limits how
long you can sit idle independently of being in a transaction, but that
isn't this patch.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Josh Berkus (#5)
Re: idle_in_transaction_timeout

Josh Berkus wrote:

Argument in favor of breaking the connection: most of the time, IITs are
caused by poor error-handling or garbage-collection code on the client
side, which has already abandoned the application session but hadn't let
go of the database handle. Since the application is never going to
reuse the handle, terminating it is the right thing to do.

In some frameworks where garbage-collection is not involved with things
like this, what happens is that the connection object is reused later.
If you just drop the connection, the right error message will never
reach the application, but if you abort the xact, the next BEGIN issued
by the framework will receive the "connection aborted due to
idle-in-transaction session" message, which I assume is what this patch
would have sent.

Therefore I think if we want this at all it should abort the xact, not
drop the connection.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#6)
Re: idle_in_transaction_timeout

On 06/03/2014 05:53 PM, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

Out of curiosity, how much harder would it have been just to abort the
transaction? I think breaking the connection is probably the right
behavior, but before folks start arguing it out, I wanted to know if
aborting the transaction is even a reasonable thing to do.

FWIW, I think aborting the transaction is probably better, especially
if the patch is designed to do nothing to already-aborted transactions.
If the client is still there, it will see the abort as a failure in its
next query, which is less likely to confuse it completely than a
connection loss. (I think, anyway.)

The argument that we might want to close the connection to free up
connection slots doesn't seem to me to hold water as long as we allow
a client that *isn't* inside a transaction to sit on an idle connection
forever. Perhaps there is room for a second timeout that limits how
long you can sit idle independently of being in a transaction, but that
isn't this patch.

Yes, I had the same thought.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Josh Berkus
josh@agliodbs.com
In reply to: Vik Fearing (#1)
Re: idle_in_transaction_timeout

On 06/03/2014 02:53 PM, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

Out of curiosity, how much harder would it have been just to abort the
transaction? I think breaking the connection is probably the right
behavior, but before folks start arguing it out, I wanted to know if
aborting the transaction is even a reasonable thing to do.

FWIW, I think aborting the transaction is probably better, especially
if the patch is designed to do nothing to already-aborted transactions.
If the client is still there, it will see the abort as a failure in its
next query, which is less likely to confuse it completely than a
connection loss. (I think, anyway.)

Personally, I think we'll get about equal amounts of confusion either way.

The argument that we might want to close the connection to free up
connection slots doesn't seem to me to hold water as long as we allow
a client that *isn't* inside a transaction to sit on an idle connection
forever. Perhaps there is room for a second timeout that limits how
long you can sit idle independently of being in a transaction, but that
isn't this patch.

Like I said, I'm marginally in favor of terminating the connection, but
I'd be completely satisfied with a timeout which aborted the transaction
instead. Assuming that change doesn't derail this patch and keep it
from getting into 9.5, of course.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Vik Fearing
vik@postgresfriends.org
In reply to: Josh Berkus (#9)
Re: idle_in_transaction_timeout

On 06/04/2014 01:38 AM, Josh Berkus wrote:

On 06/03/2014 02:53 PM, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

Out of curiosity, how much harder would it have been just to abort the
transaction? I think breaking the connection is probably the right
behavior, but before folks start arguing it out, I wanted to know if
aborting the transaction is even a reasonable thing to do.

FWIW, I think aborting the transaction is probably better, especially
if the patch is designed to do nothing to already-aborted transactions.
If the client is still there, it will see the abort as a failure in its
next query, which is less likely to confuse it completely than a
connection loss. (I think, anyway.)

Personally, I think we'll get about equal amounts of confusion either way.

The argument that we might want to close the connection to free up
connection slots doesn't seem to me to hold water as long as we allow
a client that *isn't* inside a transaction to sit on an idle connection
forever. Perhaps there is room for a second timeout that limits how
long you can sit idle independently of being in a transaction, but that
isn't this patch.

Like I said, I'm marginally in favor of terminating the connection, but
I'd be completely satisfied with a timeout which aborted the transaction
instead. Assuming that change doesn't derail this patch and keep it
from getting into 9.5, of course.

The change is as simple as changing the ereport from FATAL to ERROR.

Attached is a new patch doing it that way.

--
Vik

Attachments:

iitt.v2.patchtext/x-diff; name=iitt.v2.patchDownload+69-14
#11David G. Johnston
david.g.johnston@gmail.com
In reply to: Josh Berkus (#9)
Re: idle_in_transaction_timeout

On Tue, Jun 3, 2014 at 7:40 PM, Josh Berkus [via PostgreSQL] <
ml-node+s1045698n5805933h47@n5.nabble.com> wrote:

On 06/03/2014 02:53 PM, Tom Lane wrote:

Josh Berkus <[hidden email]

<http://user/SendEmail.jtp?type=node&amp;node=5805933&amp;i=0&gt;&gt; writes:

Out of curiosity, how much harder would it have been just to abort the
transaction? I think breaking the connection is probably the right
behavior, but before folks start arguing it out, I wanted to know if
aborting the transaction is even a reasonable thing to do.

FWIW, I think aborting the transaction is probably better, especially
if the patch is designed to do nothing to already-aborted transactions.
If the client is still there, it will see the abort as a failure in its
next query, which is less likely to confuse it completely than a
connection loss. (I think, anyway.)

Personally, I think we'll get about equal amounts of confusion either way.

The argument that we might want to close the connection to free up
connection slots doesn't seem to me to hold water as long as we allow
a client that *isn't* inside a transaction to sit on an idle connection
forever. Perhaps there is room for a second timeout that limits how
long you can sit idle independently of being in a transaction, but that
isn't this patch.

Like I said, I'm marginally in favor of terminating the connection, but
I'd be completely satisfied with a timeout which aborted the transaction
instead. Assuming that change doesn't derail this patch and keep it
from getting into 9.5, of course.

​Default to dropping the connection but give the usersadministrators the
capability to decide for themselves?

I still haven't heard an argument for why this wouldn't apply to aborted
idle-in-transactions. I get the focus in on releasing locks but if the
transaction fails but still hangs around forever it is just as broken as
one that doesn't fail and hangs around forever. Even if you limit the end
result to only aborting the transaction the end-user will likely want to
distinguish between their transaction failing and their failed transaction
remaining idle too long - if only to avoid the situation where they make
the transaction no longer fail but still hit the timeout.

Whether a connection is a resource the DBA wants to restore (at the expense
of better server-client communication) is a parental decision we shouldn't
force on them given how direct (feelings about GUCs aside). The decision
to force the release of locks - the primary purpose of the patch - is made
by simply using the setting in the first place.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/idle-in-transaction-timeout-tp5805859p5805936.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

#12Vik Fearing
vik@postgresfriends.org
In reply to: David G. Johnston (#11)
Re: idle_in_transaction_timeout

On 06/04/2014 02:01 AM, David G Johnston wrote:

​Default to dropping the connection but give the usersadministrators
the capability to decide for themselves?

Meh.

I still haven't heard an argument for why this wouldn't apply to
aborted idle-in-transactions. I get the focus in on releasing locks
but if the transaction fails but still hangs around forever it is just
as broken as one that doesn't fail and hangs around forever.

My main concern was with locks and blocking VACUUM. Aborted
transactions don't do either of those things. The correct solution is
to terminate aborted transaction, too, or not terminate anything and
abort the idle ones.

Even if you limit the end result to only aborting the transaction the
end-user will likely want to distinguish between their transaction
failing and their failed transaction remaining idle too long - if only
to avoid the situation where they make the transaction no longer fail
but still hit the timeout.

But hitting the timeout *is* failing.

With the new patch, the first query will say that the transaction was
aborted due to timeout. Subsequent queries will do as they've always done.

--
Vik

#13Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#6)
Re: idle_in_transaction_timeout

On Tue, Jun 3, 2014 at 5:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Josh Berkus <josh@agliodbs.com> writes:

Out of curiosity, how much harder would it have been just to abort the
transaction? I think breaking the connection is probably the right
behavior, but before folks start arguing it out, I wanted to know if
aborting the transaction is even a reasonable thing to do.

FWIW, I think aborting the transaction is probably better, especially
if the patch is designed to do nothing to already-aborted transactions.
If the client is still there, it will see the abort as a failure in its
next query, which is less likely to confuse it completely than a
connection loss. (I think, anyway.)

I thought the reason why this hasn't been implemented before now is
that sending an ErrorResponse to the client will result in a loss of
protocol sync. Sure, when the client sends the next query, they'll
then read the ErrorResponse. So far so good. The problem is that
they *won't* read whatever we send back as a response to their query,
because they think they already have, when in reality they've only
read the ErrorResponse sent much earlier. This, at least as I've
understood it, is the reason why recovery conflicts kill off idle
sessions entirely instead of just aborting the transaction. Andres
tried to fix that problem a few years ago without much luck; the most
relevant post to this particular issue seems to be:

/messages/by-id/23631.1292521603@sss.pgh.pa.us

Assuming that the state of play hasn't changed in some way I'm unaware
of since 2010, I think the best argument for FATAL here is that it's
what we can implement. I happen to think it's better anyway, because
the cases I've seen where this would actually be useful involve
badly-written applications that are not under the same administrative
control as the database and supposedly have built-in connection
poolers, except sometimes they seem to forget about some of the
connections they themselves opened. The DBAs can't make the app
developers fix the app; they just have to try to survive. Aborting
the transaction is a step in the right direction but since the guy at
the other end of the connection is actually or in effect dead, that
just leaves you with an eternally idle connection. Now we can say "use
TCP keepalives for that" but that only helps if the connection has
actually been severed; if the guy at the other end is still
technically there but is too brain-damaged to actually try to *use*
the connection for anything, it doesn't help. Also, as I recently
discovered, there are still a few machines out there that don't
actually support TCP keepalives on a per-connection basis; you can
only configure it system-wide, and that's not always granular enough.

Anyway, if somebody really wants to go to the trouble of finding a way
to make this work without killing off the connection, I think that
would be cool and useful and whatever technology we develop there
could doubtless could be applied to other situations. But I have a
nervous feeling that might be a hard enough problem to sink the whole
patch, which would be a shame, since the cases I've actually
encountered would be better off with FATAL anyway.

Just my $0.019999999999997 to go with Josh's $0.019999999999998.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#13)
Re: idle_in_transaction_timeout

Robert Haas <robertmhaas@gmail.com> writes:

I thought the reason why this hasn't been implemented before now is
that sending an ErrorResponse to the client will result in a loss of
protocol sync.

Hmm ... you are right that this isn't as simple as an ereport(ERROR),
but I'm not sure it's impossible. We could for instance put the backend
into skip-till-Sync state so that it effectively ignored the next command
message. Causing that to happen might be impracticably messy, though.

I'm not sure whether cancel-transaction behavior is enough better than
cancel-session to warrant extra effort here.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#14)
Re: idle_in_transaction_timeout

On 2014-06-03 23:37:28 -0400, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I thought the reason why this hasn't been implemented before now is
that sending an ErrorResponse to the client will result in a loss of
protocol sync.

Hmm ... you are right that this isn't as simple as an ereport(ERROR),
but I'm not sure it's impossible. We could for instance put the backend
into skip-till-Sync state so that it effectively ignored the next command
message. Causing that to happen might be impracticably messy, though.

Isn't another problem here that we're reading from the client, possibly
nested inside openssl? So we can't just longjmp out without possibly
destroying openssl's internal state?
I think that's solveable by first returning from openssl, signalling a
short read, and only *then* jumping out. I remember making something
like that work in a POC patch at least... But it's been a couple of years.

I'm not sure whether cancel-transaction behavior is enough better than
cancel-session to warrant extra effort here.

I don't think idle_in_transaction_timeout is worth this, but if we could
implement it we could finally have recovery conflicts against idle in
txn sessions not use FATAL...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#14)
Re: idle_in_transaction_timeout

On Tue, Jun 3, 2014 at 11:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I thought the reason why this hasn't been implemented before now is
that sending an ErrorResponse to the client will result in a loss of
protocol sync.

Hmm ... you are right that this isn't as simple as an ereport(ERROR),
but I'm not sure it's impossible. We could for instance put the backend
into skip-till-Sync state so that it effectively ignored the next command
message. Causing that to happen might be impracticably messy, though.

Another thing we could maybe do is AbortCurrentTransaction() and send
the client a NoticeResponse saying "hey, expect all of your future
commands to fail with complaints about the transaction being aborted".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#16)
Re: idle_in_transaction_timeout

Robert Haas <robertmhaas@gmail.com> wrote:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I thought the reason why this hasn't been implemented before
now is that sending an ErrorResponse to the client will result
in a loss of protocol sync.

Hmm ... you are right that this isn't as simple as an
ereport(ERROR), but I'm not sure it's impossible.  We could for
instance put the backend into skip-till-Sync state so that it
effectively ignored the next command message.  Causing that to
happen might be impracticably messy, though.

Another thing we could maybe do is AbortCurrentTransaction() and
send the client a NoticeResponse saying "hey, expect all of your
future commands to fail with complaints about the transaction
being aborted".

Wow, until I read through the thread on this patch and the old
thread that Robert linked to I had forgotten the attempt by Andres
to deal with this three and a half years ago.  That patch died
because of how complicated it was to do the right thing if the
connection was left open.

Personally, where I have seen a use for this, treating it as FATAL
would be better anyway; but was OK with treating it as an ERROR if
that didn't sink the patch.  I guess that puts me in the same camp
as Josh and Robert.

Vik has submitted v1 and v2 of this patch; the only difference
between them is that v1 makes a timeout FATAL and v2 makes it an
ERROR (with appropriate corresponding changes to code comments,
documentation, and message text).  It's not hard to show the
problems with an ERROR under v2, confirming that the simple
approach of just changing the ereport severity is not enough:

test=# set idle_in_transaction_timeout = '3s';
SET
test=# begin;
BEGIN
test=# select 1;
ERROR:  current transaction is aborted due to idle-in-transaction timeout
test=# commit;
ERROR:  current transaction is aborted, commands ignored until end of transaction block
test=# commit;
ROLLBACK

The first thing is that I don't think a delay between the BEGIN and
the SELECT should cause a timeout to trigger, but more importantly
there should not be two ERROR responses to one SELECT statement.

I'm inclined to abandon the ERROR approach as not worth the effort
and fragility, and focus on v1 of the patch.  If we can't get to
consensus on that, I think that this patch should be flagged
"Returned with Feedback", noting that any follow-up version
requires some way to deal with the issues raised regarding multiple
ERROR messages.

Abridged for space. hopefully without losing the main points of the
authors, so far:

Josh Berkus:

My $0.019999999999998: terminating the connection is the
preferred behavior.

Tom Lane:

FWIW, I think aborting the transaction is probably better,
especially if the patch is designed to do nothing to
already-aborted transactions.  If the client is still there, it
will see the abort as a failure in its next query, which is less
likely to confuse it completely than a connection loss.  (I
think, anyway.)

Álvaro Herrera:

I think if we want this at all it should abort the xact, not drop
the connection.

Andrew Dunstan:

[quotes Tom's argument]
Yes, I had the same thought.

David G Johnston:

Default to dropping the connection but give the
usersadministrators the capability to decide for themselves?

Robert Haas:

Assuming that the state of play hasn't changed in some way I'm
unaware of since 2010, I think the best argument for FATAL here
is that it's what we can implement.  I happen to think it's
better anyway, because the cases I've seen where this would
actually be useful involve badly-written applications that are
not under the same administrative control as the database and
supposedly have built-in connection poolers, except sometimes
they seem to forget about some of the connections they themselves
opened.  The DBAs can't make the app developers fix the app; they
just have to try to survive.  Aborting the transaction is a step
in the right direction but since the guy at the other end of the
connection is actually or in effect dead, that just leaves you
with an eternally idle connection.

Tom Lane (reprise):

I'm not sure whether cancel-transaction behavior is enough better
than cancel-session to warrant extra effort here.

Andres Freund:

[quotes Tom's latest (above)]
I don't think idle_in_transaction_timeout is worth this, but if
we could implement it we could finally have recovery conflicts
against idle in txn sessions not use FATAL...

Kevin Grittner:

Personally, where I have seen a use for this, treating it as
FATAL would be better anyway

A few ideas were put forward on how a much more complicated patch
could allow transaction rollback with an ERROR work acceptably, but
I think that would probably need to be a new patch in a later CF,
so that is omitted here.

So, can we agree to use FATAL to terminate the connection?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Josh Berkus
josh@agliodbs.com
In reply to: Vik Fearing (#1)
Re: idle_in_transaction_timeout

On 06/18/2014 11:50 AM, Kevin Grittner wrote:

The first thing is that I don't think a delay between the BEGIN and
the SELECT should cause a timeout to trigger, but more importantly
there should not be two ERROR responses to one SELECT statement.

I do think a delay between BEGIN and SELECT should trigger the timeout.
There are plenty of badly-written applications which "auto-begin", that
is, they issue a "BEGIN;" immediately after every "COMMIT;" whether or
not there's any additional work to do. This is a major source of IIT
and the timeout should not ignore it.

I'm inclined to abandon the ERROR approach as not worth the effort
and fragility, and focus on v1 of the patch. If we can't get to
consensus on that, I think that this patch should be flagged
"Returned with Feedback", noting that any follow-up version
requires some way to deal with the issues raised regarding multiple
ERROR messages.

+1

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#18)
Re: idle_in_transaction_timeout

Josh Berkus <josh@agliodbs.com> writes:

There are plenty of badly-written applications which "auto-begin", that
is, they issue a "BEGIN;" immediately after every "COMMIT;" whether or
not there's any additional work to do. This is a major source of IIT
and the timeout should not ignore it.

Nonsense. We explicitly don't do anything useful until the first actual
command arrives, precisely to avoid that problem.

It might be that we should slap such apps' wrists anyway, but given
that we've gone to the trouble of working around the behavior at the
system structural level, I'd be inclined to say not. What you'd be
doing is preventing people who have to deal with such apps from using
the timeout in any useful fashion.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Josh Berkus
josh@agliodbs.com
In reply to: Vik Fearing (#1)
Re: idle_in_transaction_timeout

On 06/18/2014 12:32 PM, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

There are plenty of badly-written applications which "auto-begin", that
is, they issue a "BEGIN;" immediately after every "COMMIT;" whether or
not there's any additional work to do. This is a major source of IIT
and the timeout should not ignore it.

Nonsense. We explicitly don't do anything useful until the first actual
command arrives, precisely to avoid that problem.

Oh, we don't allocate a snapshot? If not, then no objection here.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#20)
#22Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#21)
#23Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#13)
#24Marko Tiikkaja
marko@joh.to
In reply to: Josh Berkus (#23)
#25Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#13)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#23)
#27David G. Johnston
david.g.johnston@gmail.com
In reply to: Josh Berkus (#25)
#28Pavel Stehule
pavel.stehule@gmail.com
In reply to: Josh Berkus (#23)
#29Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Pavel Stehule (#28)
#30Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Abhijit Menon-Sen (#29)
#31Vik Fearing
vik@postgresfriends.org
In reply to: Kevin Grittner (#30)
#32Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Kevin Grittner (#30)
#33Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Vik Fearing (#31)
#34David G. Johnston
david.g.johnston@gmail.com
In reply to: Abhijit Menon-Sen (#33)
#35Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#19)
#36Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Josh Berkus (#35)
#37Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#19)
#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#37)
#39Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#19)
#40Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Josh Berkus (#39)
#41Andrew Dunstan
andrew@dunslane.net
In reply to: Josh Berkus (#39)
#42Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Andrew Dunstan (#41)
#43Vik Fearing
vik@postgresfriends.org
In reply to: Kevin Grittner (#42)
#44Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Vik Fearing (#43)
#45Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Abhijit Menon-Sen (#44)
#46Andres Freund
andres@anarazel.de
In reply to: Kevin Grittner (#42)
#47Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Andres Freund (#46)
#48Andres Freund
andres@anarazel.de
In reply to: Kevin Grittner (#47)
#49Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andres Freund (#48)
#50Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Andres Freund (#48)
#51David G. Johnston
david.g.johnston@gmail.com
In reply to: Kevin Grittner (#50)
#52Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: David G. Johnston (#51)
#53Andres Freund
andres@anarazel.de
In reply to: Kevin Grittner (#50)
#54Fujii Masao
masao.fujii@gmail.com
In reply to: Andres Freund (#53)
#55Vik Fearing
vik@postgresfriends.org
In reply to: Andres Freund (#48)
#56Andres Freund
andres@anarazel.de
In reply to: Fujii Masao (#54)
#57Andres Freund
andres@anarazel.de
In reply to: Vik Fearing (#55)
#58Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#37)
#59Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Andres Freund (#56)
#60David G. Johnston
david.g.johnston@gmail.com
In reply to: Kevin Grittner (#59)
#61David G. Johnston
david.g.johnston@gmail.com
In reply to: Kevin Grittner (#59)
#62Andres Freund
andres@anarazel.de
In reply to: Kevin Grittner (#59)
#63Vik Fearing
vik@postgresfriends.org
In reply to: Kevin Grittner (#45)
#64David G. Johnston
david.g.johnston@gmail.com
In reply to: Vik Fearing (#63)
#65Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: David G. Johnston (#64)
#66Vik Fearing
vik@postgresfriends.org
In reply to: David G. Johnston (#64)
#67Robert Haas
robertmhaas@gmail.com
In reply to: Vik Fearing (#63)
#68Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#67)
#69David G. Johnston
david.g.johnston@gmail.com
In reply to: Robert Haas (#67)
#70Vik Fearing
vik@postgresfriends.org
In reply to: Robert Haas (#67)
#71Robert Haas
robertmhaas@gmail.com
In reply to: Vik Fearing (#70)
#72David G. Johnston
david.g.johnston@gmail.com
In reply to: Robert Haas (#71)
#73Josh Berkus
josh@agliodbs.com
In reply to: Andrew Dunstan (#41)
#74Vik Fearing
vik@postgresfriends.org
In reply to: Josh Berkus (#73)
#75Pavel Stehule
pavel.stehule@gmail.com
In reply to: Josh Berkus (#73)
#76Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#35)
#77Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#76)
#78Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#73)
#79Vik Fearing
vik@postgresfriends.org
In reply to: Tom Lane (#78)
#80Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#78)
#81Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#80)
#82Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Tom Lane (#78)
#83Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#81)
#84Josh Berkus
josh@agliodbs.com
In reply to: Andrew Dunstan (#41)
#85Fujii Masao
masao.fujii@gmail.com
In reply to: Vik Fearing (#43)
#86Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fujii Masao (#85)
#87Fujii Masao
masao.fujii@gmail.com
In reply to: Tom Lane (#86)
#88Vik Fearing
vik@postgresfriends.org
In reply to: Fujii Masao (#87)
#89Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#86)
#90Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#89)
#91Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kevin Grittner (#90)
#92Robert Haas
robertmhaas@gmail.com
In reply to: Kevin Grittner (#90)
#93Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#92)
#94Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#93)
#95Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#94)
#96Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#91)
#97Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#95)
#98Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#96)
#99Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#98)
#100Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#99)
#101Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#100)
#102Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#78)