psql or pgbouncer bug?
Hi,
can anyone tell me how this could happen, please?
database=# begin; update table set col = 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 153
database=# ROLLBACK ;
WARNING: there is no transaction in progress
ROLLBACK
The update was commited to database. This was psql 8.4 connectig to 8.2
server through pgbouncer 1.3.
It's not reproducible for me :-(
Any ideas?
Thanks,
Kuba
Jakub Ouhrabka <kuba@comgate.cz> writes:
can anyone tell me how this could happen, please?
database=# begin; update table set col = 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 153
database=# ROLLBACK ;
WARNING: there is no transaction in progress
ROLLBACK
What's in the postmaster log?
regards, tom lane
On 05/21/2010 11:19 AM, Jakub Ouhrabka wrote:
Hi,
can anyone tell me how this could happen, please?
database=# begin; update table set col = 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 153
database=# ROLLBACK ;
WARNING: there is no transaction in progress
ROLLBACKThe update was commited to database. This was psql 8.4 connectig to 8.2
server through pgbouncer 1.3.It's not reproducible for me :-(
Any ideas?
1. you connect to pgbouncer using psql
2: you execute the query and something (firewall whatever) drops the
connection between psql and pgbouncer while the one between pgbouncer
and the backend stays alive
3. psql notices the lost connection and reconnects and you end up on
another backend session (or the same one that was just RESET ALL; by
pgbouncer after the UPDATE completed)
4. the ROLLBACK; does nothing because the pooled connection you are now
connected is either a different one or got reset after the connection
dropped.
Stefan
What's in the postmaster log?
Excerpts from both postmater and pgbouncer logs see below. I can't find
nothing more interesting, no errors.
There is difference between server clocks, pgbouncer is 40s behind
postgresql server, will fix...
Kuba
Postmaster:
2010-05-21 15:58:58 CEST 20436 1 [unknown]@ [unknown] LOG: connection
received: host=x.x.x.x port=59694
2010-05-21 15:58:58 CEST 20436 2 user@x.x.x.x(59694) database LOG:
connection authorized: user=user database=database
2010-05-21 16:00:43 CEST 20436 3 user@x.x.x.x(59694) database LOG:
unexpected EOF on client connection
2010-05-21 16:00:44 CEST 20436 4 user@x.x.x.x(59694) database LOG:
disconnection: session time: 0:01:46.642 user=user database=database
host=x.x.x.x port=59694
2010-05-21 16:00:56 CEST 20643 1 [unknown]@ [unknown] LOG: connection
received: host=x.x.x.x port=59901
2010-05-21 16:00:56 CEST 20643 2 user@x.x.x.x(59901) database LOG:
connection authorized: user=user database=database
2010-05-21 16:01:07 CEST 20643 3 user@x.x.x.x(59901) database WARNING:
there is no transaction in progress
Pgbouncer:
2010-05-21 15:58:16.905 30404 LOG C-0x2ab248086300:
database/user@x.x.x.x:35275 login successful: db=database user=user
2010-05-21 15:58:18.698 30404 LOG S-0x2ab2481306e0:
database/user@x.x.x.x:5432 new connection to server
[restart]
2010-05-21 16:00:04.430 30404 LOG Got SIGTERM, fast exit
2010-05-21 16:00:08.104 32225 LOG File descriptor limit: 65535
(H:65535), max_client_conn: 6000, max fds possible: 60642010
2010-05-21 16:00:08.105 32227 LOG listening on *:5432
2010-05-21 16:00:08.105 32227 LOG listening on unix:/tmp/.s.PGSQL.5432
2010-05-21 16:00:17.286 32227 LOG C-0xa65bbf8:
database/user@x.x.x.x:51814 login successful: db=database user=user
2010-05-21 16:00:17.286 32227 LOG S-0xa6799e0:
database/user@x.x.x.x:5432 new connection to server
Dne 21.5.2010 17:28, Tom Lane napsal(a):
Show quoted text
Jakub Ouhrabka<kuba@comgate.cz> writes:
can anyone tell me how this could happen, please?
database=# begin; update table set col = 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 153
database=# ROLLBACK ;
WARNING: there is no transaction in progress
ROLLBACKWhat's in the postmaster log?
regards, tom lane
Hi Stefan,
thanks - but I don't understand how could the BEGIN; UPDATE xxx; be
committed to database without explicit COMMIT and how could psql report
"UPDATE 153" after message "The connection was reset". This puzzles me...
Kuba
Dne 21.5.2010 17:33, Stefan Kaltenbrunner napsal(a):
Show quoted text
On 05/21/2010 11:19 AM, Jakub Ouhrabka wrote:
Hi,
can anyone tell me how this could happen, please?
database=# begin; update table set col = 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 153
database=# ROLLBACK ;
WARNING: there is no transaction in progress
ROLLBACKThe update was commited to database. This was psql 8.4 connectig to 8.2
server through pgbouncer 1.3.It's not reproducible for me :-(
Any ideas?
1. you connect to pgbouncer using psql
2: you execute the query and something (firewall whatever) drops the
connection between psql and pgbouncer while the one between pgbouncer
and the backend stays alive
3. psql notices the lost connection and reconnects and you end up on
another backend session (or the same one that was just RESET ALL; by
pgbouncer after the UPDATE completed)
4. the ROLLBACK; does nothing because the pooled connection you are now
connected is either a different one or got reset after the connection
dropped.Stefan
On 05/21/2010 12:03 PM, Jakub Ouhrabka wrote:
Hi Stefan,
thanks - but I don't understand how could the BEGIN; UPDATE xxx; be
committed to database without explicit COMMIT and how could psql report
"UPDATE 153" after message "The connection was reset". This puzzles me...
hmm yeah that is indeed a tad weird - are you actually using as
pool_mode and server_reset_query?
Stefan
hmm yeah that is indeed a tad weird - are you actually using as
pool_mode and server_reset_query?
pool_mode = session
server_reset_query = RESET ALL; SET SESSION AUTHORIZATION DEFAULT;
UNLISTEN *;
Kuba
On 05/21/2010 12:13 PM, Jakub Ouhrabka wrote:
hmm yeah that is indeed a tad weird - are you actually using as
pool_mode and server_reset_query?pool_mode = session
server_reset_query = RESET ALL; SET SESSION AUTHORIZATION DEFAULT;
UNLISTEN *;
hmm - and you are really sure that the update got commited in the
end(even if you got the "UPDATE 153" it should have been rollbacked as
soon as the connection got dropped)?
If yes I'm puzzled as well on what happened here.
Stefan
Jakub Ouhrabka <kuba@comgate.cz> writes:
What's in the postmaster log?
Excerpts from both postmater and pgbouncer logs see below. I can't find
nothing more interesting, no errors.
Looks like the disconnect was because pgbouncer restarted. If that
wasn't supposed to happen then you should take it up with the pgbouncer
folk.
regards, tom lane
Tom:
Looks like the disconnect was because pgbouncer restarted. If that
wasn't supposed to happen then you should take it up with the
pgbouncer folk.
The restart of pgbouncer was intentional, although made by someone else,
so the disconnect is ok. What's not ok is the "UPDATE 153" message after
message with connection lost and the fact that the UPDATE was committed
to database without explicit COMMIT. Maybe pgbouncer issued the commit?
Stefan:
hmm - and you are really sure that the update got commited in the
end(even if you got the "UPDATE 153" it should have been rollbacked as
soon as the connection got dropped)?
Quite sure. I've seen it on my colleague's screen (afterwards) and saved
it to text file. Before the "BEGIN; UPDATE" there was "SELECT * FROM
table" which showed the state of table and just after the unsuccessful
"ROLLBACK" the "SELECT * from table" was issued again and showed updated
table. No one else worked with the table and the change is definitely
committed. The .psql_history contains exactly this as well.
Kuba
Jakub Ouhrabka <jakub.ouhrabka@comgate.cz> writes:
Tom:
Looks like the disconnect was because pgbouncer restarted. If that
wasn't supposed to happen then you should take it up with the
pgbouncer folk.
The restart of pgbouncer was intentional, although made by someone else,
so the disconnect is ok. What's not ok is the "UPDATE 153" message after
message with connection lost and the fact that the UPDATE was committed
to database without explicit COMMIT. Maybe pgbouncer issued the commit?
The message ordering doesn't surprise me a huge amount, but the fact
that the update got committed is definitely surprising. I think
pgbouncer has to have done something strange there. We need to pull
those folk into the discussion.
regards, tom lane
On 05/21/2010 01:32 PM, Tom Lane wrote:
Jakub Ouhrabka <jakub.ouhrabka@comgate.cz> writes:
Tom:
Looks like the disconnect was because pgbouncer restarted. If that
wasn't supposed to happen then you should take it up with the
pgbouncer folk.The restart of pgbouncer was intentional, although made by someone else,
so the disconnect is ok. What's not ok is the "UPDATE 153" message after
message with connection lost and the fact that the UPDATE was committed
to database without explicit COMMIT. Maybe pgbouncer issued the commit?The message ordering doesn't surprise me a huge amount, but the fact
that the update got committed is definitely surprising. I think
pgbouncer has to have done something strange there. We need to pull
those folk into the discussion.
yeah - I don't think pgbouncer would cause that behaviour on its own
given the provided information so I would kinda suspect that the update
was in fact never commited though that is not what the OP saw...
Stefan
Hi Jakub,
On 21/05/10 16:19, Jakub Ouhrabka wrote:
can anyone tell me how this could happen, please?
database=# begin; update table set col = 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 153
database=# ROLLBACK ;
WARNING: there is no transaction in progress
ROLLBACK
I don't know anything about psql internals, but at a guess the sequence
is this:
* 'begin' is sent to server
* Connection is dropped
* Connection is reset, but 'begin' is not resent
* Next statement (the update) is sent to the server, executes immediately
* Rollback gives error since there was no corresponding begin
Seems to be trivially easy to reproduce by connecting via psql, then
killing that connection before issuing the 'begin; update' sequence
(against postgres directly, no pgbouncer needed). If anything, it's an
issue with psql settings? Maybe it should stop on connection drop rather
than attempting reconnect and continuing with further statements.
Tom
2010/5/22 Tom Molesworth <tom@audioboundary.com>:
Hi Jakub,
On 21/05/10 16:19, Jakub Ouhrabka wrote:
can anyone tell me how this could happen, please?
database=# begin; update table set col = 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 153
database=# ROLLBACK ;
WARNING: there is no transaction in progress
ROLLBACK
<snip>
Seems to be trivially easy to reproduce by connecting via psql, then killing
that connection before issuing the 'begin; update' sequence (against
postgres directly, no pgbouncer needed). If anything, it's an issue with
psql settings? Maybe it should stop on connection drop rather than
attempting reconnect and continuing with further statements.
Does PostgreSQL log anything more detailed? What if you update some
other table similarly? It looks like some sort of data corruption on
the table you're updating.
--
Joshua Tolley / eggyknap
End Point Corporation
Hi Joshua,
On 23/05/10 00:45, Joshua Tolley wrote:
2010/5/22 Tom Molesworth<tom@audioboundary.com>:
Seems to be trivially easy to reproduce by connecting via psql, then killing
that connection before issuing the 'begin; update' sequence (against
postgres directly, no pgbouncer needed). If anything, it's an issue with
psql settings? Maybe it should stop on connection drop rather than
attempting reconnect and continuing with further statements.Does PostgreSQL log anything more detailed? What if you update some
other table similarly? It looks like some sort of data corruption on
the table you're updating.
Surely this is normal, expected behaviour - exactly the same as you'd
get from the mysql commandline client, for example? Perhaps my
explanation wasn't clear - here's an example session:
$ psql
psql (8.4.3, server 8.4.2)
Type "help" for help.
tom=# \set
AUTOCOMMIT = 'on'
PROMPT1 = '%/%R%# '
PROMPT2 = '%/%R%# '
PROMPT3 = '>> '
VERBOSITY = 'default'
VERSION = 'PostgreSQL 8.4.3 on i486-pc-linux-gnu, compiled by GCC
gcc-4.4.real (Debian 4.4.3-3) 4.4.3, 32-bit'
DBNAME = 'tom'
USER = 'tom'
HOST = '/var/run/postgresql'
PORT = '5432'
ENCODING = 'UTF8'
tom=# create table test1 (id int, name varchar);
CREATE TABLE
tom=# insert into test1 values (1,'test');
INSERT 0 1
tom=# select name from test1;
name
------
test
(1 row)
At this point, identify the psql session via 'select * from
pg_stat_activity' or other means (in a different psql session), and
terminate the connection (kill PID). Now issue the following commands:
tom=# begin; update test1 set name = 'updated';
FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 1
tom=# rollback;
NOTICE: there is no transaction in progress
ROLLBACK
tom=# select name from test1;
name
---------
updated
(1 row)
Since the default autocommit setting is enabled (at least under Ubuntu +
Debian psql 8.4 variants that I've tried), then on reconnect autocommit
is also enabled, and there is no 'begin', that never made it to the
server - the next statement is issued and committed immediately.
A simple way to avoid this in psql is to use '\set autocommit false'
instead of using 'begin' under autocommit - at least, that's always the
way I've done transactions there, issuing a separate 'begin' that could
get lost on connection drop just seems too risky to be worth considering.
Note that I'm not the original submitter - so I could be missing the
point entirely here!
Tom
On Sat, May 22, 2010 at 5:56 PM, Tom Molesworth <tom@audioboundary.com> wrote:
Surely this is normal, expected behaviour - exactly the same as you'd get
from the mysql commandline client, for example?
<snip>
Note that I'm not the original submitter - so I could be missing the point
entirely here!
If you were missing it, you weren't the only one :) I misplaced some
of the rest of this thread, and realize now that the question I was
asking was already answered.
- Josh
Tom Molesworth <tom@audioboundary.com> writes:
I don't know anything about psql internals, but at a guess the sequence
is this:
* 'begin' is sent to server
* Connection is dropped
* Connection is reset, but 'begin' is not resent
* Next statement (the update) is sent to the server, executes immediately
* Rollback gives error since there was no corresponding begin
Seems to be trivially easy to reproduce by connecting via psql, then
killing that connection before issuing the 'begin; update' sequence
(against postgres directly, no pgbouncer needed).
Yeah, confirmed here. A simple example is:
regression=# select 2+2; select 4+4;
?column?
----------
4
(1 row)
?column?
----------
8
(1 row)
If I now kill -9 the connected backend and repeat, I get this instead:
regression=# select 2+2; select 4+4;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
?column?
----------
8
(1 row)
So that explains Jakub's observed behavior without having to make any
strenuous assumptions about the connection being dropped at just the
right instant --- any time while he was typing the line would do it.
If anything, it's an
issue with psql settings? Maybe it should stop on connection drop rather
than attempting reconnect and continuing with further statements.
The auto-reconnect behavior is long-established and desirable. What's
not desirable is continuing with any statements remaining on the same
line, I think. We need to flush the input buffer on reconnect.
regards, tom lane
The auto-reconnect behavior is long-established and desirable. What's
not desirable is continuing with any statements remaining on the same
line, I think. We need to flush the input buffer on reconnect.
So if I understand it correctly, if I need correct transaction behaviour
in psql even in case of disconnection the only safe way is to use one
statement per line.
Is this correct?
Thanks,
Kuba
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160
The auto-reconnect behavior is long-established and desirable. What's
not desirable is continuing with any statements remaining on the same
line, I think. We need to flush the input buffer on reconnect.
So if I understand it correctly, if I need correct transaction behaviour
in psql even in case of disconnection the only safe way is to use one
statement per line.Is this correct?
Yes, that is correct. Pretty big gotcha.
- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005240925
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----
iEYEAREDAAYFAkv6fncACgkQvJuQZxSWSsipbQCg3Cn6Hh4Uk9i2TwaKNgzB1Xef
apIAoLiNoJT4pjtA4xaZXL11XdgUYwph
=MF9l
-----END PGP SIGNATURE-----
Jakub Ouhrabka <kuba@comgate.cz> writes:
So if I understand it correctly, if I need correct transaction behaviour
in psql even in case of disconnection the only safe way is to use one
statement per line.
In existing releases, yes. Having put the BEGIN and UPDATE all on one
line was a necessary precondition for this problem.
regards, tom lane