psql or pgbouncer bug?

Started by Jakub Ouhrabkaalmost 16 years ago22 messagesbugs
Jump to latest
#1Jakub Ouhrabka
kuba@comgate.cz

Hi,

can anyone tell me how this could happen, please?

database=# begin; update table set col = 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 153
database=# ROLLBACK ;
WARNING: there is no transaction in progress
ROLLBACK

The update was commited to database. This was psql 8.4 connectig to 8.2
server through pgbouncer 1.3.

It's not reproducible for me :-(

Any ideas?

Thanks,

Kuba

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jakub Ouhrabka (#1)
Re: psql or pgbouncer bug?

Jakub Ouhrabka <kuba@comgate.cz> writes:

can anyone tell me how this could happen, please?

database=# begin; update table set col = 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 153
database=# ROLLBACK ;
WARNING: there is no transaction in progress
ROLLBACK

What's in the postmaster log?

regards, tom lane

#3Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Jakub Ouhrabka (#1)
Re: psql or pgbouncer bug?

On 05/21/2010 11:19 AM, Jakub Ouhrabka wrote:

Hi,

can anyone tell me how this could happen, please?

database=# begin; update table set col = 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 153
database=# ROLLBACK ;
WARNING: there is no transaction in progress
ROLLBACK

The update was commited to database. This was psql 8.4 connectig to 8.2
server through pgbouncer 1.3.

It's not reproducible for me :-(

Any ideas?

1. you connect to pgbouncer using psql
2: you execute the query and something (firewall whatever) drops the
connection between psql and pgbouncer while the one between pgbouncer
and the backend stays alive
3. psql notices the lost connection and reconnects and you end up on
another backend session (or the same one that was just RESET ALL; by
pgbouncer after the UPDATE completed)
4. the ROLLBACK; does nothing because the pooled connection you are now
connected is either a different one or got reset after the connection
dropped.

Stefan

#4Jakub Ouhrabka
kuba@comgate.cz
In reply to: Tom Lane (#2)
Re: psql or pgbouncer bug?

What's in the postmaster log?

Excerpts from both postmater and pgbouncer logs see below. I can't find
nothing more interesting, no errors.

There is difference between server clocks, pgbouncer is 40s behind
postgresql server, will fix...

Kuba

Postmaster:

2010-05-21 15:58:58 CEST 20436 1 [unknown]@ [unknown] LOG: connection
received: host=x.x.x.x port=59694

2010-05-21 15:58:58 CEST 20436 2 user@x.x.x.x(59694) database LOG:
connection authorized: user=user database=database

2010-05-21 16:00:43 CEST 20436 3 user@x.x.x.x(59694) database LOG:
unexpected EOF on client connection

2010-05-21 16:00:44 CEST 20436 4 user@x.x.x.x(59694) database LOG:
disconnection: session time: 0:01:46.642 user=user database=database
host=x.x.x.x port=59694

2010-05-21 16:00:56 CEST 20643 1 [unknown]@ [unknown] LOG: connection
received: host=x.x.x.x port=59901

2010-05-21 16:00:56 CEST 20643 2 user@x.x.x.x(59901) database LOG:
connection authorized: user=user database=database

2010-05-21 16:01:07 CEST 20643 3 user@x.x.x.x(59901) database WARNING:
there is no transaction in progress

Pgbouncer:

2010-05-21 15:58:16.905 30404 LOG C-0x2ab248086300:
database/user@x.x.x.x:35275 login successful: db=database user=user

2010-05-21 15:58:18.698 30404 LOG S-0x2ab2481306e0:
database/user@x.x.x.x:5432 new connection to server

[restart]
2010-05-21 16:00:04.430 30404 LOG Got SIGTERM, fast exit
2010-05-21 16:00:08.104 32225 LOG File descriptor limit: 65535
(H:65535), max_client_conn: 6000, max fds possible: 60642010
2010-05-21 16:00:08.105 32227 LOG listening on *:5432
2010-05-21 16:00:08.105 32227 LOG listening on unix:/tmp/.s.PGSQL.5432

2010-05-21 16:00:17.286 32227 LOG C-0xa65bbf8:
database/user@x.x.x.x:51814 login successful: db=database user=user

2010-05-21 16:00:17.286 32227 LOG S-0xa6799e0:
database/user@x.x.x.x:5432 new connection to server

Dne 21.5.2010 17:28, Tom Lane napsal(a):

Show quoted text

Jakub Ouhrabka<kuba@comgate.cz> writes:

can anyone tell me how this could happen, please?

database=# begin; update table set col = 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 153
database=# ROLLBACK ;
WARNING: there is no transaction in progress
ROLLBACK

What's in the postmaster log?

regards, tom lane

#5Jakub Ouhrabka
kuba@comgate.cz
In reply to: Stefan Kaltenbrunner (#3)
Re: psql or pgbouncer bug?

Hi Stefan,

thanks - but I don't understand how could the BEGIN; UPDATE xxx; be
committed to database without explicit COMMIT and how could psql report
"UPDATE 153" after message "The connection was reset". This puzzles me...

Kuba

Dne 21.5.2010 17:33, Stefan Kaltenbrunner napsal(a):

Show quoted text

On 05/21/2010 11:19 AM, Jakub Ouhrabka wrote:

Hi,

can anyone tell me how this could happen, please?

database=# begin; update table set col = 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 153
database=# ROLLBACK ;
WARNING: there is no transaction in progress
ROLLBACK

The update was commited to database. This was psql 8.4 connectig to 8.2
server through pgbouncer 1.3.

It's not reproducible for me :-(

Any ideas?

1. you connect to pgbouncer using psql
2: you execute the query and something (firewall whatever) drops the
connection between psql and pgbouncer while the one between pgbouncer
and the backend stays alive
3. psql notices the lost connection and reconnects and you end up on
another backend session (or the same one that was just RESET ALL; by
pgbouncer after the UPDATE completed)
4. the ROLLBACK; does nothing because the pooled connection you are now
connected is either a different one or got reset after the connection
dropped.

Stefan

#6Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Jakub Ouhrabka (#5)
Re: psql or pgbouncer bug?

On 05/21/2010 12:03 PM, Jakub Ouhrabka wrote:

Hi Stefan,

thanks - but I don't understand how could the BEGIN; UPDATE xxx; be
committed to database without explicit COMMIT and how could psql report
"UPDATE 153" after message "The connection was reset". This puzzles me...

hmm yeah that is indeed a tad weird - are you actually using as
pool_mode and server_reset_query?

Stefan

#7Jakub Ouhrabka
kuba@comgate.cz
In reply to: Stefan Kaltenbrunner (#6)
Re: psql or pgbouncer bug?

hmm yeah that is indeed a tad weird - are you actually using as
pool_mode and server_reset_query?

pool_mode = session

server_reset_query = RESET ALL; SET SESSION AUTHORIZATION DEFAULT;
UNLISTEN *;

Kuba

#8Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Jakub Ouhrabka (#7)
Re: psql or pgbouncer bug?

On 05/21/2010 12:13 PM, Jakub Ouhrabka wrote:

hmm yeah that is indeed a tad weird - are you actually using as
pool_mode and server_reset_query?

pool_mode = session

server_reset_query = RESET ALL; SET SESSION AUTHORIZATION DEFAULT;
UNLISTEN *;

hmm - and you are really sure that the update got commited in the
end(even if you got the "UPDATE 153" it should have been rollbacked as
soon as the connection got dropped)?
If yes I'm puzzled as well on what happened here.

Stefan

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jakub Ouhrabka (#4)
Re: psql or pgbouncer bug?

Jakub Ouhrabka <kuba@comgate.cz> writes:

What's in the postmaster log?

Excerpts from both postmater and pgbouncer logs see below. I can't find
nothing more interesting, no errors.

Looks like the disconnect was because pgbouncer restarted. If that
wasn't supposed to happen then you should take it up with the pgbouncer
folk.

regards, tom lane

#10Jakub Ouhrabka
jakub.ouhrabka@comgate.cz
In reply to: Tom Lane (#9)
Re: psql or pgbouncer bug?

Tom:

Looks like the disconnect was because pgbouncer restarted. If that
wasn't supposed to happen then you should take it up with the
pgbouncer folk.

The restart of pgbouncer was intentional, although made by someone else,
so the disconnect is ok. What's not ok is the "UPDATE 153" message after
message with connection lost and the fact that the UPDATE was committed
to database without explicit COMMIT. Maybe pgbouncer issued the commit?

Stefan:

hmm - and you are really sure that the update got commited in the
end(even if you got the "UPDATE 153" it should have been rollbacked as
soon as the connection got dropped)?

Quite sure. I've seen it on my colleague's screen (afterwards) and saved
it to text file. Before the "BEGIN; UPDATE" there was "SELECT * FROM
table" which showed the state of table and just after the unsuccessful
"ROLLBACK" the "SELECT * from table" was issued again and showed updated
table. No one else worked with the table and the change is definitely
committed. The .psql_history contains exactly this as well.

Kuba

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jakub Ouhrabka (#10)
Re: psql or pgbouncer bug?

Jakub Ouhrabka <jakub.ouhrabka@comgate.cz> writes:

Tom:

Looks like the disconnect was because pgbouncer restarted. If that
wasn't supposed to happen then you should take it up with the
pgbouncer folk.

The restart of pgbouncer was intentional, although made by someone else,
so the disconnect is ok. What's not ok is the "UPDATE 153" message after
message with connection lost and the fact that the UPDATE was committed
to database without explicit COMMIT. Maybe pgbouncer issued the commit?

The message ordering doesn't surprise me a huge amount, but the fact
that the update got committed is definitely surprising. I think
pgbouncer has to have done something strange there. We need to pull
those folk into the discussion.

regards, tom lane

#12Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Tom Lane (#11)
Re: psql or pgbouncer bug?

On 05/21/2010 01:32 PM, Tom Lane wrote:

Jakub Ouhrabka <jakub.ouhrabka@comgate.cz> writes:

Tom:

Looks like the disconnect was because pgbouncer restarted. If that
wasn't supposed to happen then you should take it up with the
pgbouncer folk.

The restart of pgbouncer was intentional, although made by someone else,
so the disconnect is ok. What's not ok is the "UPDATE 153" message after
message with connection lost and the fact that the UPDATE was committed
to database without explicit COMMIT. Maybe pgbouncer issued the commit?

The message ordering doesn't surprise me a huge amount, but the fact
that the update got committed is definitely surprising. I think
pgbouncer has to have done something strange there. We need to pull
those folk into the discussion.

yeah - I don't think pgbouncer would cause that behaviour on its own
given the provided information so I would kinda suspect that the update
was in fact never commited though that is not what the OP saw...

Stefan

#13Tom Molesworth
tom@audioboundary.com
In reply to: Jakub Ouhrabka (#1)
Re: psql or pgbouncer bug?

Hi Jakub,

On 21/05/10 16:19, Jakub Ouhrabka wrote:

can anyone tell me how this could happen, please?

database=# begin; update table set col = 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 153
database=# ROLLBACK ;
WARNING: there is no transaction in progress
ROLLBACK

I don't know anything about psql internals, but at a guess the sequence
is this:
* 'begin' is sent to server
* Connection is dropped
* Connection is reset, but 'begin' is not resent
* Next statement (the update) is sent to the server, executes immediately
* Rollback gives error since there was no corresponding begin

Seems to be trivially easy to reproduce by connecting via psql, then
killing that connection before issuing the 'begin; update' sequence
(against postgres directly, no pgbouncer needed). If anything, it's an
issue with psql settings? Maybe it should stop on connection drop rather
than attempting reconnect and continuing with further statements.

Tom

#14Joshua Tolley
eggyknap@gmail.com
In reply to: Tom Molesworth (#13)
Re: psql or pgbouncer bug?

2010/5/22 Tom Molesworth <tom@audioboundary.com>:

Hi Jakub,

On 21/05/10 16:19, Jakub Ouhrabka wrote:

can anyone tell me how this could happen, please?

database=# begin; update table set col = 100;
server closed the connection unexpectedly
      This probably means the server terminated abnormally
      before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 153
database=# ROLLBACK ;
WARNING:  there is no transaction in progress
ROLLBACK

<snip>

Seems to be trivially easy to reproduce by connecting via psql, then killing
that connection before issuing the 'begin; update' sequence (against
postgres directly, no pgbouncer needed). If anything, it's an issue with
psql settings? Maybe it should stop on connection drop rather than
attempting reconnect and continuing with further statements.

Does PostgreSQL log anything more detailed? What if you update some
other table similarly? It looks like some sort of data corruption on
the table you're updating.

--
Joshua Tolley / eggyknap
End Point Corporation

#15Tom Molesworth
tom@audioboundary.com
In reply to: Joshua Tolley (#14)
Re: psql or pgbouncer bug?

Hi Joshua,

On 23/05/10 00:45, Joshua Tolley wrote:

2010/5/22 Tom Molesworth<tom@audioboundary.com>:

Seems to be trivially easy to reproduce by connecting via psql, then killing
that connection before issuing the 'begin; update' sequence (against
postgres directly, no pgbouncer needed). If anything, it's an issue with
psql settings? Maybe it should stop on connection drop rather than
attempting reconnect and continuing with further statements.

Does PostgreSQL log anything more detailed? What if you update some
other table similarly? It looks like some sort of data corruption on
the table you're updating.

Surely this is normal, expected behaviour - exactly the same as you'd
get from the mysql commandline client, for example? Perhaps my
explanation wasn't clear - here's an example session:

$ psql
psql (8.4.3, server 8.4.2)
Type "help" for help.

tom=# \set
AUTOCOMMIT = 'on'
PROMPT1 = '%/%R%# '
PROMPT2 = '%/%R%# '
PROMPT3 = '>> '
VERBOSITY = 'default'
VERSION = 'PostgreSQL 8.4.3 on i486-pc-linux-gnu, compiled by GCC
gcc-4.4.real (Debian 4.4.3-3) 4.4.3, 32-bit'
DBNAME = 'tom'
USER = 'tom'
HOST = '/var/run/postgresql'
PORT = '5432'
ENCODING = 'UTF8'
tom=# create table test1 (id int, name varchar);
CREATE TABLE
tom=# insert into test1 values (1,'test');
INSERT 0 1
tom=# select name from test1;
name
------
test
(1 row)

At this point, identify the psql session via 'select * from
pg_stat_activity' or other means (in a different psql session), and
terminate the connection (kill PID). Now issue the following commands:

tom=# begin; update test1 set name = 'updated';
FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
UPDATE 1
tom=# rollback;
NOTICE: there is no transaction in progress
ROLLBACK
tom=# select name from test1;
name
---------
updated
(1 row)

Since the default autocommit setting is enabled (at least under Ubuntu +
Debian psql 8.4 variants that I've tried), then on reconnect autocommit
is also enabled, and there is no 'begin', that never made it to the
server - the next statement is issued and committed immediately.

A simple way to avoid this in psql is to use '\set autocommit false'
instead of using 'begin' under autocommit - at least, that's always the
way I've done transactions there, issuing a separate 'begin' that could
get lost on connection drop just seems too risky to be worth considering.

Note that I'm not the original submitter - so I could be missing the
point entirely here!

Tom

#16Joshua Tolley
eggyknap@gmail.com
In reply to: Tom Molesworth (#15)
Re: psql or pgbouncer bug?

On Sat, May 22, 2010 at 5:56 PM, Tom Molesworth <tom@audioboundary.com> wrote:

Surely this is normal, expected behaviour - exactly the same as you'd get
from the mysql commandline client, for example?

<snip>

Note that I'm not the original submitter - so I could be missing the point
entirely here!

If you were missing it, you weren't the only one :) I misplaced some
of the rest of this thread, and realize now that the question I was
asking was already answered.

- Josh

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Molesworth (#13)
Re: psql or pgbouncer bug?

Tom Molesworth <tom@audioboundary.com> writes:

I don't know anything about psql internals, but at a guess the sequence
is this:
* 'begin' is sent to server
* Connection is dropped
* Connection is reset, but 'begin' is not resent
* Next statement (the update) is sent to the server, executes immediately
* Rollback gives error since there was no corresponding begin

Seems to be trivially easy to reproduce by connecting via psql, then
killing that connection before issuing the 'begin; update' sequence
(against postgres directly, no pgbouncer needed).

Yeah, confirmed here. A simple example is:

regression=# select 2+2; select 4+4;
?column?
----------
4
(1 row)

?column?
----------
8
(1 row)

If I now kill -9 the connected backend and repeat, I get this instead:

regression=# select 2+2; select 4+4;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
?column?
----------
8
(1 row)

So that explains Jakub's observed behavior without having to make any
strenuous assumptions about the connection being dropped at just the
right instant --- any time while he was typing the line would do it.

If anything, it's an
issue with psql settings? Maybe it should stop on connection drop rather
than attempting reconnect and continuing with further statements.

The auto-reconnect behavior is long-established and desirable. What's
not desirable is continuing with any statements remaining on the same
line, I think. We need to flush the input buffer on reconnect.

regards, tom lane

#18Jakub Ouhrabka
kuba@comgate.cz
In reply to: Tom Lane (#17)
Re: psql or pgbouncer bug?

The auto-reconnect behavior is long-established and desirable. What's
not desirable is continuing with any statements remaining on the same
line, I think. We need to flush the input buffer on reconnect.

So if I understand it correctly, if I need correct transaction behaviour
in psql even in case of disconnection the only safe way is to use one
statement per line.

Is this correct?

Thanks,

Kuba

#19Greg Sabino Mullane
greg@turnstep.com
In reply to: Jakub Ouhrabka (#18)
Re: psql or pgbouncer bug?

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

The auto-reconnect behavior is long-established and desirable. What's
not desirable is continuing with any statements remaining on the same
line, I think. We need to flush the input buffer on reconnect.

So if I understand it correctly, if I need correct transaction behaviour
in psql even in case of disconnection the only safe way is to use one
statement per line.

Is this correct?

Yes, that is correct. Pretty big gotcha.

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005240925
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkv6fncACgkQvJuQZxSWSsipbQCg3Cn6Hh4Uk9i2TwaKNgzB1Xef
apIAoLiNoJT4pjtA4xaZXL11XdgUYwph
=MF9l
-----END PGP SIGNATURE-----

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jakub Ouhrabka (#18)
Re: psql or pgbouncer bug?

Jakub Ouhrabka <kuba@comgate.cz> writes:

So if I understand it correctly, if I need correct transaction behaviour
in psql even in case of disconnection the only safe way is to use one
statement per line.

In existing releases, yes. Having put the BEGIN and UPDATE all on one
line was a necessary precondition for this problem.

regards, tom lane

#21Tom Molesworth
tom@audioboundary.com
In reply to: Jakub Ouhrabka (#18)
#22Bruce Momjian
bruce@momjian.us
In reply to: Tom Molesworth (#21)