BUG #6571: Postgres Kills 'Select 1' query randomly on hot standby databases

Started by Alex Matzingerabout 14 years ago7 messagesbugs
Jump to latest
#1Alex Matzinger
amatzinger@experts-exchange.com

The following bug has been logged on the website:

Bug reference: 6571
Logged by: Alex Matzinger
Email address: amatzinger@experts-exchange.com
PostgreSQL version: 9.1.2
Operating system: Freebsd 7.4
Description:

On a hot standby database, while the primary is being updated, Postgres will
randomly kill a process which is performing a "Select 1" command.

The error is this:
2012-04-02 13:36:13.269
PDT,"smxuser","smxprd1",39523,"127.0.0.1:57893",4f79ffad.9a63,1,"",2012-04-02
12:36:13 PDT,3/32,0,FATAL,40001,"terminating connection due to conflict with
recovery","User query might have needed to see row versions that must be
removed.","In a moment you should be able to reconnect to the database and
repeat your command.",,,,,,,""

We have 5 hot standby's set up, which all preform this SELECT 1, and
postgres kills them across all standby's.

There should never be a situation that SELECT 1 is in conflict with data, as
it it never using any table in the database.

We will be updating to 9.1.3 very soon, but i have not seen any
documentation that this issue is corrected, or even known.

#2Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Alex Matzinger (#1)
Re: BUG #6571: Postgres Kills 'Select 1' query randomly on hot standby databases

On 03.04.2012 02:23, amatzinger@experts-exchange.com wrote:

On a hot standby database, while the primary is being updated, Postgres will
randomly kill a process which is performing a "Select 1" command.

The error is this:
2012-04-02 13:36:13.269
PDT,"smxuser","smxprd1",39523,"127.0.0.1:57893",4f79ffad.9a63,1,"",2012-04-02
12:36:13 PDT,3/32,0,FATAL,40001,"terminating connection due to conflict with
recovery","User query might have needed to see row versions that must be
removed.","In a moment you should be able to reconnect to the database and
repeat your command.",,,,,,,""

We have 5 hot standby's set up, which all preform this SELECT 1, and
postgres kills them across all standby's.

There should never be a situation that SELECT 1 is in conflict with data, as
it it never using any table in the database.

The system doesn't make a difference between queries like "SELECT 1"
that don't access any tables, and those that do. Even if "SELECT 1"
doesn't access any tables, a subsequent statement in the same
transaction might.

I'm assuming that those "SELECT 1"s were issued in transactions that had
been open for a long time, because you shouldn't get recovery conflicts
with very short transactions, in practice anyway.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#3Alex Matzinger
amatzinger@experts-exchange.com
In reply to: Heikki Linnakangas (#2)
Re: BUG #6571: Postgres Kills 'Select 1' query randomly on hot standby databases

The connection that is executing the SELECT 1 are generally open for 1-5
hours before they are killed. The specific connection only executes
SELECT 1. The transaction is simply BEGIN, and then SELECT 1's, no
other query is executed.

The updates we make to the primary database are quiet large, usually
several megabytes.

Alex

Show quoted text

On 4/2/2012 11:49 PM, Heikki Linnakangas wrote:

On 03.04.2012 02:23, amatzinger@experts-exchange.com wrote:

On a hot standby database, while the primary is being updated,
Postgres will
randomly kill a process which is performing a "Select 1" command.

The error is this:
2012-04-02 13:36:13.269
PDT,"smxuser","smxprd1",39523,"127.0.0.1:57893",4f79ffad.9a63,1,"",2012-04-02

12:36:13 PDT,3/32,0,FATAL,40001,"terminating connection due to
conflict with
recovery","User query might have needed to see row versions that must be
removed.","In a moment you should be able to reconnect to the
database and
repeat your command.",,,,,,,""

We have 5 hot standby's set up, which all preform this SELECT 1, and
postgres kills them across all standby's.

There should never be a situation that SELECT 1 is in conflict with
data, as
it it never using any table in the database.

The system doesn't make a difference between queries like "SELECT 1"
that don't access any tables, and those that do. Even if "SELECT 1"
doesn't access any tables, a subsequent statement in the same
transaction might.

I'm assuming that those "SELECT 1"s were issued in transactions that
had been open for a long time, because you shouldn't get recovery
conflicts with very short transactions, in practice anyway.

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alex Matzinger (#3)
Re: BUG #6571: Postgres Kills 'Select 1' query randomly on hot standby databases

Alex Matzinger <amatzinger@experts-exchange.com> writes:

The connection that is executing the SELECT 1 are generally open for 1-5
hours before they are killed. The specific connection only executes
SELECT 1. The transaction is simply BEGIN, and then SELECT 1's, no
other query is executed.

Lose the "BEGIN" and it will probably work more nicely.

regards, tom lane

#5Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Alex Matzinger (#3)
Re: BUG #6571: Postgres Kills 'Select 1' query randomly on hot standby databases

Alex Matzinger <amatzinger@experts-exchange.com> wrote:

The connection that is executing the SELECT 1 are generally open
for 1-5 hours before they are killed. The specific connection
only executes SELECT 1. The transaction is simply BEGIN, and then
SELECT 1's, no other query is executed.

Don't do that.

In particular, never put more into a single database transaction
than is required for integrity; and there is no apparent reason why
running a series of "SELECT 1" statements needs to be in a single
transaction. (It's not blindingly obvious why you would want to do
it in general, but I can imagine it possibly being useful for
monitoring purposes.)

-Kevin

#6John R Pierce
pierce@hogranch.com
In reply to: Tom Lane (#4)
Re: BUG #6571: Postgres Kills 'Select 1' query randomly on hot standby databases

On 04/03/12 9:40 AM, Tom Lane wrote:

Alex Matzinger<amatzinger@experts-exchange.com> writes:

The connection that is executing the SELECT 1 are generally open for 1-5
hours before they are killed. The specific connection only executes
SELECT 1. The transaction is simply BEGIN, and then SELECT 1's, no
other query is executed.

Lose the "BEGIN" and it will probably work more nicely.

indeed,a 1-5 hour long transaction means VACUUM can't clean up anything
newer than the oldest active transaction, and thats not per database,
thats cluster-wide.

--
john r pierce N 37, W 122
santa cruz ca mid-left coast

#7Alex Matzinger
amatzinger@experts-exchange.com
In reply to: John R Pierce (#6)
Re: BUG #6571: Postgres Kills 'Select 1' query randomly on hot standby databases

Dropping the BEGIN has cleared up the issue. Thank you.

Show quoted text

On 4/3/2012 9:50 AM, John R Pierce wrote:

On 04/03/12 9:40 AM, Tom Lane wrote:

Alex Matzinger<amatzinger@experts-exchange.com> writes:

The connection that is executing the SELECT 1 are generally open

for 1-5

hours before they are killed. The specific connection only executes
SELECT 1. The transaction is simply BEGIN, and then SELECT 1's, no
other query is executed.

Lose the "BEGIN" and it will probably work more nicely.

indeed,a 1-5 hour long transaction means VACUUM can't clean up
anything newer than the oldest active transaction, and thats not per
database, thats cluster-wide.