BUG #8404: JDBC block hot standby replication?

Started by Pius Chanover 12 years ago4 messagesbugs
Jump to latest
#1Pius Chan
pchan@contigo.com

The following bug has been logged on the website:

Bug reference: 8404
Logged by: Pius Chan
Email address: pchan@contigo.com
PostgreSQL version: 9.1.9
Operating system: Linux version 2.6.32-279.14.1.el6.x86_64
Description:

When investigating an "out of memory" issue in hot standby, it is found that
JDBC can block the application of replicated transaction log to hot standby.
The set up of the test is as follows:

Set up:
=======
Set up a master db cluster and hot standby db cluster using async streaming
replication. On master database server, create a database TEST. Implement a
cron job which on every minute:
- drop table if exists heartbeat;
- create table heartbeat (timenow timestamp);
- insert into heartbeat values (now());
- this is to ensure transaction entry is generated at least every minute.

Test 1 - psql does not block replication
========================================
- on hot standy db server, start a psql session (session 1), then submit the
following:
psql> begin;
psql> select now();
..... idle

- start another psql session on hot standby db server and run the following
SQLs to monitor the replication delay:

select current_timestamp, pg_last_xact_replay_timestamp(),
current_timestamp-pg_last_xact_replay_timestamp() as replication_delay,
pg_last_xlog_receive_location();

Result: it is found that session 1 does
not block replication.

Test 2 - JDBC can block replication
====================================
- write a Java test program to connect to hot standby TEST database
- in the test program, submit the following SQLs:

begin;
select now();
then sleep for 2 minutes

- on hot standby db server, start another psql session and monitor the
replication delay with the same SQL.

Results: it is found that during the 2 minutes sleeping time,
pg_last_xlog_receive_location is advancing but pg_last_xact_replay_timestamp
is stuck. The result of "select * from heartbeat" does not change.

Is that a bug in JDBC driver?

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

In reply to: Pius Chan (#1)
Re: BUG #8404: JDBC block hot standby replication?

Hello

This is a well documented feature of Hot-Standby Replication.

see:
http://www.postgresql.org/docs/9.2/static/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-STANDBY

Regards,

-- Valentin

ვალენტინ გოგიჩაშვილი
Valentine Gogichashvili

On Wed, Aug 28, 2013 at 2:30 AM, <pchan@contigo.com> wrote:

Show quoted text

The following bug has been logged on the website:

Bug reference: 8404
Logged by: Pius Chan
Email address: pchan@contigo.com
PostgreSQL version: 9.1.9
Operating system: Linux version 2.6.32-279.14.1.el6.x86_64
Description:

When investigating an "out of memory" issue in hot standby, it is found
that
JDBC can block the application of replicated transaction log to hot
standby.
The set up of the test is as follows:

Set up:
=======
Set up a master db cluster and hot standby db cluster using async streaming
replication. On master database server, create a database TEST. Implement a
cron job which on every minute:
- drop table if exists heartbeat;
- create table heartbeat (timenow timestamp);
- insert into heartbeat values (now());
- this is to ensure transaction entry is generated at least every minute.

Test 1 - psql does not block replication
========================================
- on hot standy db server, start a psql session (session 1), then submit
the
following:
psql> begin;
psql> select now();
..... idle

- start another psql session on hot standby db server and run the following
SQLs to monitor the replication delay:

select current_timestamp, pg_last_xact_replay_timestamp(),
current_timestamp-pg_last_xact_replay_timestamp() as replication_delay,
pg_last_xlog_receive_location();

Result: it is found that session 1 does
not block replication.

Test 2 - JDBC can block replication
====================================
- write a Java test program to connect to hot standby TEST database
- in the test program, submit the following SQLs:

begin;
select now();
then sleep for 2 minutes

- on hot standby db server, start another psql session and monitor the
replication delay with the same SQL.

Results: it is found that during the 2 minutes sleeping time,
pg_last_xlog_receive_location is advancing but
pg_last_xact_replay_timestamp
is stuck. The result of "select * from heartbeat" does not change.

Is that a bug in JDBC driver?

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#3Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Valentine Gogichashvili (#2)
Re: BUG #8404: JDBC block hot standby replication?

On 28.08.2013 12:46, Valentine Gogichashvili wrote:

Hello

This is a well documented feature of Hot-Standby Replication.

see:
http://www.postgresql.org/docs/9.2/static/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-STANDBY

That explains how streaming delay and query cancellations work, but the
OP's question was why running the same statements over JDBC behaves
differently than running them over psql.

I'm afraid I don't know the answer. One guess is that when you make the
JDBC connection, you have asked for repeatable read or serializable
isolation level, while psql runs in read committed mode. Running a "show
transaction_isolation" in both would show if that's the case. I'd also
suggest doing "select * from pg_stat_activity" to see if the session
looks the same in both cases.

- Heikki

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#4Pius Chan
pchan@contigo.com
In reply to: Pius Chan (#1)
FW: BUG #8404: JDBC block hot standby replication?

________________________________________
From: Pius Chan
Sent: Wednesday, August 28, 2013 11:03 AM
To: Heikki Linnakangas; Valentine Gogichashvili
Cc: Frank Moi; Ken Yu; Vincent Lasmarias; Vladimir Kosilov
Subject: RE: [BUGS] BUG #8404: JDBC block hot standby replication?

Hi Heikki, Valentine,

Thanks for your suggestions. However, I don't think this is a feature of Hot-Standby replication. If a long running query in HOT STANDBY takes 15 seconds to finish, then during this 15-second window, it should block the application of replicated transaction log entry so as to maintain SQL statement level read integrity.

(1) in our first test case, from the database log the JDBC test program, when connected to the HOT STANDBY DATABASE "TEST":

====================================================================================================
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [1-1] 192.168.2.80 LOG: duration: 0.225 ms parse <unnamed>: SET extra_float_digits = 3
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [2-1] 192.168.2.80 LOG: duration: 0.010 ms bind <unnamed>: SET extra_float_digits = 3
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [3-1] 192.168.2.80 LOG: duration: 0.040 ms execute <unnamed>: SET extra_float_digits = 3
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [4-1] 192.168.2.80 LOG: duration: 0.048 ms parse S_1: BEGIN
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [5-1] 192.168.2.80 LOG: duration: 0.004 ms bind S_1: BEGIN
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [6-1] 192.168.2.80 LOG: duration: 0.011 ms execute S_1: BEGIN
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [7-1] 192.168.2.80 LOG: duration: 0.235 ms parse <unnamed>: select now()
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [8-1] 192.168.2.80 LOG: duration: 0.031 ms bind <unnamed>: select now()
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [9-1] 192.168.2.80 LOG: duration: 0.028 ms execute <unnamed>: select now()
2013-08-27 17:03:57 PDT [TEST] [postgres] [4586]: [10-1] 192.168.2.80 LOG: duration: 0.056 ms parse <unnamed>: select now()
2013-08-27 17:03:57 PDT [TEST] [postgres] [4586]: [11-1] 192.168.2.80 LOG: duration: 0.031 ms bind <unnamed>: select now()
2013-08-27 17:03:57 PDT [TEST] [postgres] [4586]: [12-1] 192.168.2.80 LOG: duration: 0.011 ms execute <unnamed>: select now()

----------------> 1 minute idle period

2013-08-27 17:03:58 PDT [TEST] [postgres] [4586]: [13-1] 192.168.2.80 LOG: unexpected EOF on client connection
====================================================================================================

During the one minute idle period, the replicated transaction log failed to apply to the HOT STANDBY DATABASE (pg_last_xlog_receive_location advances but pg_last_xact_replay_timestamp stays put).

Comparing to the second test case of using psql:

====================================================================================================
2013-08-28 10:52:55 PDT [TEST] [postgres] [17509]: [4-1] 192.168.2.41 LOG: duration: 0.080 ms statement: begin;
2013-08-28 10:53:15 PDT [TEST] [postgres] [17509]: [5-1] 192.168.2.41 LOG: duration: 0.166 ms statement: select now();

-------------> almost 2 minutes idle period

2013-08-28 10:55:04 PDT [TEST] [postgres] [17509]: [6-1] 192.168.2.41 LOG: duration: 0.046 ms statement: commit;
====================================================================================================

During the two-minute idle period, BOTH the pg_last_xlog_receive_location and pg_last_xact_replay_timestamp advance.

(2) HOT STANDBY database does not allow SERIALIZABLE transaction level:

$ psql -U postgres TEST
psql (9.1.9)
Type "help" for help.

^
TEST=# set transaction isolation level serializable;
ERROR: cannot use serializable mode in a hot standby
HINT: You can use REPEATABLE READ instead.
TEST=#

(3) I use psql to set the transaction isolation level to repeatable read but this still does not block the application of replicated transaction log entry:

2013-08-28 11:01:20 PDT [TEST] [postgres] [20361]: [1-1] 192.168.2.41 LOG: duration: 0.169 ms statement: set transaction isolation level repeatable read;
2013-08-28 11:01:23 PDT [TEST] [postgres] [20361]: [2-1] 192.168.2.41 LOG: duration: 0.061 ms statement: begin;
2013-08-28 11:01:25 PDT [TEST] [postgres] [20361]: [3-1] 192.168.2.41 LOG: duration: 0.387 ms statement: select now();

-----------> 1 minute idle period

2013-08-28 11:02:18 PDT [TEST] [postgres] [20361]: [4-1] 192.168.2.41 LOG: duration: 0.070 ms statement: commit;

During the 1 minute idle period, BOTH the pg_last_xlog_receive_location and pg_last_xact_replay_timestamp still advance.

Regards,

Pius
_______________________________________
From: Heikki Linnakangas [hlinnakangas@vmware.com]
Sent: Wednesday, August 28, 2013 5:32 AM
To: Valentine Gogichashvili
Cc: Pius Chan; PostgreSQL Bugs
Subject: Re: [BUGS] BUG #8404: JDBC block hot standby replication?

On 28.08.2013 12:46, Valentine Gogichashvili wrote:

Hello

This is a well documented feature of Hot-Standby Replication.

see:
http://www.postgresql.org/docs/9.2/static/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-STANDBY

That explains how streaming delay and query cancellations work, but the
OP's question was why running the same statements over JDBC behaves
differently than running them over psql.

I'm afraid I don't know the answer. One guess is that when you make the
JDBC connection, you have asked for repeatable read or serializable
isolation level, while psql runs in read committed mode. Running a "show
transaction_isolation" in both would show if that's the case. I'd also
suggest doing "select * from pg_stat_activity" to see if the session
looks the same in both cases.

- Heikki

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs