Thread: BUG #8404: JDBC block hot standby replication?

BUG #8404: JDBC block hot standby replication?

From
pchan@contigo.com
Date:
The following bug has been logged on the website:

Bug reference:      8404
Logged by:          Pius Chan
Email address:      pchan@contigo.com
PostgreSQL version: 9.1.9
Operating system:   Linux version 2.6.32-279.14.1.el6.x86_64
Description:

When investigating an "out of memory" issue in hot standby, it is found that
JDBC can block the application of replicated transaction log to hot standby.
The set up of the test is as follows:


Set up:
=======
Set up a master db cluster and hot standby db cluster using async streaming
replication. On master database server, create a database TEST. Implement a
cron job which on every minute:
- drop table if exists heartbeat;
- create table heartbeat (timenow timestamp);
- insert into heartbeat values (now());
- this is to ensure transaction entry is generated at least every minute.


Test 1 - psql does not block replication
========================================
- on hot standy db server, start a psql session (session 1), then submit the
following:
psql> begin;
psql> select now();
..... idle


- start another psql session on hot standby db server and run the following
SQLs to monitor the replication delay:


select current_timestamp, pg_last_xact_replay_timestamp(),
current_timestamp-pg_last_xact_replay_timestamp() as replication_delay,
pg_last_xlog_receive_location();


Result: it is found that session 1 does
not block replication.




Test 2 - JDBC can block replication
====================================
- write a Java test program to connect to hot standby TEST database
- in the test program, submit the following SQLs:

  begin;
  select now();
  then sleep for 2 minutes


- on hot standby db server, start another psql session and monitor the
replication delay with the same SQL.


Results: it is found that during the 2 minutes sleeping time,
pg_last_xlog_receive_location is advancing but pg_last_xact_replay_timestamp
is stuck. The result of "select * from heartbeat" does not change.


Is that a bug in JDBC driver?

Re: BUG #8404: JDBC block hot standby replication?

From
Valentine Gogichashvili
Date:
Hello

This is a well documented feature of Hot-Standby Replication.

see:
http://www.postgresql.org/docs/9.2/static/runtime-config-replication.html#R=
UNTIME-CONFIG-REPLICATION-STANDBY

Regards,

-- Valentin

=E1=83=95=E1=83=90=E1=83=9A=E1=83=94=E1=83=9C=E1=83=A2=E1=83=98=E1=83=9C =
=E1=83=92=E1=83=9D=E1=83=92=E1=83=98=E1=83=A9=E1=83=90=E1=83=A8=E1=83=95=E1=
=83=98=E1=83=9A=E1=83=98
Valentine Gogichashvili


On Wed, Aug 28, 2013 at 2:30 AM, <pchan@contigo.com> wrote:

> The following bug has been logged on the website:
>
> Bug reference:      8404
> Logged by:          Pius Chan
> Email address:      pchan@contigo.com
> PostgreSQL version: 9.1.9
> Operating system:   Linux version 2.6.32-279.14.1.el6.x86_64
> Description:
>
> When investigating an "out of memory" issue in hot standby, it is found
> that
> JDBC can block the application of replicated transaction log to hot
> standby.
> The set up of the test is as follows:
>
>
> Set up:
> =3D=3D=3D=3D=3D=3D=3D
> Set up a master db cluster and hot standby db cluster using async streami=
ng
> replication. On master database server, create a database TEST. Implement=
 a
> cron job which on every minute:
> - drop table if exists heartbeat;
> - create table heartbeat (timenow timestamp);
> - insert into heartbeat values (now());
> - this is to ensure transaction entry is generated at least every minute.
>
>
> Test 1 - psql does not block replication
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> - on hot standy db server, start a psql session (session 1), then submit
> the
> following:
> psql> begin;
> psql> select now();
> ..... idle
>
>
> - start another psql session on hot standby db server and run the followi=
ng
> SQLs to monitor the replication delay:
>
>
> select current_timestamp, pg_last_xact_replay_timestamp(),
> current_timestamp-pg_last_xact_replay_timestamp() as replication_delay,
> pg_last_xlog_receive_location();
>
>
> Result: it is found that session 1 does
> not block replication.
>
>
>
>
> Test 2 - JDBC can block replication
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> - write a Java test program to connect to hot standby TEST database
> - in the test program, submit the following SQLs:
>
>   begin;
>   select now();
>   then sleep for 2 minutes
>
>
> - on hot standby db server, start another psql session and monitor the
> replication delay with the same SQL.
>
>
> Results: it is found that during the 2 minutes sleeping time,
> pg_last_xlog_receive_location is advancing but
> pg_last_xact_replay_timestamp
> is stuck. The result of "select * from heartbeat" does not change.
>
>
> Is that a bug in JDBC driver?
>
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

Re: BUG #8404: JDBC block hot standby replication?

From
Heikki Linnakangas
Date:
On 28.08.2013 12:46, Valentine Gogichashvili wrote:
> Hello
>
> This is a well documented feature of Hot-Standby Replication.
>
> see:
> http://www.postgresql.org/docs/9.2/static/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-STANDBY

That explains how streaming delay and query cancellations work, but the
OP's question was why running the same statements over JDBC behaves
differently than running them over psql.

I'm afraid I don't know the answer. One guess is that when you make the
JDBC connection, you have asked for repeatable read or serializable
isolation level, while psql runs in read committed mode. Running a "show
transaction_isolation" in both would show if that's the case. I'd also
suggest doing "select * from pg_stat_activity" to see if the session
looks the same in both cases.

- Heikki

FW: BUG #8404: JDBC block hot standby replication?

From
Pius Chan
Date:
________________________________________
From: Pius Chan
Sent: Wednesday, August 28, 2013 11:03 AM
To: Heikki Linnakangas; Valentine Gogichashvili
Cc: Frank Moi; Ken Yu; Vincent Lasmarias; Vladimir Kosilov
Subject: RE: [BUGS] BUG #8404: JDBC block hot standby replication?

Hi Heikki, Valentine,

Thanks for your suggestions. However, I don't think this is a feature of Ho=
t-Standby replication. If a long running query in HOT STANDBY takes 15 seco=
nds to finish, then during this 15-second window, it should block the appli=
cation of replicated transaction log entry so as to maintain SQL statement =
level read integrity.


(1) in our first test case, from the database log the JDBC test program, wh=
en connected to the HOT STANDBY DATABASE "TEST":

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [1-1] 192.168.2.80 LOG:  =
duration: 0.225 ms  parse <unnamed>: SET extra_float_digits =3D 3
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [2-1] 192.168.2.80 LOG:  =
duration: 0.010 ms  bind <unnamed>: SET extra_float_digits =3D 3
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [3-1] 192.168.2.80 LOG:  =
duration: 0.040 ms  execute <unnamed>: SET extra_float_digits =3D 3
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [4-1] 192.168.2.80 LOG:  =
duration: 0.048 ms  parse S_1: BEGIN
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [5-1] 192.168.2.80 LOG:  =
duration: 0.004 ms  bind S_1: BEGIN
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [6-1] 192.168.2.80 LOG:  =
duration: 0.011 ms  execute S_1: BEGIN
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [7-1] 192.168.2.80 LOG:  =
duration: 0.235 ms  parse <unnamed>: select now()
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [8-1] 192.168.2.80 LOG:  =
duration: 0.031 ms  bind <unnamed>: select now()
2013-08-27 17:01:57 PDT [TEST] [postgres] [4586]: [9-1] 192.168.2.80 LOG:  =
duration: 0.028 ms  execute <unnamed>: select now()
2013-08-27 17:03:57 PDT [TEST] [postgres] [4586]: [10-1] 192.168.2.80 LOG: =
 duration: 0.056 ms  parse <unnamed>: select now()
2013-08-27 17:03:57 PDT [TEST] [postgres] [4586]: [11-1] 192.168.2.80 LOG: =
 duration: 0.031 ms  bind <unnamed>: select now()
2013-08-27 17:03:57 PDT [TEST] [postgres] [4586]: [12-1] 192.168.2.80 LOG: =
 duration: 0.011 ms  execute <unnamed>: select now()

         ----------------> 1 minute idle period

2013-08-27 17:03:58 PDT [TEST] [postgres] [4586]: [13-1] 192.168.2.80 LOG: =
 unexpected EOF on client connection
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D

During the one minute idle period, the replicated transaction log failed to=
 apply to the HOT STANDBY DATABASE (pg_last_xlog_receive_location advances =
but pg_last_xact_replay_timestamp stays put).


Comparing to the second test case of using psql:

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
2013-08-28 10:52:55 PDT [TEST] [postgres] [17509]: [4-1] 192.168.2.41 LOG: =
 duration: 0.080 ms  statement: begin;
2013-08-28 10:53:15 PDT [TEST] [postgres] [17509]: [5-1] 192.168.2.41 LOG: =
 duration: 0.166 ms  statement: select now();

 -------------> almost 2 minutes idle period

2013-08-28 10:55:04 PDT [TEST] [postgres] [17509]: [6-1] 192.168.2.41 LOG: =
 duration: 0.046 ms  statement: commit;
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D

During the two-minute idle period, BOTH the pg_last_xlog_receive_location a=
nd pg_last_xact_replay_timestamp advance.


(2) HOT STANDBY database does not allow SERIALIZABLE transaction level:

$ psql -U postgres TEST
psql (9.1.9)
Type "help" for help.

                                        ^
TEST=3D# set transaction isolation level serializable;
ERROR:  cannot use serializable mode in a hot standby
HINT:  You can use REPEATABLE READ instead.
TEST=3D#


(3) I use psql to set the transaction isolation level to repeatable read bu=
t this still does not block the application of replicated transaction log e=
ntry:

2013-08-28 11:01:20 PDT [TEST] [postgres] [20361]: [1-1] 192.168.2.41 LOG: =
 duration: 0.169 ms  statement: set transaction isolation level repeatable =
read;
2013-08-28 11:01:23 PDT [TEST] [postgres] [20361]: [2-1] 192.168.2.41 LOG: =
 duration: 0.061 ms  statement: begin;
2013-08-28 11:01:25 PDT [TEST] [postgres] [20361]: [3-1] 192.168.2.41 LOG: =
 duration: 0.387 ms  statement: select now();

-----------> 1 minute idle period

2013-08-28 11:02:18 PDT [TEST] [postgres] [20361]: [4-1] 192.168.2.41 LOG: =
 duration: 0.070 ms  statement: commit;

During the 1 minute idle period, BOTH the pg_last_xlog_receive_location and=
 pg_last_xact_replay_timestamp still advance.




Regards,

Pius
_______________________________________
From: Heikki Linnakangas [hlinnakangas@vmware.com]
Sent: Wednesday, August 28, 2013 5:32 AM
To: Valentine Gogichashvili
Cc: Pius Chan; PostgreSQL Bugs
Subject: Re: [BUGS] BUG #8404: JDBC block hot standby replication?

On 28.08.2013 12:46, Valentine Gogichashvili wrote:
> Hello
>
> This is a well documented feature of Hot-Standby Replication.
>
> see:
> http://www.postgresql.org/docs/9.2/static/runtime-config-replication.html=
#RUNTIME-CONFIG-REPLICATION-STANDBY

That explains how streaming delay and query cancellations work, but the
OP's question was why running the same statements over JDBC behaves
differently than running them over psql.

I'm afraid I don't know the answer. One guess is that when you make the
JDBC connection, you have asked for repeatable read or serializable
isolation level, while psql runs in read committed mode. Running a "show
transaction_isolation" in both would show if that's the case. I'd also
suggest doing "select * from pg_stat_activity" to see if the session
looks the same in both cases.

- Heikki