JDBC: logical replication and LSN feedback

Started by Yason TRover 8 years ago5 messagesgeneral
Jump to latest
#1Yason TR
yason.tr@gmx.com

Hi all,

I am developing an application which connects to a logical replication slot, to consume the WAL events. These WAL events are then forwarded to a MQ broker.

The heart of the code can be seen as:

while (true) {
Connection connection = null;
PGReplicationStream stream = null;

try {
connection = DriverManager.getConnection("jdbc:postgresql://localhost:5432/db", properties);
stream = connection.unwrap(PGConnection.class).getReplicationAPI().replicationStream().logical().withSlotName("slot").start();

while (true) {
final ByteBuffer buffer = stream.read();

// ... MQ logic here ... omitted ...

stream.setAppliedLSN(stream.getLastReceiveLSN());
stream.setFlushedLSN(stream.getLastReceiveLSN());
}
} catch (final SQLException e) {
// ... log exception ... omitted ...
} finally {
// ... close stream and connection ... omitted ...
}
}

I notice some behavior which I cannot explain and would like to understand so I can alter my code:

- When I restart the application, I notice that the application is retrieving the last event from the previous run again. The result is that this event is sent twice to the MQ broker after a restart of the application. Why is that? Isn't calling `setAppliedLSN(stream.getLastReceiveLSN())` and/or `setFlushedLSN(stream.getLastReceiveLSN())` enough to acknowledge an event, so it will removed from the WAL log and it will not be resent?

- When receiving an event, the corresponding LSN from that event (which is sent in the payload) is not the same as the result of `stream.getLastReceivedLSN()`. Why is that? Which one should I use? Maybe this is correlated to my first question.

- What is the difference between `setAppliedLSN(LSN)` and `setFlushedLSN(LSN)`? The Javadocs are not really helpful here.

FYI, I also asked this question on https://stackoverflow.com/questions/46301578/postgres-jdbc-logical-replication-lsn-feedback.

Thanks a lot and kind regards,

Yason TR

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Achilleas Mantzios
achill@matrix.gatewaynet.com
In reply to: Yason TR (#1)
Re: JDBC: logical replication and LSN feedback

On 19/09/2017 16:37, Yason TR wrote:

Hi all,

I am developing an application which connects to a logical replication slot, to consume the WAL events. These WAL events are then forwarded to a MQ broker.

The heart of the code can be seen as:

while (true) {
Connection connection = null;
PGReplicationStream stream = null;

try {
connection = DriverManager.getConnection("jdbc:postgresql://localhost:5432/db", properties);
stream = connection.unwrap(PGConnection.class).getReplicationAPI().replicationStream().logical().withSlotName("slot").start();

while (true) {
final ByteBuffer buffer = stream.read();

// ... MQ logic here ... omitted ...

stream.setAppliedLSN(stream.getLastReceiveLSN());
stream.setFlushedLSN(stream.getLastReceiveLSN());
}
} catch (final SQLException e) {
// ... log exception ... omitted ...
} finally {
// ... close stream and connection ... omitted ...
}
}

I notice some behavior which I cannot explain and would like to understand so I can alter my code:

- When I restart the application, I notice that the application is retrieving the last event from the previous run again. The result is that this event is sent twice to the MQ broker after a restart of the application. Why is that? Isn't calling `setAppliedLSN(stream.getLastReceiveLSN())` and/or `setFlushedLSN(stream.getLastReceiveLSN())` enough to acknowledge an event, so it will removed from the WAL log and it will not be resent?

- When receiving an event, the corresponding LSN from that event (which is sent in the payload) is not the same as the result of `stream.getLastReceivedLSN()`. Why is that? Which one should I use? Maybe this is correlated to my first question.

- What is the difference between `setAppliedLSN(LSN)` and `setFlushedLSN(LSN)`? The Javadocs are not really helpful here.

The stages of a wal location generally go like : sent -> write -> flush -> replay , at least in terms of physical replication.
I guess applied=replayed ?

Note that from the docs : https://jdbc.postgresql.org/documentation/head/replication.html#logical-replication
it says :
"
In the event that replication has been restarted, it's will start from last successfully processed LSN that was sent via feedback to database.
"

FYI, I also asked this question on https://stackoverflow.com/questions/46301578/postgres-jdbc-logical-replication-lsn-feedback.

Thanks a lot and kind regards,

Yason TR

--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Yason TR
yason.tr@gmx.com
In reply to: Achilleas Mantzios (#2)
Re: JDBC: logical replication and LSN feedback

<html><head></head><body><div style="font-family: Verdana;font-size: 12.0px;"><div>Should we read &quot;In the event that replication has been restarted, it&#39;s will start from last successfully processed LSN that was sent via feedback to database.&quot; that this last succesfully event will be included (again) after a restart of the replication, or that the next event starting from the this last successfully event will be sent?</div>

<div>&nbsp;</div>

<div>I would expect the second, as this makes the most sense (because the consumers only want each event once), but I am not sure.</div>

<div>&nbsp;</div>

<div>Thanks a lot and kind regards,</div>

<div>&nbsp;</div>

<div>Yason TR</div>

<div>&nbsp;
<div name="quote" style="margin:10px 5px 5px 10px; padding: 10px 0 10px 10px; border-left:2px solid #C3D9E5; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">
<div style="margin:0 0 10px 0;"><b>Sent:</b>&nbsp;Tuesday, September 19, 2017 at 4:14 PM<br/>
<b>From:</b>&nbsp;&quot;Achilleas Mantzios&quot; &lt;achill@matrix.gatewaynet.com&gt;<br/>
<b>To:</b>&nbsp;pgsql-general@postgresql.org<br/>
<b>Subject:</b>&nbsp;Re: [GENERAL] JDBC: logical replication and LSN feedback</div>

<div name="quoted-content">On 19/09/2017 16:37, Yason TR wrote:<br/>
&gt; Hi all,<br/>
&gt;<br/>
&gt; I am developing an application which connects to a logical replication slot, to consume the WAL events. These WAL events are then forwarded to a MQ broker.<br/>
&gt;<br/>
&gt; The heart of the code can be seen as:<br/>
&gt;<br/>
&gt; while (true) {<br/>
&gt; Connection connection = null;<br/>
&gt; PGReplicationStream stream = null;<br/>
&gt;<br/>
&gt; try {<br/>
&gt; connection = DriverManager.getConnection(&quot;jdbc:postgresql://localhost:5432/db&quot;, properties);<br/>
&gt; stream = connection.unwrap(PGConnection.class).getReplicationAPI().replicationStream().logical().withSlotName(&quot;slot&quot;).start();<br/>
&gt;<br/>
&gt; while (true) {<br/>
&gt; final ByteBuffer buffer = stream.read();<br/>
&gt;<br/>
&gt; // ... MQ logic here ... omitted ...<br/>
&gt;<br/>
&gt; stream.setAppliedLSN(stream.getLastReceiveLSN());<br/>
&gt; stream.setFlushedLSN(stream.getLastReceiveLSN());<br/>
&gt; }<br/>
&gt; } catch (final SQLException e) {<br/>
&gt; // ... log exception ... omitted ...<br/>
&gt; } finally {<br/>
&gt; // ... close stream and connection ... omitted ...<br/>
&gt; }<br/>
&gt; }<br/>
&gt;<br/>
&gt; I notice some behavior which I cannot explain and would like to understand so I can alter my code:<br/>
&gt;<br/>
&gt; - When I restart the application, I notice that the application is retrieving the last event from the previous run again. The result is that this event is sent twice to the MQ broker after a restart of the application. Why is that? Isn&#39;t calling &#96;setAppliedLSN(stream.getLastReceiveLSN())&#96; and/or &#96;setFlushedLSN(stream.getLastReceiveLSN())&#96; enough to acknowledge an event, so it will removed from the WAL log and it will not be resent?<br/>
&gt;<br/>
&gt; - When receiving an event, the corresponding LSN from that event (which is sent in the payload) is not the same as the result of &#96;stream.getLastReceivedLSN()&#96;. Why is that? Which one should I use? Maybe this is correlated to my first question.<br/>
&gt;<br/>
&gt; - What is the difference between &#96;setAppliedLSN(LSN)&#96; and &#96;setFlushedLSN(LSN)&#96;? The Javadocs are not really helpful here.<br/>
<br/>
The stages of a wal location generally go like : sent -&gt; write -&gt; flush -&gt; replay , at least in terms of physical replication.<br/>
I guess applied=replayed ?<br/>
<br/>
Note that from the docs : <a href="https://jdbc.postgresql.org/documentation/head/replication.html#logical-replication&quot; target="_blank">https://jdbc.postgresql.org/documentation/head/replication.html#logical-replication&lt;/a&gt;&lt;br/&gt;
it says :<br/>
&quot;<br/>
In the event that replication has been restarted, it&#39;s will start from last successfully processed LSN that was sent via feedback to database.<br/>
&quot;<br/>
<br/>
&gt;<br/>
&gt; FYI, I also asked this question on <a href="https://stackoverflow.com/questions/46301578/postgres-jdbc-logical-replication-lsn-feedback&quot; target="_blank">https://stackoverflow.com/questions/46301578/postgres-jdbc-logical-replication-lsn-feedback&lt;/a&gt;.&lt;br/&gt;
&gt;<br/>
&gt; Thanks a lot and kind regards,<br/>
&gt;<br/>
&gt; Yason TR<br/>
&gt;<br/>
&gt;<br/>
<br/>
--<br/>
Achilleas Mantzios<br/>
IT DEV Lead<br/>
IT DEPT<br/>
Dynacom Tankers Mgmt<br/>
<br/>
<br/>
<br/>
--<br/>
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)<br/>
To make changes to your subscription:<br/>
<a href="http://www.postgresql.org/mailpref/pgsql-general&quot; target="_blank">http://www.postgresql.org/mailpref/pgsql-general&lt;/a&gt;&lt;/div&gt;
</div>
</div></div></body></html>

#4Dave Cramer
pg@fastcrypt.com
In reply to: Yason TR (#3)
Re: [GENERAL] JDBC: logical replication and LSN feedback

+list

First off you are going to get considerably better response from the JDBC
list or our github project.

Looking at the code; in order to ensure the backend has received the
acknowledgement you need to call forceUpdateStatus

Otherwise it may not receive the ack

Dave Cramer

davec@postgresintl.com
www.postgresintl.com

On 19 September 2017 at 07:53, Yason TR <yason.tr@gmx.com> wrote:

Show quoted text

Should we read "In the event that replication has been restarted, it's
will start from last successfully processed LSN that was sent via feedback
to database." that this last succesfully event will be included (again)
after a restart of the replication, or that the next event starting from
the this last successfully event will be sent?

I would expect the second, as this makes the most sense (because the
consumers only want each event once), but I am not sure.

Thanks a lot and kind regards,

Yason TR

*Sent:* Tuesday, September 19, 2017 at 4:14 PM
*From:* "Achilleas Mantzios" <achill@matrix.gatewaynet.com>
*To:* pgsql-general@postgresql.org
*Subject:* Re: [GENERAL] JDBC: logical replication and LSN feedback
On 19/09/2017 16:37, Yason TR wrote:

Hi all,

I am developing an application which connects to a logical replication

slot, to consume the WAL events. These WAL events are then forwarded to a
MQ broker.

The heart of the code can be seen as:

while (true) {
Connection connection = null;
PGReplicationStream stream = null;

try {
connection = DriverManager.getConnection("jdbc:postgresql://localhost:5432/db",

properties);

stream = connection.unwrap(PGConnection.class).getReplicationAPI().

replicationStream().logical().withSlotName("slot").start();

while (true) {
final ByteBuffer buffer = stream.read();

// ... MQ logic here ... omitted ...

stream.setAppliedLSN(stream.getLastReceiveLSN());
stream.setFlushedLSN(stream.getLastReceiveLSN());
}
} catch (final SQLException e) {
// ... log exception ... omitted ...
} finally {
// ... close stream and connection ... omitted ...
}
}

I notice some behavior which I cannot explain and would like to

understand so I can alter my code:

- When I restart the application, I notice that the application is

retrieving the last event from the previous run again. The result is that
this event is sent twice to the MQ broker after a restart of the
application. Why is that? Isn't calling `setAppliedLSN(stream.getLastReceiveLSN())`
and/or `setFlushedLSN(stream.getLastReceiveLSN())` enough to acknowledge
an event, so it will removed from the WAL log and it will not be resent?

- When receiving an event, the corresponding LSN from that event (which

is sent in the payload) is not the same as the result of
`stream.getLastReceivedLSN()`. Why is that? Which one should I use? Maybe
this is correlated to my first question.

- What is the difference between `setAppliedLSN(LSN)` and

`setFlushedLSN(LSN)`? The Javadocs are not really helpful here.

The stages of a wal location generally go like : sent -> write -> flush ->
replay , at least in terms of physical replication.
I guess applied=replayed ?

Note that from the docs : https://jdbc.postgresql.org/documentation/head/
replication.html#logical-replication
it says :
"
In the event that replication has been restarted, it's will start from
last successfully processed LSN that was sent via feedback to database.
"

FYI, I also asked this question on https://stackoverflow.com/

questions/46301578/postgres-jdbc-logical-replication-lsn-feedback.

Thanks a lot and kind regards,

Yason TR

--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Yason TR
yason.tr@gmx.com
In reply to: Dave Cramer (#4)
Re: [GENERAL] JDBC: logical replication and LSN feedback

<html><head></head><body><div style="font-family: Verdana;font-size: 12.0px;"><div>Thanks, I did not realize that acknowledgements are delayed&nbsp;until the next forceUpdateStatus() call. From the javadocs, I understood that the a read() call also does a forceUpdateStatus() internally, so a manual forceUpdateStatus() is not required. Maybe a forceUpdateStatus() only happens during a read() when the read() effectively received some messages?</div>

<div>&nbsp;</div>

<div>I will start my experiments again and let you know.</div>

<div>&nbsp;</div>

<div>Kind regards,</div>

<div>&nbsp;</div>

<div>Yason TR</div>

<div>&nbsp;
<div name="quote" style="margin:10px 5px 5px 10px; padding: 10px 0 10px 10px; border-left:2px solid #C3D9E5; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">
<div style="margin:0 0 10px 0;"><b>Sent:</b>&nbsp;Wednesday, September 20, 2017 at 2:22 PM<br/>
<b>From:</b>&nbsp;&quot;Dave Cramer&quot; &lt;pg@fastcrypt.com&gt;<br/>
<b>To:</b>&nbsp;&quot;Yason TR&quot; &lt;yason.tr@gmx.com&gt;<br/>
<b>Cc:</b>&nbsp;&quot;Achilleas Mantzios&quot; &lt;achill@matrix.gatewaynet.com&gt;, &quot;pgsql-general@postgresql.org&quot; &lt;pgsql-general@postgresql.org&gt;, List &lt;pgsql-jdbc@postgresql.org&gt;<br/>
<b>Subject:</b>&nbsp;Re: [GENERAL] JDBC: logical replication and LSN feedback</div>

<div name="quoted-content">
<div>
<div>+list</div>

<div>&nbsp;</div>
First off you are going to get considerably better response from the JDBC list or our github project.

<div>&nbsp;</div>

<div>Looking at the code; in order to ensure the backend has received the acknowledgement you need to call&nbsp;<span style="color: rgb(0,0,0);font-family: Menlo;font-size: 9.0pt;">forceUpdateStatus</span></div>

<div>&nbsp;</div>

<div><span style="color: rgb(0,0,0);font-family: Menlo;font-size: 9.0pt;">Otherwise it may not receive the ack</span></div>

<div>&nbsp;</div>

<div>&nbsp;</div>

<div>&nbsp;
<div>&nbsp;</div>

<div>&nbsp;</div>

<div>&nbsp;</div>

<div>&nbsp;</div>
</div>
</div>

<div class="gmail_extra">&nbsp;
<div>
<div class="gmail_signature">
<div>
<div>Dave Cramer<br/>
<br/>
<a href="mailto:davec@postgresintl.com" onclick="parent.window.location.href=&#39;davec@postgresintl.com&#39;; return false;" target="_blank">davec@postgresintl.com</a></div>

<div><a href="http://www.postgresintl.com&quot; target="_blank">www.postgresintl.com</a></div>
</div>
</div>
</div>
&nbsp;

<div class="gmail_quote">On 19 September 2017 at 07:53, Yason TR <span>&lt;<a href="mailto:yason.tr@gmx.com" onclick="parent.window.location.href=&#39;yason.tr@gmx.com&#39;; return false;" target="_blank">yason.tr@gmx.com</a>&gt;</span> wrote:

<blockquote class="gmail_quote" style="margin: 0 0 0 0.8ex;border-left: 1.0px rgb(204,204,204) solid;padding-left: 1.0ex;">
<div>
<div style="font-family: Verdana;font-size: 12.0px;">
<div>Should we read &quot;In the event that replication has been restarted, it&#39;s will start from last successfully processed LSN that was sent via feedback to database.&quot; that this last succesfully event will be included (again) after a restart of the replication, or that the next event starting from the this last successfully event will be sent?</div>

<div>&nbsp;</div>

<div>I would expect the second, as this makes the most sense (because the consumers only want each event once), but I am not sure.</div>

<div><span>&nbsp;</span></div>

<div><span>Thanks a lot and kind regards,</span></div>

<div><span>&nbsp;</span></div>

<div><span>Yason TR</span></div>

<div>&nbsp;
<div style="margin: 10.0px 5.0px 5.0px 10.0px;padding: 10.0px 0 10.0px 10.0px;border-left: 2.0px solid rgb(195,217,229);">
<div style="margin: 0 0 10.0px 0;"><b>Sent:</b>&nbsp;Tuesday, September 19, 2017 at 4:14 PM<br/>
<b>From:</b>&nbsp;&quot;Achilleas Mantzios&quot; &lt;<a href="mailto:achill@matrix.gatewaynet.com" onclick="parent.window.location.href=&#39;achill@matrix.gatewaynet.com&#39;; return false;" target="_blank">achill@matrix.gatewaynet.com</a>&gt;<br/>
<b>To:</b>&nbsp;<a href="mailto:pgsql-general@postgresql.org" onclick="parent.window.location.href=&#39;pgsql-general@postgresql.org&#39;; return false;" target="_blank">pgsql-general@postgresql.org</a><br/>
<b>Subject:</b>&nbsp;Re: [GENERAL] JDBC: logical replication and LSN feedback</div>

<div>
<div class="h5">
<div>On 19/09/2017 16:37, Yason TR wrote:<br/>
&gt; Hi all,<br/>
&gt;<br/>
&gt; I am developing an application which connects to a logical replication slot, to consume the WAL events. These WAL events are then forwarded to a MQ broker.<br/>
&gt;<br/>
&gt; The heart of the code can be seen as:<br/>
&gt;<br/>
&gt; while (true) {<br/>
&gt; Connection connection = null;<br/>
&gt; PGReplicationStream stream = null;<br/>
&gt;<br/>
&gt; try {<br/>
&gt; connection = DriverManager.getConnection(&quot;jdbc:postgresql://localhost:5432/db&quot;, properties);<br/>
&gt; stream = connection.unwrap(PGConnection.class).getReplicationAPI().replicationStream().logical().withSlotName(&quot;slot&quot;).start();<br/>
&gt;<br/>
&gt; while (true) {<br/>
&gt; final ByteBuffer buffer = stream.read();<br/>
&gt;<br/>
&gt; // ... MQ logic here ... omitted ...<br/>
&gt;<br/>
&gt; stream.setAppliedLSN(stream.getLastReceiveLSN());<br/>
&gt; stream.setFlushedLSN(stream.getLastReceiveLSN());<br/>
&gt; }<br/>
&gt; } catch (final SQLException e) {<br/>
&gt; // ... log exception ... omitted ...<br/>
&gt; } finally {<br/>
&gt; // ... close stream and connection ... omitted ...<br/>
&gt; }<br/>
&gt; }<br/>
&gt;<br/>
&gt; I notice some behavior which I cannot explain and would like to understand so I can alter my code:<br/>
&gt;<br/>
&gt; - When I restart the application, I notice that the application is retrieving the last event from the previous run again. The result is that this event is sent twice to the MQ broker after a restart of the application. Why is that? Isn&#39;t calling &#96;setAppliedLSN(stream.getLastReceiveLSN())&#96; and/or &#96;setFlushedLSN(stream.getLastReceiveLSN())&#96; enough to acknowledge an event, so it will removed from the WAL log and it will not be resent?<br/>
&gt;<br/>
&gt; - When receiving an event, the corresponding LSN from that event (which is sent in the payload) is not the same as the result of &#96;stream.getLastReceivedLSN()&#96;. Why is that? Which one should I use? Maybe this is correlated to my first question.<br/>
&gt;<br/>
&gt; - What is the difference between &#96;setAppliedLSN(LSN)&#96; and &#96;setFlushedLSN(LSN)&#96;? The Javadocs are not really helpful here.<br/>
<br/>
The stages of a wal location generally go like : sent -&gt; write -&gt; flush -&gt; replay , at least in terms of physical replication.<br/>
I guess applied=replayed ?<br/>
<br/>
Note that from the docs : <a href="https://jdbc.postgresql.org/documentation/head/replication.html#logical-replication&quot; target="_blank">https://jdbc.postgresql.org/documentation/head/replication.html#logical-replication&lt;/a&gt;&lt;br/&gt;
it says :<br/>
&quot;<br/>
In the event that replication has been restarted, it&#39;s will start from last successfully processed LSN that was sent via feedback to database.<br/>
&quot;<br/>
<br/>
&gt;<br/>
&gt; FYI, I also asked this question on <a href="https://stackoverflow.com/questions/46301578/postgres-jdbc-logical-replication-lsn-feedback&quot; target="_blank">https://stackoverflow.com/questions/46301578/postgres-jdbc-logical-replication-lsn-feedback&lt;/a&gt;.&lt;br/&gt;
&gt;<br/>
&gt; Thanks a lot and kind regards,<br/>
&gt;<br/>
&gt; Yason TR<br/>
&gt;<br/>
&gt;<br/>
<br/>
--<br/>
Achilleas Mantzios<br/>
IT DEV Lead<br/>
IT DEPT<br/>
Dynacom Tankers Mgmt<br/>
<br/>
<br/>
<br/>
--<br/>
Sent via pgsql-general mailing list (<a href="mailto:pgsql-general@postgresql.org" onclick="parent.window.location.href=&#39;pgsql-general@postgresql.org&#39;; return false;" target="_blank">pgsql-general@postgresql.org</a>)<br/>
To make changes to your subscription:<br/>
<a href="http://www.postgresql.org/mailpref/pgsql-general&quot; target="_blank">http://www.postgresql.org/mailpref/pgsql-general&lt;/a&gt;&lt;/div&gt;
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</div></div></body></html>