Primary keepalive message not appearing in Logical Streaming Replication

Started by Virendra Negi over 6 years ago, 8 messages
#1Virendra Negi
viren.negi@teliax.com

I have implemented logical streaming replication and things are working
fine: I see the XLogData messages appearing and I'm able to parse them.

But I haven't seen any "Primary keepalive message" yet. I tried setting
*tcp_keepalives_interval* and *tcp_keepalives_idle*, both as client runtime
parameters and in postgresql.conf, but there is still no sign of it.

Any information around it?

#2Virendra Negi
viren.negi@teliax.com
In reply to: Virendra Negi (#1)
Re: Primary keepalive message not appearing in Logical Streaming Replication

I forgot to mention the output plugin I have been using with logical
replication: it's wal2json.

#3Michael Loftis
mloftis@wgops.com
In reply to: Virendra Negi (#1)
Re: Primary keepalive message not appearing in Logical Streaming Replication

Neither of the tcp_keepalives_* options is part of the Pg protocol. They
configure the OS TCP stack, and TCP keepalive probes are not visible to
applications at all.
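
To illustrate the point, here is a minimal sketch of what TCP keepalive looks like from a client program: it is only a set of socket options handed to the kernel, and nothing is ever delivered to the application as a message. The helper name `enable_tcp_keepalive` and the default values are hypothetical, and the `TCP_KEEP*` constants are platform-specific (Linux names are assumed here), so the sketch only sets the ones the platform exposes.

```python
import socket


def enable_tcp_keepalive(sock, idle=60, interval=10, count=5):
    """Ask the kernel to send TCP keepalive probes on this socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket tuning knobs; names assumed from Linux and skipped
    # on platforms where Python does not define them.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_tcp_keepalive(sock)
# The option is readable back from the kernel, but the probes themselves
# never surface in recv(); the application only sees an error if they fail.
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```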

--

"Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds."
-- Samuel Butler

#4Virendra Negi
viren.negi@teliax.com
In reply to: Michael Loftis (#3)
Re: Primary keepalive message not appearing in Logical Streaming Replication

Agreed, but then why is there a message specification for it in the
documentation, which asks the client to reply if a particular *bit* (the
final Byte1 field) is set? (1 means that the client should reply to this
message as soon as possible, to avoid a timeout disconnect. 0 otherwise)

Primary keepalive message (B)
    Byte1('k')
        Identifies the message as a sender keepalive.
    Int64
        The current end of WAL on the server.
    Int64
        The server's system clock at the time of transmission, as
        microseconds since midnight on 2000-01-01.
    Byte1
        1 means that the client should reply to this message as soon as
        possible, to avoid a timeout disconnect. 0 otherwise.

The receiving process can send replies back to the sender at any time,
using one of the following message formats (also in the payload of a
CopyData message):
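
For illustration, the layout quoted above could be decoded roughly as follows. This is a minimal sketch, not a full client: the function names are hypothetical, the payload is assumed to be the body of a CopyData message with the CopyData header already stripped, and the reply shown is the "Standby status update" format ('r', three Int64 LSNs, an Int64 clock, and a Byte1 flag) from the same documentation page, which is not quoted in this thread.

```python
import struct
from datetime import datetime, timedelta, timezone

# PostgreSQL timestamps count microseconds from 2000-01-01 (UTC assumed here).
PG_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)


def parse_primary_keepalive(payload: bytes):
    """Decode a CopyData payload that starts with b'k'."""
    # '!' = network byte order, no padding: Byte1, Int64, Int64, Byte1.
    tag, wal_end, clock_us, reply_flag = struct.unpack("!cQqB", payload)
    if tag != b"k":
        raise ValueError("not a primary keepalive message")
    sent_at = PG_EPOCH + timedelta(microseconds=clock_us)
    return wal_end, sent_at, bool(reply_flag)


def build_standby_status_update(written, flushed, applied, now=None):
    """Build the 'r' reply a client sends back inside a CopyData message."""
    now = now or datetime.now(timezone.utc)
    clock_us = (now - PG_EPOCH) // timedelta(microseconds=1)
    # Final byte 0: this client does not ask the server for an immediate reply.
    return struct.pack("!cQQQqB", b"r", written, flushed, applied, clock_us, 0)


# A hand-built keepalive with the "reply as soon as possible" byte set to 1.
sample = struct.pack("!cQqB", b"k", 0x16B3748, 0, 1)
wal_end, sent_at, reply_requested = parse_primary_keepalive(sample)
if reply_requested:
    reply = build_standby_status_update(wal_end, wal_end, wal_end)
```

A real client would report the LSNs it has actually written, flushed, and applied rather than echoing the server's WAL end as this sample does.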

#5Virendra Negi
viren.negi@teliax.com
In reply to: Virendra Negi (#4)
Re: Primary keepalive message not appearing in Logical Streaming Replication

Oh, I missed the documentation link. Here you go:
https://www.postgresql.org/docs/9.5/protocol-replication.html

#6Michael Loftis
mloftis@wgops.com
In reply to: Virendra Negi (#5)
Re: Primary keepalive message not appearing in Logical Streaming Replication

This is unrelated to TCP keepalive. I honestly don't know where the knob is
to turn these on, but I am familiar with the configuration variables you
quoted earlier, and they are not it. Perhaps someone else can chime in with
how to enable the protocol-level keepalive in replication.

#7Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Michael Loftis (#6)
Re: Primary keepalive message not appearing in Logical Streaming Replication

Pretty sure it's wal_sender_timeout, which is 60s by default; if you tune it
down, it should send keepalives more often.

See WalSndKeepaliveIfNecessary in [1].

[1]: https://github.com/postgres/postgres/blob/master/src/backend/replication/walsender.c#L3425
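
As a rough model of the timing in that function (a sketch based on a reading of walsender.c, which may differ between PostgreSQL versions): the sender pings the client, with the reply-requested byte set, once half of wal_sender_timeout has passed without any reply from the client. The Python function and parameter names below are hypothetical and only mirror that rule.

```python
def keepalive_due(now_ms, last_reply_ms, wal_sender_timeout_ms=60_000,
                  waiting_for_ping_response=False):
    """Return True when the modeled walsender would send a keepalive."""
    # Timeouts disabled (0) or no client reply recorded yet: never ping.
    if wal_sender_timeout_ms <= 0 or last_reply_ms <= 0:
        return False
    # A ping is already outstanding, so do not send another one.
    if waiting_for_ping_response:
        return False
    # Ping once half of the timeout has elapsed since the last client reply.
    ping_time_ms = last_reply_ms + wal_sender_timeout_ms // 2
    return now_ms >= ping_time_ms


# Default 60s timeout, last client reply at t = 1s: the ping is due at t = 31s.
early = keepalive_due(now_ms=30_000, last_reply_ms=1_000)
late = keepalive_due(now_ms=31_000, last_reply_ms=1_000)
# With the timeout tuned down to 10s, the ping is already due at t = 6s.
tuned = keepalive_due(now_ms=6_000, last_reply_ms=1_000,
                      wal_sender_timeout_ms=10_000)
```

Under this model, a client that sends status updates more often than every half timeout keeps pushing the ping time forward, which may be one reason these timed keepalives are rarely seen.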

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#8Jeff Janes
jeff.janes@gmail.com
In reply to: Michael Loftis (#6)
Re: Primary keepalive message not appearing in Logical Streaming Replication

Protocol-level keepalives are governed by "wal_sender_timeout".

Cheers,

Jeff