pg_basebackup: could not receive data from WAL stream

Started by greigwise over 7 years ago, 4 messages, general
#1 greigwise
greigwise@comcast.net

Hello.

On postgresql 10.5, my pg_basebackup is failing with this error:

pg_basebackup: could not receive data from WAL stream: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request

In the postgres log files, I'm seeing:

2018-09-02 00:57:32 UTC bkp_user 5b8b278c.11c3f [unknown] LOG: terminating
walsender process due to replication timeout

I'm running the following command right on the database server itself:

pg_basebackup -U repl -D /var/tmp/pg_basebackup_20180901 -Ft -z

It seems to be an intermittent problem. I've had it fail or succeed about
50/50. I even bumped up the wal_sender_timeout to 2000. One notable thing
is that I'm running on an EC2 instance on AWS.

Any advice would be helpful.

Greig Wise

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

#2 greigwise
greigwise@comcast.net
In reply to: greigwise (#1)
Re: pg_basebackup: could not receive data from WAL stream

I should also add that when it fails, it's always right at the very end of
the backup when it's very nearly done or maybe even after it's done.

Thanks again.

Greig

#3 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: greigwise (#1)
Re: pg_basebackup: could not receive data from WAL stream

On 09/01/2018 09:06 PM, greigwise wrote:

Hello.

On postgresql 10.5, my pg_basebackup is failing with this error:

pg_basebackup: could not receive data from WAL stream: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request

In the postgres log files, I'm seeing:

2018-09-02 00:57:32 UTC bkp_user 5b8b278c.11c3f [unknown] LOG: terminating
walsender process due to replication timeout

I'm running the following command right on the database server itself:

pg_basebackup -U repl -D /var/tmp/pg_basebackup_20180901 -Ft -z

It seems to be an intermittent problem. I've had it fail or succeed about
50/50. I even bumped up the wal_sender_timeout to 2000. One notable thing
is that I'm running on an EC2 instance on AWS.

The unit for wal_sender_timeout is ms, so the above is 2 seconds, whereas
the default value is 60 seconds (60s in the postgresql.conf file).

See below for setting units in file:

https://www.postgresql.org/docs/10/static/config-setting.html
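
To make the unit arithmetic above concrete, here is a minimal Python sketch
that converts a time setting to seconds. The function name timeout_to_seconds
and its simplified parsing are assumptions made for illustration; this is not
how PostgreSQL itself parses the file. It handles only the units documented
for PostgreSQL 10 (ms, s, min, h, d) and treats a bare number as milliseconds,
which is the base unit of wal_sender_timeout.

```python
import re

# Multipliers from each unit to milliseconds (units accepted by PostgreSQL 10).
_UNIT_TO_MS = {"ms": 1, "s": 1000, "min": 60_000, "h": 3_600_000, "d": 86_400_000}


def timeout_to_seconds(value: str, default_unit: str = "ms") -> float:
    """Convert a setting such as '2000', '60s' or '5min' to seconds.

    A bare number is read in default_unit; for wal_sender_timeout that
    unit is milliseconds. This is a simplified, hypothetical parser.
    """
    match = re.fullmatch(r"\s*(\d+)\s*(ms|s|min|h|d)?\s*", value)
    if match is None:
        raise ValueError(f"unrecognised time setting: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNIT_TO_MS[unit or default_unit] / 1000


print(timeout_to_seconds("2000"))  # bare number, read as milliseconds: 2.0
print(timeout_to_seconds("60s"))   # the default value: 60.0
```

Writing the unit explicitly in postgresql.conf (for example 60s rather than a
bare number) avoids this kind of misreading.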

Also what is your max_wal_senders setting?

Any advice would be helpful.

Greig Wise

--
Adrian Klaver
adrian.klaver@aklaver.com

#4 Kaixi Luo
kaixiluo@gmail.com
In reply to: Adrian Klaver (#3)
Re: pg_basebackup: could not receive data from WAL stream

wal_sender_timeout should be as long as necessary. Each WAL file is 16MB by
default, so the timeout should be *at least* as long as the time needed to
transfer 16MB*wal_keep_segments. Take a look at the size of your pg_wal
folder (it was named pg_xlog before PostgreSQL 10).
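
The rule of thumb above can be sketched as a small calculation. This is only
an illustration of the estimate given in this message, not an official
formula. The function name min_timeout_seconds, the throughput figure, and the
segment count are all hypothetical values chosen for the example.

```python
WAL_SEGMENT_MB = 16  # default WAL segment size, as stated in the message


def min_timeout_seconds(wal_keep_segments: int, throughput_mb_per_s: float) -> float:
    """Time in seconds to transfer 16MB * wal_keep_segments at a given rate."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return WAL_SEGMENT_MB * wal_keep_segments / throughput_mb_per_s


# Hypothetical numbers: 64 retained segments over a link sustaining 20 MB/s.
print(min_timeout_seconds(64, 20.0))  # 51.2
```

With these assumed numbers the estimate stays under the 60 second default,
while a 2 second timeout would be far too short.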

On Sun, Sep 2, 2018 at 3:41 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

On 09/01/2018 09:06 PM, greigwise wrote:

Hello.

On postgresql 10.5, my pg_basebackup is failing with this error:

pg_basebackup: could not receive data from WAL stream: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request

In the postgres log files, I'm seeing:

2018-09-02 00:57:32 UTC bkp_user 5b8b278c.11c3f [unknown] LOG: terminating
walsender process due to replication timeout

I'm running the following command right on the database server itself:

pg_basebackup -U repl -D /var/tmp/pg_basebackup_20180901 -Ft -z

It seems to be an intermittent problem. I've had it fail or succeed about
50/50. I even bumped up the wal_sender_timeout to 2000. One notable thing
is that I'm running on an EC2 instance on AWS.

The unit for wal_sender_timeout is ms, so the above is 2 seconds, whereas
the default value is 60 seconds (60s in the postgresql.conf file).

See below for setting units in file:

https://www.postgresql.org/docs/10/static/config-setting.html

Also what is your max_wal_senders setting?

Any advice would be helpful.

Greig Wise

--
Adrian Klaver
adrian.klaver@aklaver.com