pgsql: Fix signal handling in logical replication workers

Started by Peter Eisentraut, about 9 years ago, 5 messages, committers
#1Peter Eisentraut
peter_e@gmx.net

Fix signal handling in logical replication workers

The logical replication worker processes now use the normal die()
handler for SIGTERM and CHECK_FOR_INTERRUPTS() instead of custom code.
One problem before was that the apply worker would not exit promptly
when a subscription was dropped, which could lead to deadlocks.
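
For readers unfamiliar with the pattern, the following is a minimal
standalone sketch of the idea described above, not the code from the
commit. The handler only records that termination was requested, and the
worker loop polls for that request at safe points. All names prefixed
with sketch_ are hypothetical stand-ins; in PostgreSQL itself, as far as
I know, CHECK_FOR_INTERRUPTS() ends the process through the normal error
path rather than returning a boolean.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the pending-interrupt flags a backend keeps. */
static volatile sig_atomic_t sketch_interrupt_pending = 0;
static volatile sig_atomic_t sketch_die_pending = 0;

/*
 * Loosely modeled on a die()-style SIGTERM handler: it only records the
 * request and does no real work inside the signal handler.
 */
static void
sketch_die(int signo)
{
    (void) signo;
    sketch_die_pending = 1;
    sketch_interrupt_pending = 1;
}

/*
 * Loosely modeled on CHECK_FOR_INTERRUPTS(): called at points where it is
 * safe to stop. Assumption: this sketch returns true instead of exiting.
 */
static bool
sketch_check_for_interrupts(void)
{
    if (sketch_interrupt_pending && sketch_die_pending)
    {
        sketch_interrupt_pending = 0;
        sketch_die_pending = 0;
        return true;
    }
    return false;
}

/*
 * Toy worker loop. Performs up to max_iterations units of work and sends
 * itself SIGTERM during iteration raise_at (pass -1 to never raise), which
 * simulates the worker being told to stop. Returns the number of units
 * of work completed.
 */
static int
sketch_worker_loop(int max_iterations, int raise_at)
{
    int done = 0;

    assert(max_iterations >= 0);
    signal(SIGTERM, sketch_die);

    for (int i = 0; i < max_iterations; i++)
    {
        /* Safe point: notice a pending termination request promptly. */
        if (sketch_check_for_interrupts())
            break;

        if (i == raise_at)
            raise(SIGTERM);     /* handler runs before raise() returns */

        done++;                 /* the current unit of work still completes */
    }
    return done;
}
```

With these assumptions, sketch_worker_loop(10, 3) returns 4: the
iteration that receives the signal finishes, and the loop exits at the
next check instead of running to completion.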

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/9fcf670c2efdf31233d429f557ab77937f0f1e6a

Modified Files
--------------
src/backend/replication/logical/launcher.c | 16 +++++++-------
src/backend/replication/logical/tablesync.c | 10 ++++-----
src/backend/replication/logical/worker.c | 34 ++++++++++++++++++++++++++---
src/backend/tcop/postgres.c | 5 +++++
src/include/replication/logicalworker.h | 2 ++
src/include/replication/worker_internal.h | 4 ----
6 files changed, 50 insertions(+), 21 deletions(-)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: pgsql: Fix signal handling in logical replication workers

Peter Eisentraut <peter_e@gmx.net> writes:
> Fix signal handling in logical replication workers

It looks like this broke buildfarm member nightjar.
Not clear why - I don't see anything especially platform-specific
in the patch.

regards, tom lane


#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: pgsql: Fix signal handling in logical replication workers

I wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
>> Fix signal handling in logical replication workers
>
> It looks like this broke buildfarm member nightjar.
> Not clear why - I don't see anything especially platform-specific
> in the patch.

To muddy the waters further, I tried to duplicate the failure on
FreeBSD 11.0/x86_64, and it seems to pass just fine. Maybe Andrew
can look into why nightjar is failing.

regards, tom lane


#4Petr Jelinek
petr@2ndquadrant.com
In reply to: Tom Lane (#3)
Re: pgsql: Fix signal handling in logical replication workers

On 03/06/17 02:59, Tom Lane wrote:
> I wrote:
>> Peter Eisentraut <peter_e@gmx.net> writes:
>>> Fix signal handling in logical replication workers
>>
>> It looks like this broke buildfarm member nightjar.
>> Not clear why - I don't see anything especially platform-specific
>> in the patch.
>
> To muddy the waters further, I tried to duplicate the failure on
> FreeBSD 11.0/x86_64, and it seems to pass just fine. Maybe Andrew
> can look into why nightjar is failing.

There is still one locking patch pending (well, pending to be written); I
would not be surprised if there is a race condition in shutdown somewhere
before that's done.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#5Andrew Dunstan
andrew@dunslane.net
In reply to: Petr Jelinek (#4)
Re: pgsql: Fix signal handling in logical replication workers

On 06/02/2017 09:13 PM, Petr Jelinek wrote:

> On 03/06/17 02:59, Tom Lane wrote:
>> I wrote:
>>> Peter Eisentraut <peter_e@gmx.net> writes:
>>>> Fix signal handling in logical replication workers
>>>
>>> It looks like this broke buildfarm member nightjar.
>>> Not clear why - I don't see anything especially platform-specific
>>> in the patch.
>>
>> To muddy the waters further, I tried to duplicate the failure on
>> FreeBSD 11.0/x86_64, and it seems to pass just fine. Maybe Andrew
>> can look into why nightjar is failing.
>
> There is still one locking patch pending (well, pending to be written); I
> would not be surprised if there is a race condition in shutdown somewhere
> before that's done.

nightjar has been having intermittent failures on the subscription tests
for some time. See
<https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=nightjar&br=HEAD>.
It's only been running the tests for about 53 days.

I'm prepared to give any help needed, including access to nightjar if
required.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
