pgsql: Unify SIGHUP handling between normal and walsender backends.

Started by Andres Freund about 9 years ago, 5 messages, committers
#1 Andres Freund
andres@anarazel.de

Unify SIGHUP handling between normal and walsender backends.

Because walsender and normal backends share the same main loop it's
problematic to have two different flag variables, set in signal
handlers, indicating a pending configuration reload. Only certain
walsender commands reach code paths checking for the
variable (START_[LOGICAL_]REPLICATION, CREATE_REPLICATION_SLOT
... LOGICAL, notably not base backups).
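
[Editorial illustration, not part of the commit.] The pattern described above can be sketched roughly as follows: a single flag is set by the SIGHUP handler, and every command loop checks that same flag, so no code path can miss a pending reload. This is a minimal sketch under stated assumptions, not the PostgreSQL code. The names config_reload_pending, sighup_handler, install_sighup_handler, process_pending_reload and reload_count are hypothetical, and the counter merely stands in for re-reading the configuration file.

```c
#include <signal.h>
#include <stdbool.h>

/* Hypothetical name: the one flag shared by every loop in the process. */
static volatile sig_atomic_t config_reload_pending = 0;

/* Illustration only: counts how many reloads were processed. */
static int reload_count = 0;

/* Signal handler: does nothing except set the shared flag. */
static void
sighup_handler(int signo)
{
    (void) signo;
    config_reload_pending = 1;
}

/* Install the handler for SIGHUP; returns true on success. */
static bool
install_sighup_handler(void)
{
    return signal(SIGHUP, sighup_handler) != SIG_ERR;
}

/*
 * Called from every command loop, whichever command is running.
 * Returns true if a pending reload was found and processed.
 */
static bool
process_pending_reload(void)
{
    if (!config_reload_pending)
        return false;

    /* Clear the flag first so a SIGHUP arriving during the reload is kept. */
    config_reload_pending = 0;
    reload_count++;             /* stand-in for re-reading the config file */
    return true;
}
```

With two separate flags, a loop that checks only one of them never notices a SIGHUP recorded in the other. With one flag, any loop that calls process_pending_reload() picks up the reload, regardless of which command is executing.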

This is a bug present since the introduction of walsender, but has
gotten worse in releases since then which allow walsender to do more.

A later patch, not slated for v10, will similarly unify SIGHUP
handling in other types of processes as well.

Author: Petr Jelinek, Andres Freund
Reviewed-By: Michael Paquier
Discussion: /messages/by-id/20170423235941.qosiuoyqprq4nu7v@alap3.anarazel.de
Backpatch: 9.2-, bug is present since 9.0

Branch
------
REL9_6_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/b8bd32a51f2dd451644175af7ae32f9bec3153f1

Modified Files
--------------
src/backend/replication/walsender.c | 29 +++++++----------------------
src/backend/tcop/postgres.c | 30 ++++++++++++++----------------
src/backend/utils/init/globals.c | 1 +
src/include/miscadmin.h | 5 +++++
4 files changed, 27 insertions(+), 38 deletions(-)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#2 Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#1)
Re: pgsql: Unify SIGHUP handling between normal and walsender backends.

On 2017-06-06 02:25:18 +0000, Andres Freund wrote:

Unify SIGHUP handling between normal and walsender backends.

Because walsender and normal backends share the same main loop it's
problematic to have two different flag variables, set in signal
handlers, indicating a pending configuration reload. Only certain
walsender commands reach code paths checking for the
variable (START_[LOGICAL_]REPLICATION, CREATE_REPLICATION_SLOT
... LOGICAL, notably not base backups).

This is a bug present since the introduction of walsender, but has
gotten worse in releases since then which allow walsender to do more.

A later patch, not slated for v10, will similarly unify SIGHUP
handling in other types of processes as well.

Author: Petr Jelinek, Andres Freund
Reviewed-By: Michael Paquier
Discussion: /messages/by-id/20170423235941.qosiuoyqprq4nu7v@alap3.anarazel.de
Backpatch: 9.2-, bug is present since 9.0

Branch
------
REL9_6_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/b8bd32a51f2dd451644175af7ae32f9bec3153f1

Modified Files
--------------
src/backend/replication/walsender.c | 29 +++++++----------------------
src/backend/tcop/postgres.c | 30 ++++++++++++++----------------
src/backend/utils/init/globals.c | 1 +
src/include/miscadmin.h | 5 +++++
4 files changed, 27 insertions(+), 38 deletions(-)

This commit, or one of its siblings, seemingly caused 'handfish' to fail
with a weird error message:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2017-06-06%2002%3A59%3A01

ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -I. -I. -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o walsender.o walsender.c
Assembler messages:
Fatal error: can't create walsender.o: No such file or directory
<builtin>: recipe for target 'walsender.o' failed

I'm clueless what that could be caused by, given that the rest of the
9.6 animals do not seem to be scared.

Any ideas? So far I just plan to wait till the machine runs again on
its own.

- Andres


#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#2)
Re: pgsql: Unify SIGHUP handling between normal and walsender backends.

Andres Freund <andres@anarazel.de> writes:

On 2017-06-06 02:25:18 +0000, Andres Freund wrote:

Unify SIGHUP handling between normal and walsender backends.

This commit, or one of its siblings, seemingly caused 'handfish' to fail
with a weird error message:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2017-06-06%2002%3A59%3A01

handfish has failed with weird irreproducible problems before, eg in

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2017-04-20%2023%3A37%3A45

the first sign of trouble is

! invalid binary "/home/filiperosset/dev/build-farm-4.18/HEAD/inst/bin/psql"

I'm inclined to think it's got slightly flaky hardware.

regards, tom lane


#4 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#3)
Re: pgsql: Unify SIGHUP handling between normal and walsender backends.

On 2017-06-06 19:41:10 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2017-06-06 02:25:18 +0000, Andres Freund wrote:

Unify SIGHUP handling between normal and walsender backends.

This commit, or one of its siblings, seemingly caused 'handfish' to fail
with a weird error message:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2017-06-06%2002%3A59%3A01

handfish has failed with weird irreproducible problems before, eg in

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2017-04-20%2023%3A37%3A45

the first sign of trouble is

! invalid binary "/home/filiperosset/dev/build-farm-4.18/HEAD/inst/bin/psql"

I'm inclined to think it's got slightly flaky hardware.

Thanks, I'd looked at a few other recent failures, and they'd looked
like proper failures.

Filipe, do you know if that machine has any troubles?

Regards,

Andres


#5 Filipe Rosset
rosset.filipe@gmail.com
In reply to: Andres Freund (#4)
Re: pgsql: Unify SIGHUP handling between normal and walsender backends.

2017-06-06 20:49 GMT-03:00 Andres Freund <andres@anarazel.de>:

On 2017-06-06 19:41:10 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2017-06-06 02:25:18 +0000, Andres Freund wrote:

Unify SIGHUP handling between normal and walsender backends.

This commit, or one of its siblings, seemingly caused 'handfish' to fail
with a weird error message:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2017-06-06%2002%3A59%3A01

handfish has failed with weird irreproducible problems before, eg in

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2017-04-20%2023%3A37%3A45

the first sign of trouble is

! invalid binary "/home/filiperosset/dev/build-farm-4.18/HEAD/inst/bin/psql"

I'm inclined to think it's got slightly flaky hardware.

Thanks, I'd looked at a few other recent failures, and they'd looked
like proper failures.

Filipe, do you know if that machine has any troubles?

Regards,

Andres

Hi guys, I'm not aware of any hardware issues in 'handfish'.

For a while, I've changed my crontab to run the build every hour instead of
every 20 minutes; let's see how it behaves in the next builds.

Cheers,
Filipe