Autovacuum worker doesn't immediately exit on postmaster death

Started by Alexander Kukushkin, about 5 years ago, 22 messages
#1 Alexander Kukushkin
cyberdemn@gmail.com

Hello,

I know nobody in their right mind should do this, but if the
postmaster process is killed with SIGKILL, most backend processes
correctly notice that the postmaster is gone and exit. There is one
exception though: autovacuum worker processes keep running until they
eventually finish and exit.

Steps to reproduce:
1. Initialize the cluster and start it up

2. Create a table and fill it up with some data:
localhost/postgres=# create table foo(id int);
CREATE TABLE
localhost/postgres=# alter table foo set (autovacuum_vacuum_cost_delay = 100);
ALTER TABLE
localhost/postgres=# insert into foo select * from generate_series(1, 10000000);
INSERT 0 10000000
localhost/postgres=# \dt+ foo
List of relations
Schema │ Name │ Type │ Owner │ Persistence │ Size │ Description
────────┼──────┼───────┼──────────┼─────────────┼────────┼─────────────
public │ foo │ table │ postgres │ permanent │ 346 MB │
(1 row)

3. Wait until autovacuum worker process started and kill it:
$ ps auxwwwf | grep [p]ostgres
akukush+ 7728 0.0 0.1 321244 26836 ? S 15:51 0:00
postgres -D data
akukush+ 7730 0.0 0.0 173488 4312 ? Ss 15:51 0:00 \_
postgres: logger
akukush+ 7732 0.0 0.0 321584 8808 ? Ss 15:51 0:00 \_
postgres: checkpointer
akukush+ 7733 0.0 0.4 321376 70688 ? Ss 15:51 0:00 \_
postgres: background writer
akukush+ 7734 0.0 0.0 321244 9780 ? Ss 15:51 0:00 \_
postgres: walwriter
akukush+ 7735 0.0 0.0 321796 6684 ? Ss 15:51 0:00 \_
postgres: autovacuum launcher
akukush+ 7736 0.0 0.0 175608 6224 ? Ss 15:51 0:00 \_
postgres: archiver last was 0000002E000000000000002A
akukush+ 7737 0.0 0.0 175608 4340 ? Ss 15:51 0:00 \_
postgres: stats collector
akukush+ 7738 0.0 0.0 321672 6812 ? Ss 15:51 0:00 \_
postgres: logical replication launcher
akukush+ 7743 0.0 0.0 322460 14624 ? Ss 15:52 0:00 \_
postgres: postgres postgres 127.0.0.1(39130) idle

$ ps auxwwwf | grep [p]ostgres | grep auto
akukush+ 7735 0.0 0.0 321796 6684 ? Ss 15:51 0:00 \_
postgres: autovacuum launcher
akukush+ 10483 1.0 0.0 322432 12472 ? Ss 16:28 0:00 \_
postgres: autovacuum worker postgres

$ kill -9 7728

$ ps auxwwwf | grep [p]ostgres | grep auto
akukush+ 10483 0.7 0.0 322432 12472 ? Ss 16:28 0:00
postgres: autovacuum worker postgres

And here is gdb backtrace:
(gdb) bt
#0 0x00007f6e1c80c0f7 in __GI___select (nfds=nfds@entry=0,
readfds=readfds@entry=0x0, writefds=writefds@entry=0x0,
exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7fff6cf1b580)
at ../sysdeps/unix/sysv/linux/select.c:41
#1 0x000055dbd93ade2d in pg_usleep (microsec=<optimized out>) at
./build/../src/port/pgsleep.c:56
#2 0x000055dbd90b7543 in vacuum_delay_point () at
./build/../src/backend/commands/vacuum.c:2034
#3 0x000055dbd8f5c00d in lazy_scan_heap (aggressive=false,
nindexes=0, Irel=0x0, vacrelstats=<optimized out>,
params=0x55dbdaac7e7c, onerel=<optimized out>) at
./build/../src/backend/access/heap/vacuumlazy.c:1034
#4 heap_vacuum_rel (onerel=<optimized out>, params=0x55dbdaac7e7c,
bstrategy=<optimized out>) at
./build/../src/backend/access/heap/vacuumlazy.c:518
#5 0x000055dbd90b561d in table_relation_vacuum (bstrategy=<optimized
out>, params=0x55dbdaac7e7c, rel=0x7f6e1f3105f0) at
./build/../src/include/access/tableam.h:1460
#6 vacuum_rel (relid=16396, relation=<optimized out>,
params=params@entry=0x55dbdaac7e7c) at
./build/../src/backend/commands/vacuum.c:1893
#7 0x000055dbd90b68c5 in vacuum (relations=0x55dbdab38588,
params=params@entry=0x55dbdaac7e7c, bstrategy=<optimized out>,
bstrategy@entry=0x55dbdaac7f98, isTopLevel=isTopLevel@entry=true) at
./build/../src/backend/commands/vacuum.c:449
#8 0x000055dbd8f0777b in autovacuum_do_vac_analyze
(bstrategy=0x55dbdaac7f98, tab=0x55dbdaac7e78) at
./build/../src/backend/postmaster/autovacuum.c:3137
#9 do_autovacuum () at ./build/../src/backend/postmaster/autovacuum.c:2467
#10 0x000055dbd8f07e2a in AutoVacWorkerMain (argv=0x0, argc=0) at
./build/../src/backend/postmaster/autovacuum.c:1694
#11 0x000055dbd91a753a in StartAutoVacWorker () at
./build/../src/backend/postmaster/autovacuum.c:1488
#12 0x000055dbd91b54ca in StartAutovacuumWorker () at
./build/../src/backend/postmaster/postmaster.c:5613
#13 sigusr1_handler (postgres_signal_arg=<optimized out>) at
./build/../src/backend/postmaster/postmaster.c:5320
#14 <signal handler called>
#15 0x00007f6e1c80c0f7 in __GI___select (nfds=nfds@entry=10,
readfds=readfds@entry=0x7fff6cf1c7f0, writefds=writefds@entry=0x0,
exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7fff6cf1c750)
at ../sysdeps/unix/sysv/linux/select.c:41
#16 0x000055dbd91b5759 in ServerLoop () at
./build/../src/backend/postmaster/postmaster.c:1703
#17 0x000055dbd91b75e3 in PostmasterMain (argc=17, argv=<optimized
out>) at ./build/../src/backend/postmaster/postmaster.c:1412
#18 0x000055dbd8f0a3c8 in main (argc=17, argv=0x55dbdaa4aef0) at
./build/../src/backend/main/main.c:210

Fifteen minutes later process 10483 was still alive, but the backtrace
looked a bit different (it finished vacuuming and was doing analyze):
(gdb) bt
#0 0x00007f6e1c80c0f7 in __GI___select (nfds=nfds@entry=0,
readfds=readfds@entry=0x0, writefds=writefds@entry=0x0,
exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7fff6cf1b480)
at ../sysdeps/unix/sysv/linux/select.c:41
#1 0x000055dbd93ade2d in pg_usleep (microsec=<optimized out>) at
./build/../src/port/pgsleep.c:56
#2 0x000055dbd90b7543 in vacuum_delay_point () at
./build/../src/backend/commands/vacuum.c:2034
#3 0x000055dbd9037251 in acquire_sample_rows
(onerel=onerel@entry=0x7f6e1f310070, elevel=elevel@entry=13,
rows=rows@entry=0x7f6e1f2b8048, targrows=targrows@entry=30000,
totalrows=totalrows@entry=0x7fff6cf1b6d8,
totaldeadrows=totaldeadrows@entry=0x7fff6cf1b6e0)
at ./build/../src/backend/commands/analyze.c:1079
#4 0x000055dbd9039d51 in do_analyze_rel
(onerel=onerel@entry=0x7f6e1f310070,
params=params@entry=0x55dbdaac7e7c, va_cols=va_cols@entry=0x0,
acquirefunc=0x55dbd9037110 <acquire_sample_rows>, relpages=44248,
inh=inh@entry=false, in_outer_xact=false, elevel=13)
at ./build/../src/backend/commands/analyze.c:522
#5 0x000055dbd903b452 in analyze_rel (relid=<optimized out>,
relation=<optimized out>, params=params@entry=0x55dbdaac7e7c,
va_cols=0x0, in_outer_xact=<optimized out>, bstrategy=<optimized out>)
at ./build/../src/backend/commands/analyze.c:263
#6 0x000055dbd90b6884 in vacuum (relations=0x55dbdab38588,
params=params@entry=0x55dbdaac7e7c, bstrategy=<optimized out>,
bstrategy@entry=0x55dbdaac7f98, isTopLevel=isTopLevel@entry=true) at
./build/../src/backend/commands/vacuum.c:466
#7 0x000055dbd8f0777b in autovacuum_do_vac_analyze
(bstrategy=0x55dbdaac7f98, tab=0x55dbdaac7e78) at
./build/../src/backend/postmaster/autovacuum.c:3137
#8 do_autovacuum () at ./build/../src/backend/postmaster/autovacuum.c:2467
#9 0x000055dbd8f07e2a in AutoVacWorkerMain (argv=0x0, argc=0) at
./build/../src/backend/postmaster/autovacuum.c:1694
#10 0x000055dbd91a753a in StartAutoVacWorker () at
./build/../src/backend/postmaster/autovacuum.c:1488
#11 0x000055dbd91b54ca in StartAutovacuumWorker () at
./build/../src/backend/postmaster/postmaster.c:5613
#12 sigusr1_handler (postgres_signal_arg=<optimized out>) at
./build/../src/backend/postmaster/postmaster.c:5320
#13 <signal handler called>
#14 0x00007f6e1c80c0f7 in __GI___select (nfds=nfds@entry=10,
readfds=readfds@entry=0x7fff6cf1c7f0, writefds=writefds@entry=0x0,
exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7fff6cf1c750)
at ../sysdeps/unix/sysv/linux/select.c:41
#15 0x000055dbd91b5759 in ServerLoop () at
./build/../src/backend/postmaster/postmaster.c:1703
#16 0x000055dbd91b75e3 in PostmasterMain (argc=17, argv=<optimized
out>) at ./build/../src/backend/postmaster/postmaster.c:1412
#17 0x000055dbd8f0a3c8 in main (argc=17, argv=0x55dbdaa4aef0) at
./build/../src/backend/main/main.c:210

Eventually, after 20-25 minutes the process 10483 exited.

I was able to reproduce it with 13.0 and 12.4, and I believe older
versions are also affected.

Regards,
--
Alexander Kukushkin

#2 Victor Yegorov
vyegorov@gmail.com
In reply to: Alexander Kukushkin (#1)
Re: Autovacuum worker doesn't immediately exit on postmaster death

Wed, 28 Oct 2020 at 19:44, Alexander Kukushkin <cyberdemn@gmail.com>:

I know nobody in their right mind should do this, but if the
postmaster process is killed with SIGKILL, most backend processes
correctly notice that the postmaster is gone and exit. There is one
exception though: autovacuum worker processes keep running until they
eventually finish and exit.

I was able to reproduce it with 13.0 and 12.4, and I believe older
versions are also affected.

Do you get the same behaviour also on master?
As there was some work in this area for 14, see
https://git.postgresql.org/pg/commitdiff/44fc6e259b

--
Victor Yegorov

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Victor Yegorov (#2)
Re: Autovacuum worker doesn't immediately exit on postmaster death

Victor Yegorov <vyegorov@gmail.com> writes:

Wed, 28 Oct 2020 at 19:44, Alexander Kukushkin <cyberdemn@gmail.com>:

I know nobody in their right mind should do this, but if the
postmaster process is killed with SIGKILL, most backend processes
correctly notice that the postmaster is gone and exit. There is one
exception though: autovacuum worker processes keep running until they
eventually finish and exit.

Do you get the same behaviour also on master?
As there was some work in this area for 14, see
https://git.postgresql.org/pg/commitdiff/44fc6e259b

That was about SIGQUIT response, which isn't really related to this
scenario. But I do not think Alexander has accurately characterized
the situation. *No* server processes will react instantly to postmaster
death. Typically they'll only detect it while waiting for some other
condition, such as client input, or in some cases while iterating their
outermost loop. So if they're busy with calculations they might not
notice for a long time. I don't think autovacuum is any worse than
a busy client backend on this score.

It's hard to do better than that, because on most platforms there's
no way to get a signal on parent-process death, so the only way to
notice would be to poll the postmaster-death pipe constantly; which
would be hugely expensive in comparison to the value.

On the whole I'm skeptical that this is a useful consideration to
expend effort on. You shouldn't be killing the postmaster that way.
If you do, you'll soon learn not to, for plenty of reasons besides
this one.

regards, tom lane

#4 Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#3)
Re: Autovacuum worker doesn't immediately exit on postmaster death

Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Victor Yegorov <vyegorov@gmail.com> writes:

Wed, 28 Oct 2020 at 19:44, Alexander Kukushkin <cyberdemn@gmail.com>:

I know nobody in their right mind should do this, but if the
postmaster process is killed with SIGKILL, most backend processes
correctly notice that the postmaster is gone and exit. There is one
exception though: autovacuum worker processes keep running until they
eventually finish and exit.

Do you get the same behaviour also on master?
As there was some work in this area for 14, see
https://git.postgresql.org/pg/commitdiff/44fc6e259b

That was about SIGQUIT response, which isn't really related to this
scenario. But I do not think Alexander has accurately characterized
the situation. *No* server processes will react instantly to postmaster
death. Typically they'll only detect it while waiting for some other
condition, such as client input, or in some cases while iterating their
outermost loop. So if they're busy with calculations they might not
notice for a long time. I don't think autovacuum is any worse than
a busy client backend on this score.

Considering how long an autovacuum can run, it seems like it'd be
worthwhile to find a useful place to check for postmaster-death.
Typical well-running systems are going to be waiting for the client
pretty frequently and therefore this does make autovacuum stick out in
this case.

It's hard to do better than that, because on most platforms there's
no way to get a signal on parent-process death, so the only way to
notice would be to poll the postmaster-death pipe constantly; which
would be hugely expensive in comparison to the value.

I agree that 'constantly' wouldn't be great, but with some periodicity
that's more frequent than 'not until a few hours later when we finally
finish vacuuming this relation' would be nice. At least with autovacuum
we may be periodically sleeping anyway so it doesn't seem like polling
at that point would really be terrible, though it'd be nice to check
every once in a while even if we aren't sleeping.

Thanks,

Stephen

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephen Frost (#4)
Re: Autovacuum worker doesn't immediately exit on postmaster death

Stephen Frost <sfrost@snowman.net> writes:

I agree that 'constantly' wouldn't be great, but with some periodicity
that's more frequent than 'not until a few hours later when we finally
finish vacuuming this relation' would be nice. At least with autovacuum
we may be periodically sleeping anyway so it doesn't seem like polling
at that point would really be terrible, though it'd be nice to check
every once in a while even if we aren't sleeping.

Maybe put a check into vacuum_delay_point, and poll the pipe when we're
about to sleep anyway? That wouldn't fix anything except autovacuum,
but if you're right that that's a primary pain point then it'd help.

regards, tom lane

#6 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Stephen Frost (#4)
Re: Autovacuum worker doesn't immediately exit on postmaster death

On 2020-Oct-29, Stephen Frost wrote:

It's hard to do better than that, because on most platforms there's
no way to get a signal on parent-process death, so the only way to
notice would be to poll the postmaster-death pipe constantly; which
would be hugely expensive in comparison to the value.

I agree that 'constantly' wouldn't be great, but with some periodicity
that's more frequent than 'not until a few hours later when we finally
finish vacuuming this relation' would be nice. At least with autovacuum
we may be periodically sleeping anyway so it doesn't seem like polling
at that point would really be terrible, though it'd be nice to check
every once in a while even if we aren't sleeping.

vacuum_delay_point seems an obvious candidate, as soon as we've
determined that the sleep interval is > 0; since we're going to sleep,
the cost of a syscall seems negligible. I'm not sure what to suggest
for vacuums that don't have vacuum costing active, though.

#7 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Alexander Kukushkin (#1)
Re: Autovacuum worker doesn't immediately exit on postmaster death

On 2020-Oct-28, Alexander Kukushkin wrote:

Hello,

I know nobody in their right mind should do this, but if the
postmaster process is killed with SIGKILL, most backend processes
correctly notice that the postmaster is gone and exit. There is one
exception though: autovacuum worker processes keep running until they
eventually finish and exit.

So, if you have a manual vacuum running on the table (with
vacuum_cost_delay=0) and kill -KILL the postmaster, that one also
lingers arbitrarily long afterwards?

(I suppose the problem is not as obvious just because the vacuum
wouldn't run as long, since there is no vacuum cost delay; but it'd
still be a problem if you made the table bigger.)

#8 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#5)
Re: Autovacuum worker doesn't immediately exit on postmaster death

Hi,

On 2020-10-29 12:27:53 -0400, Tom Lane wrote:

Maybe put a check into vacuum_delay_point, and poll the pipe when we're
about to sleep anyway?

Perhaps we should just replace the pg_usleep() with a latch wait?

Greetings,

Andres Freund

#9 Stephen Frost
sfrost@snowman.net
In reply to: Andres Freund (#8)
Re: Autovacuum worker doesn't immediately exit on postmaster death

Greetings,

* Andres Freund (andres@anarazel.de) wrote:

On 2020-10-29 12:27:53 -0400, Tom Lane wrote:

Maybe put a check into vacuum_delay_point, and poll the pipe when we're
about to sleep anyway?

Perhaps we should just replace the pg_usleep() with a latch wait?

I'm not sure why, but I had the thought that we already had done that,
and was a bit surprised that it wasn't that way, so +1 from my part.

I do think it'd be good to find a way to check every once in a while
even when we aren't going to delay though. Not sure what the best
answer there is.

Thanks,

Stephen

#10 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Stephen Frost (#9)
Re: Autovacuum worker doesn't immediately exit on postmaster death

On 2020-Oct-29, Stephen Frost wrote:

I do think it'd be good to find a way to check every once in a while
even when we aren't going to delay though. Not sure what the best
answer there is.

Maybe instead of thinking specifically in terms of vacuum, we could
count buffer accesses (read from kernel) and check the latch once every
1000th such, or something like that. Then a very long query doesn't
have to wait until it's run to completion. The cost is one integer
addition per syscall, which should be bearable.

(This doesn't help with a query that's running arbitrarily outside of
Postgres, or doing something that doesn't access disk -- but it'd help
with a majority of problem cases.)

#11 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#10)
Re: Autovacuum worker doesn't immediately exit on postmaster death

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

On 2020-Oct-29, Stephen Frost wrote:

I do think it'd be good to find a way to check every once in a while
even when we aren't going to delay though. Not sure what the best
answer there is.

Maybe instead of thinking specifically in terms of vacuum, we could
count buffer accesses (read from kernel) and check the latch once every
1000th such, or something like that. Then a very long query doesn't
have to wait until it's run to completion. The cost is one integer
addition per syscall, which should be bearable.

I'm kind of unwilling to add any syscalls at all to normal execution
code paths for this purpose. People shouldn't be sig-kill'ing the
postmaster, or if they do, cleaning up the mess is their responsibility.
I'd also suggest that adding nearly-untestable code paths for this
purpose is a fine way to add bugs we'll never catch.

The if-we're-going-to-delay-anyway path in vacuum_delay_point seems
OK to add a touch more overhead to, though.

regards, tom lane

#12 Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#11)
Re: Autovacuum worker doesn't immediately exit on postmaster death

Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

On 2020-Oct-29, Stephen Frost wrote:

I do think it'd be good to find a way to check every once in a while
even when we aren't going to delay though. Not sure what the best
answer there is.

Maybe instead of thinking specifically in terms of vacuum, we could
count buffer accesses (read from kernel) and check the latch once every
1000th such, or something like that. Then a very long query doesn't
have to wait until it's run to completion. The cost is one integer
addition per syscall, which should be bearable.

I'm kind of unwilling to add any syscalls at all to normal execution
code paths for this purpose. People shouldn't be sig-kill'ing the
postmaster, or if they do, cleaning up the mess is their responsibility.
I'd also suggest that adding nearly-untestable code paths for this
purpose is a fine way to add bugs we'll never catch.

Not sure if either is at all viable, but I had a couple of thoughts
about other ways to possibly address this.

The first simplistic idea is this- we have lots of processes that pick
up pretty quickly on the postmaster going away due to checking if it's
still around while waiting for something else to happen anyway (like the
autovacuum launcher...), and we have CFI's in a lot of places where
it's reasonable to do a CFI but it isn't alright to check for
postmaster death.
While it'd be better if there were more platforms where parent death
would send a signal to the children, that doesn't seem to be coming any
time soon- so why don't we do it ourselves? That is, when we discover
that the postmaster has died, scan through the proc array (carefully,
since it could be garbage, but all we're looking for are the PIDs of
anything that might still be around) and try sending a signal to any
processes that are left? Those signals would hopefully get delivered
and the other backends would discover the signal through CFI and exit
reasonably quickly.

The other thought I had was around trying to check for postmaster death
when we're about to do some I/O, which would probably catch a large
number of these cases too though technically some process might stick
around for a while if it's only dealing with things that are already in
shared buffers, I suppose. Also seems complicated and expensive to do.

The if-we're-going-to-delay-anyway path in vacuum_delay_point seems
OK to add a touch more overhead to, though.

Yeah, this certainly seems reasonable to do too and on a well run system
would likely be enough 90+% of the time.

Thanks,

Stephen

#13 Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#11)
1 attachment(s)
Re: Autovacuum worker doesn't immediately exit on postmaster death

Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

On 2020-Oct-29, Stephen Frost wrote:

I do think it'd be good to find a way to check every once in a while
even when we aren't going to delay though. Not sure what the best
answer there is.

Maybe instead of thinking specifically in terms of vacuum, we could
count buffer accesses (read from kernel) and check the latch once every
1000th such, or something like that. Then a very long query doesn't
have to wait until it's run to completion. The cost is one integer
addition per syscall, which should be bearable.

I'm kind of unwilling to add any syscalls at all to normal execution
code paths for this purpose. People shouldn't be sig-kill'ing the
postmaster, or if they do, cleaning up the mess is their responsibility.
I'd also suggest that adding nearly-untestable code paths for this
purpose is a fine way to add bugs we'll never catch.

The if-we're-going-to-delay-anyway path in vacuum_delay_point seems
OK to add a touch more overhead to, though.

Alright, for this part at least, seems like it'd be something like the
attached.

Only lightly tested, but does seem to address the specific example which
was brought up on this thread.

Thoughts..?

Thanks,

Stephen

Attachments:

av_exit_pm_death_v1.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 98270a1049..c90a4edb98 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2069,9 +2069,11 @@ vacuum_delay_point(void)
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
-		pgstat_report_wait_start(WAIT_EVENT_VACUUM_DELAY);
-		pg_usleep((long) (msec * 1000));
-		pgstat_report_wait_end();
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 msec,
+						 WAIT_EVENT_VACUUM_DELAY);
+		ResetLatch(MyLatch);
 
 		VacuumCostBalance = 0;
 
#14 Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#10)
Re: Autovacuum worker doesn't immediately exit on postmaster death

On Thu, Oct 29, 2020 at 5:36 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Maybe instead of thinking specifically in terms of vacuum, we could
count buffer accesses (read from kernel) and check the latch once every
1000th such, or something like that. Then a very long query doesn't
have to wait until it's run to completion. The cost is one integer
addition per syscall, which should be bearable.

Interesting idea. One related case is where everything is fine on the
server side but the client has disconnected and we don't notice that
the socket has changed state until something makes us try to send a
message to the client, which might be a really long time if the
server's doing like a lengthy computation before generating any rows.
It would be really nice if we could find a cheap way to check for both
postmaster death and client disconnect every now and then, like if a
single system call could somehow answer both questions.

--
Robert Haas
EDB: http://www.enterprisedb.com

#15 Thomas Munro
thomas.munro@gmail.com
In reply to: Robert Haas (#14)
Re: Autovacuum worker doesn't immediately exit on postmaster death

On Fri, Dec 11, 2020 at 8:34 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Oct 29, 2020 at 5:36 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Maybe instead of thinking specifically in terms of vacuum, we could
count buffer accesses (read from kernel) and check the latch once every
1000th such, or something like that. Then a very long query doesn't
have to wait until it's run to completion. The cost is one integer
addition per syscall, which should be bearable.

Interesting idea. One related case is where everything is fine on the
server side but the client has disconnected and we don't notice that
the socket has changed state until something makes us try to send a
message to the client, which might be a really long time if the
server's doing like a lengthy computation before generating any rows.
It would be really nice if we could find a cheap way to check for both
postmaster death and client disconnect every now and then, like if a
single system call could somehow answer both questions.

For the record, an alternative approach was proposed[1] that
periodically checks for disconnected sockets using a timer, which will
then cause the next CFI() to abort.

Doing the check (a syscall) based on elapsed time rather than every
nth CFI() or buffer access or whatever seems better in some ways,
considering the difficulty of knowing what the frequency will be. One
of the objections was that it added unacceptable setitimer() calls.
We discussed an idea to solve that problem generally, and then later I
prototyped that idea in another thread[2] about idle session timeouts
(not sure about that yet, comments welcome).

I've also wondered about checking postmaster_possibly_dead in CFI() on
platforms where we have it (and working to increase that set of
platforms), instead of just reacting to PM death when sleeping. But
it seems like the real problem in this specific case is the use of
pg_usleep() where WaitLatch() should be used, no?

The recovery loop is at the opposite end of the spectrum: while vacuum
doesn't check for postmaster death often enough, the recovery loop
checks potentially hundreds of thousands or millions of times per
seconds, which sucks on systems that don't have parent-death signals
and slows down recovery quite measurably. In the course of the
discussion about fixing that[3]/messages/by-id/CA+hUKGK1607VmtrDUHQXrsooU=ap4g4R2yaoByWOOA3m8xevUQ@mail.gmail.com we spotted other places that are using
a pg_usleep() where they ought to be using WaitLatch() (which comes
with exit-on-PM-death behaviour built-in). By the way, the patch in
that thread does almost what Robert described, namely check for PM
death every nth time (which in this case means every nth WAL record),
except it's not in the main CFI(), it's in a special variant used just
for recovery.

[1]: /messages/by-id/77def86b27e41f0efcba411460e929ae@postgrespro.ru
[2]: /messages/by-id/763A0689-F189-459E-946F-F0EC4458980B@hotmail.com
[3]: /messages/by-id/CA+hUKGK1607VmtrDUHQXrsooU=ap4g4R2yaoByWOOA3m8xevUQ@mail.gmail.com

#16 Thomas Munro
thomas.munro@gmail.com
In reply to: Stephen Frost (#13)
Re: Autovacuum worker doesn't immediately exit on postmaster death

On Fri, Dec 11, 2020 at 7:57 AM Stephen Frost <sfrost@snowman.net> wrote:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

The if-we're-going-to-delay-anyway path in vacuum_delay_point seems
OK to add a touch more overhead to, though.

Alright, for this part at least, seems like it'd be something like the
attached.

Only lightly tested, but does seem to address the specific example which
was brought up on this thread.

Thoughts..?

+1

#17 Stephen Frost
sfrost@snowman.net
In reply to: Thomas Munro (#16)
1 attachment(s)
Re: Autovacuum worker doesn't immediately exit on postmaster death

Greetings,

* Thomas Munro (thomas.munro@gmail.com) wrote:

On Fri, Dec 11, 2020 at 7:57 AM Stephen Frost <sfrost@snowman.net> wrote:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

The if-we're-going-to-delay-anyway path in vacuum_delay_point seems
OK to add a touch more overhead to, though.

Alright, for this part at least, seems like it'd be something like the
attached.

Only lightly tested, but does seem to address the specific example which
was brought up on this thread.

Thoughts..?

+1

Thanks for that. Attached is just a rebased version with a commit
message added. If there aren't any other concerns, I'll commit this in
the next few days and back-patch it. When it comes to 12 and older,
does anyone want to opine about the wait event to use? I was thinking
PG_WAIT_TIMEOUT or WAIT_EVENT_PG_SLEEP ...

Or do folks think this shouldn't be backpatched? That would mean it
wouldn't help anyone for years, which would be pretty unfortunate, hence
my feeling that it's worthwhile to backpatch.

Thanks!

Stephen

Attachments:

av_exit_pm_death_v2.patch (text/x-diff; charset=us-ascii)
From 9daf52b78d106c86e038dcefdb1d8345d22b9756 Mon Sep 17 00:00:00 2001
From: Stephen Frost <sfrost@snowman.net>
Date: Mon, 22 Mar 2021 13:25:57 -0400
Subject: [PATCH] Use a WaitLatch for vacuum/autovacuum sleeping

Instead of using pg_usleep() in vacuum_delay_point(), use a WaitLatch.
This has the advantage that we will realize if the postmaster has been
killed since the last time we decided to sleep while vacuuming.

Discussion: https://postgr.es/m/CAFh8B=kcdk8k-Y21RfXPu5dX=bgPqJ8TC3p_qxR_ygdBS=JN5w@mail.gmail.com
Backpatch: 9.6-
---
 src/backend/commands/vacuum.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c064352e23..662aff04b4 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2080,9 +2080,11 @@ vacuum_delay_point(void)
 		if (msec > VacuumCostDelay * 4)
 			msec = VacuumCostDelay * 4;
 
-		pgstat_report_wait_start(WAIT_EVENT_VACUUM_DELAY);
-		pg_usleep((long) (msec * 1000));
-		pgstat_report_wait_end();
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 msec,
+						 WAIT_EVENT_VACUUM_DELAY);
+		ResetLatch(MyLatch);
 
 		VacuumCostBalance = 0;
 
-- 
2.27.0

#18 Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#17)
Re: Autovacuum worker doesn't immediately exit on postmaster death

On Mon, Mar 22, 2021 at 1:48 PM Stephen Frost <sfrost@snowman.net> wrote:

Thanks for that. Attached is just a rebased version with a commit
message added. If there aren't any other concerns, I'll commit this in
the next few days and back-patch it. When it comes to 12 and older,
does anyone want to opine about the wait event to use? I was thinking
PG_WAIT_TIMEOUT or WAIT_EVENT_PG_SLEEP ...

I'm not sure if we should back-patch this, but I think if you do you
should just add a wait event, rather than using a generic one.

--
Robert Haas
EDB: http://www.enterprisedb.com

#19 Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#18)
Re: Autovacuum worker doesn't immediately exit on postmaster death

On Mon, Mar 22, 2021 at 04:07:12PM -0400, Robert Haas wrote:

On Mon, Mar 22, 2021 at 1:48 PM Stephen Frost <sfrost@snowman.net> wrote:

Thanks for that. Attached is just a rebased version with a commit
message added. If there aren't any other concerns, I'll commit this in
the next few days and back-patch it. When it comes to 12 and older,
does anyone want to opine about the wait event to use? I was thinking
PG_WAIT_TIMEOUT or WAIT_EVENT_PG_SLEEP ...

I'm not sure if we should back-patch this, but I think if you do you
should just add a wait event, rather than using a generic one.

I would not back-patch that either, as this is an improvement of the
current state. I agree that this had better introduce a new wait
event. Even if this stuff gets backpatched, you won't introduce an
ABI incompatibility with a new event as long as you add the new event
at the end of the existing enum lists, but let's keep the wait events
ordered on HEAD.
--
Michael

#20Stephen Frost
sfrost@snowman.net
In reply to: Michael Paquier (#19)
Re: Autovacuum worker doesn't immediately exit on postmaster death

Greetings,

* Michael Paquier (michael@paquier.xyz) wrote:

On Mon, Mar 22, 2021 at 04:07:12PM -0400, Robert Haas wrote:

On Mon, Mar 22, 2021 at 1:48 PM Stephen Frost <sfrost@snowman.net> wrote:

Thanks for that. Attached is just a rebased version with a commit
message added. If there aren't any other concerns, I'll commit this in
the next few days and back-patch it. When it comes to 12 and older,
does anyone want to opine about the wait event to use? I was thinking
PG_WAIT_TIMEOUT or WAIT_EVENT_PG_SLEEP ...

I'm not sure if we should back-patch this, but I think if you do you
should just add a wait event, rather than using a generic one.

I would not back-patch that either, as this is an improvement of the
current state. I agree that this had better introduce a new wait
event. Even if this stuff gets backpatched, you won't introduce an
ABI incompatibility with a new event as long as you add the new event
at the end of the existing enum lists, but let's keep the wait events
ordered on HEAD.

Adding CFIs (CHECK_FOR_INTERRUPTS() calls) in places that really should
have them is something we have certainly back-patched in the past, and
that's just 'an improvement of the current state' too, so I don't quite
follow the argument being made here that this shouldn't be back-patched.

I don't have any problem with adding into the older releases, at the end
of the existing lists, the same wait event that exists in 13+ for this
already.

Any other thoughts on this, particularly about back-patching or not..?

Thanks,

Stephen

#21Stephen Frost
sfrost@snowman.net
In reply to: Stephen Frost (#20)
Re: Autovacuum worker doesn't immediately exit on postmaster death

Greetings,

* Stephen Frost (sfrost@snowman.net) wrote:

Any other thoughts on this, particularly about back-patching or not..?

We seem to be at a bit of an impasse on this regarding back-patching,
which seems unfortunate to me, but without someone else commenting it
seems like it's stalled.

I'll go ahead and push the change to HEAD soon, as there doesn't seem to
be any contention regarding that.

Thanks,

Stephen

#22Stephen Frost
sfrost@snowman.net
In reply to: Stephen Frost (#21)
Re: Autovacuum worker doesn't immediately exit on postmaster death

Greetings,

* Stephen Frost (sfrost@snowman.net) wrote:

We seem to be at a bit of an impasse on this regarding back-patching,
which seems unfortunate to me, but without someone else commenting it
seems like it's stalled.

I'll go ahead and push the change to HEAD soon, as there doesn't seem to
be any contention regarding that.

Done.

Thanks!

Stephen