Postgres service stops when I kill client backend on Windows

Started by Dmitry Vasilyev, over 10 years ago, 43 messages
#1 Dmitry Vasilyev
d.vasilyev@postgrespro.ru

I started a PostgreSQL server on Windows, and when I killed a client
backend process with taskkill, the service stopped:

postgres=# select pg_backend_pid();
 pg_backend_pid
----------------
           1976

postgres=# \! taskkill /pid 1976 /f
SUCCESS: The process with PID 1976 has been terminated.
postgres=# select 1;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

If I kill a backend process on Linux, the service does not fail. So
what’s the problem? Why does PostgreSQL behave so strangely on Windows?

------
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2 Charles Clavadetscher
clavadetscher@swisspug.org
In reply to: Dmitry Vasilyev (#1)
Re: Postgres service stops when I kill client backend on Windows

Hello Dmitry

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Dmitry Vasilyev
Sent: Freitag, 9. Oktober 2015 11:52
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] Postgres service stops when I kill client backend on Windows

I’ve started PostgreSQL server on Windows and then I kill client
backend’s process by taskkill the service was stopped:

postgres=# select pg_backend_pid();
pg_backend_pid
----------------
1976

postgres=# \! taskkill /pid 1976 /f
SUCCESS: The process with PID 1976 has been terminated.
postgres=# select 1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

If I kill backend’s process on Linux then service not failing. So
what’s the problem? Why PostgreSQL is so strange on Windows?

I can't say what happens on Windows, but I also don't understand why you want to kill the session you are in.
Besides that, why don't you use pg_terminate_backend?

db=> select pg_backend_pid();
pg_backend_pid
----------------
8808
(1 row)

db=> select pg_terminate_backend(8808);
FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
db=> select pg_backend_pid();
pg_backend_pid
----------------
8500
(1 row)

Regards
Charles


#3 Dmitry Vasilyev
d.vasilyev@postgrespro.ru
In reply to: Charles Clavadetscher (#2)
Re: Postgres service stops when I kill client backend on Windows

This code stopped the server too:

postgres=# do $$ unpack p,1x8 $$ language plperlu;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>


#4 Robert Haas
robertmhaas@gmail.com
In reply to: Dmitry Vasilyev (#1)
Re: Postgres service stops when I kill client backend on Windows

On Fri, Oct 9, 2015 at 5:52 AM, Dmitry Vasilyev
<d.vasilyev@postgrespro.ru> wrote:

I’ve started PostgreSQL server on Windows and then I kill client
backend’s process by taskkill the service was stopped:

postgres=# select pg_backend_pid();
pg_backend_pid
----------------
1976

postgres=# \! taskkill /pid 1976 /f
SUCCESS: The process with PID 1976 has been terminated.
postgres=# select 1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

If I kill backend’s process on Linux then service not failing. So
what’s the problem? Why PostgreSQL is so strange on Windows?

Hmm. I'd expect that to cause a crash-and-restart cycle, just like a
SIGQUIT would cause a crash-and-restart cycle on Linux. But I would
expect the server to end up running again at the end, not stopped.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#4)
Re: Postgres service stops when I kill client backend on Windows

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Oct 9, 2015 at 5:52 AM, Dmitry Vasilyev

postgres=# select 1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Hmm. I'd expect that to cause a crash-and-restart cycle, just like a
SIGQUIT would cause a crash-and-restart cycle on Linux. But I would
expect the server to end up running again at the end, not stopped.

It *is* a crash and restart cycle, or at least no evidence to the
contrary has been provided.

Whether psql's attempt to do an immediate reconnect succeeds or not is
very strongly timing-dependent, on both Linux and Windows. It's easy
for it to attempt the reconnection before crash recovery is complete,
and then you get the above symptom. Personally I get a "Failed" result
more often than not, regardless of platform.

regards, tom lane


#6 Dmitry Vasilyev
d.vasilyev@postgrespro.ru
In reply to: Tom Lane (#5)
Re: Postgres service stops when I kill client backend on Windows

As I wrote, the service stopped. This is repeatable: run the command
'psql -c "do $$ unpack p,1x8 $$ language plperlu;"'
and afterwards the Windows service will be stopped.


#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dmitry Vasilyev (#6)
Re: Postgres service stops when I kill client backend on Windows

Dmitry Vasilyev <d.vasilyev@postgrespro.ru> writes:

I have written, what service stopped. This action is repeatable.
You can run command 'psql -c "do $$ unpack p,1x8 $$ language plperlu;"'
and after this windows service will stop.

Well, (a) that probably means that your plperl installation is broken,
and (b) you still haven't convinced me that you had an actual service
stop, and not just that the recovery time was longer than psql would
wait before retrying the connection. Can you start a fresh psql
session after waiting a few seconds?

regards, tom lane
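
One way to run the check suggested above is to probe the server port several times with a pause between attempts, instead of relying on psql's single immediate reconnect. The following is a minimal sketch, not part of the original message. The helper names `tcp_probe` and `wait_until_up` are hypothetical, and it assumes the server listens on 127.0.0.1 at PostgreSQL's default port 5432. A successful TCP connect only shows that something is listening on the port; it does not prove that crash recovery has finished.

```python
import socket
import time


def tcp_probe(host, port, timeout_s=1.0):
    """Return True if something accepts a TCP connection on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def wait_until_up(probe, attempts=10, delay_s=1.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, sleeping between failures.

    Returns the 1-based number of the attempt that succeeded,
    or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay_s)
    return None


if __name__ == "__main__":
    # Assumed host and port; adjust to the installation being tested.
    result = wait_until_up(lambda: tcp_probe("127.0.0.1", 5432))
    print("listening after attempt", result) if result else print("still down")
```

If every attempt over a reasonable window is refused, that points to a stopped postmaster rather than a slow recovery.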


#8 Dmitry Vasilyev
d.vasilyev@postgrespro.ru
In reply to: Tom Lane (#7)
Re: Postgres service stops when I kill client backend on Windows

Hello Tom!

On Sat, 2015-10-10 at 10:55 -0500, Tom Lane wrote:

Dmitry Vasilyev <d.vasilyev@postgrespro.ru> writes:

I have written, what service stopped. This action is repeatable.
You can run command 'psql -c "do $$ unpack p,1x8 $$ language
plperlu;"'
and after this windows service will stop.

Well, (a) that probably means that your plperl installation is
broken,
and (b) you still haven't convinced me that you had an actual service
stop, and not just that the recovery time was longer than psql would
wait before retrying the connection.  Can you start a fresh psql
session after waiting a few seconds?

regards, tom lane

This is a known bug in Perl:

perl -e ' unpack p,1x8'
Segmentation fault (core dumped)

The Postgres backend crashed, and the Windows service stopped:

C:\Users\vadv>sc query postgresql-X64-9.4 | findstr /i "STATE"
        STATE              : 1  STOPPED
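
To repeat this check from a script, the STATE line can be pulled out of the `sc query` text. This is a rough sketch, not something from the original message. The function name `parse_service_state` is hypothetical, and it assumes English-language `sc` output in the layout shown above.

```python
import re

# Matches the STATE line of `sc query` output, e.g. "STATE : 1  STOPPED".
_STATE_RE = re.compile(r"^\s*STATE\s*:\s*(\d+)\s+(\w+)", re.MULTILINE)


def parse_service_state(sc_output):
    """Return (code, name) from `sc query` text, or None if no STATE line is found."""
    match = _STATE_RE.search(sc_output)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```

The text passed in would be whatever `sc query <service name>` printed, captured by any means.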

You can see the log below:

2015-10-10 19:00:13 AST LOG:  database system was interrupted; last
known up at 2015-10-10 18:54:47 AST
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 2 to 2
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 2 to 2
2015-10-10 19:00:13 AST DEBUG:  checkpoint record is at 0/16A01C8
2015-10-10 19:00:13 AST DEBUG:  redo record is at 0/16A01C8; shutdown
TRUE
2015-10-10 19:00:13 AST DEBUG:  next transaction ID: 0/678; next OID:
16393
2015-10-10 19:00:13 AST DEBUG:  next MultiXactId: 1; next
MultiXactOffset: 0
2015-10-10 19:00:13 AST DEBUG:  oldest unfrozen transaction ID: 667, in
database 1
2015-10-10 19:00:13 AST DEBUG:  oldest MultiXactId: 1, in database 1
2015-10-10 19:00:13 AST DEBUG:  transaction ID wrap limit is
2147484314, limited by database with OID 1
2015-10-10 19:00:13 AST DEBUG:  MultiXactId wrap limit is 2147483648,
limited by database with OID 1
2015-10-10 19:00:13 AST DEBUG:  starting up replication slots
2015-10-10 19:00:13 AST LOG:  database system was not properly shut
down; automatic recovery in progress
2015-10-10 19:00:13 AST DEBUG:  resetting unlogged relations: cleanup 1
init 0
2015-10-10 19:00:13 AST LOG:  redo starts at 0/16A0230
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/12057; tid 0/3
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/12059; tid 1/3
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/12060; tid 1/2
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/11979; tid 31/63
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/11984; tid 16/34
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/11889; tid 67/5
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/11894; tid 9/132
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/11895; tid 18/81
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/12003; tid 48/62
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/12005; tid 28/16
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/12006; tid 27/24
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/11950; tid 0/5
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/11952; tid 1/3
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/11953; tid 1/5
2015-10-10 19:00:13 AST LOG:  record with zero length at 0/16AB308
2015-10-10 19:00:13 AST LOG:  redo done at 0/16AB2D8
2015-10-10 19:00:13 AST LOG:  last completed transaction was at log
time 2015-10-10 18:55:09.464+03
2015-10-10 19:00:13 AST DEBUG:  resetting unlogged relations: cleanup 0
init 1
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  performing replication slot checkpoint
2015-10-10 19:00:13 AST DEBUG:  attempting to remove WAL segments older
than log file 000000000000000000000000
2015-10-10 19:00:13 AST DEBUG:  SlruScanDirectory invoking callback on
pg_multixact/offsets/0000
2015-10-10 19:00:13 AST DEBUG:  SlruScanDirectory invoking callback on
pg_multixact/members/0000
2015-10-10 19:00:13 AST DEBUG:  SlruScanDirectory invoking callback on
pg_multixact/offsets/0000
2015-10-10 19:00:13 AST DEBUG:  oldest MultiXactId member is at offset
0
2015-10-10 19:00:13 AST LOG:  MultiXact member wraparound protections
are now enabled
2015-10-10 19:00:13 AST DEBUG:  MultiXact member stop limit is now
4294914944 based on MultiXact 1
2015-10-10 19:00:13 AST DEBUG:  shmem_exit(0): 1 before_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG:  shmem_exit(0): 3 on_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG:  proc_exit(0): 2 callbacks to make
2015-10-10 19:00:13 AST DEBUG:  exit(0)
2015-10-10 19:00:13 AST DEBUG:  shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG:  shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG:  proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:13 AST DEBUG:  reaping dead processes
2015-10-10 19:00:13 AST LOG:  database system is ready to accept
connections
2015-10-10 19:00:13 AST LOG:  autovacuum launcher started
2015-10-10 19:00:13 AST DEBUG:  InitPostgres
2015-10-10 19:00:13 AST DEBUG:  my backend ID is 1
2015-10-10 19:00:13 AST DEBUG:  checkpointer updated shared memory
configuration values
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 2 to 2
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 2 to 2
2015-10-10 19:00:13 AST DEBUG:  forked new backend, pid=3432
socket=1288
2015-10-10 19:00:13 AST DEBUG:  StartTransaction
2015-10-10 19:00:13 AST DEBUG:  name: unnamed;
blockState:       DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0,
nestlvl: 1, children:
2015-10-10 19:00:13 AST DEBUG:  CommitTransaction
2015-10-10 19:00:13 AST DEBUG:  name: unnamed;
blockState:       STARTED; state: INPROGR, xid/subid/cid: 0/1/0,
nestlvl: 1, children:
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 2 to 2
2015-10-10 19:00:13 AST DEBUG:  received inquiry for database 0
2015-10-10 19:00:13 AST DEBUG:  writing stats file
"pg_stat_tmp/global.stat"
2015-10-10 19:00:13 AST DEBUG:  postgres child[3432]: starting with (
2015-10-10 19:00:13 AST DEBUG:   postgres
2015-10-10 19:00:13 AST DEBUG:  )
2015-10-10 19:00:13 AST DEBUG:  InitPostgres
2015-10-10 19:00:13 AST DEBUG:  my backend ID is 2
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 2 to 2
2015-10-10 19:00:13 AST DEBUG:  StartTransaction
2015-10-10 19:00:13 AST DEBUG:  name: unnamed;
blockState:       DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0,
nestlvl: 1, children:
2015-10-10 19:00:13 AST FATAL:  role "WIN-TDLBFCTPHT0$" does not exist
2015-10-10 19:00:13 AST DEBUG:  shmem_exit(1): 1 before_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG:  shmem_exit(1): 6 on_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG:  proc_exit(1): 3 callbacks to make
2015-10-10 19:00:13 AST DEBUG:  exit(1)
2015-10-10 19:00:13 AST DEBUG:  shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG:  shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG:  proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:13 AST DEBUG:  reaping dead processes
2015-10-10 19:00:13 AST DEBUG:  server process (PID 3432) exited with
exit code 1
2015-10-10 19:00:16 AST DEBUG:  forked new backend, pid=148 socket=1288
2015-10-10 19:00:16 AST DEBUG:  postgres child[148]: starting with (
2015-10-10 19:00:16 AST DEBUG:   postgres
2015-10-10 19:00:16 AST DEBUG:  )
2015-10-10 19:00:16 AST DEBUG:  InitPostgres
2015-10-10 19:00:16 AST DEBUG:  my backend ID is 2
2015-10-10 19:00:16 AST DEBUG:  mapped win32 error code 2 to 2
2015-10-10 19:00:16 AST DEBUG:  StartTransaction
2015-10-10 19:00:16 AST DEBUG:  name: unnamed;
blockState:       DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0,
nestlvl: 1, children:
2015-10-10 19:00:16 AST FATAL:  role "vadv" does not exist
2015-10-10 19:00:16 AST DEBUG:  shmem_exit(1): 1 before_shmem_exit
callbacks to make
2015-10-10 19:00:16 AST DEBUG:  shmem_exit(1): 6 on_shmem_exit
callbacks to make
2015-10-10 19:00:16 AST DEBUG:  proc_exit(1): 3 callbacks to make
2015-10-10 19:00:16 AST DEBUG:  exit(1)
2015-10-10 19:00:16 AST DEBUG:  shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:16 AST DEBUG:  shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:16 AST DEBUG:  proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:16 AST DEBUG:  reaping dead processes
2015-10-10 19:00:16 AST DEBUG:  server process (PID 148) exited with
exit code 1
2015-10-10 19:00:20 AST DEBUG:  forked new backend, pid=5024
socket=1288
2015-10-10 19:00:20 AST DEBUG:  postgres child[5024]: starting with (
2015-10-10 19:00:20 AST DEBUG:   postgres
2015-10-10 19:00:20 AST DEBUG:  )
2015-10-10 19:00:20 AST DEBUG:  InitPostgres
2015-10-10 19:00:20 AST DEBUG:  my backend ID is 2
2015-10-10 19:00:20 AST DEBUG:  mapped win32 error code 2 to 2
2015-10-10 19:00:20 AST DEBUG:  StartTransaction
2015-10-10 19:00:20 AST DEBUG:  name: unnamed;
blockState:       DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0,
nestlvl: 1, children:
2015-10-10 19:00:20 AST DEBUG:  mapped win32 error code 2 to 2
2015-10-10 19:00:20 AST DEBUG:  CommitTransaction
2015-10-10 19:00:20 AST DEBUG:  name: unnamed;
blockState:       STARTED; state: INPROGR, xid/subid/cid: 0/1/0,
nestlvl: 1, children:
2015-10-10 19:00:32 AST DEBUG:  StartTransactionCommand
2015-10-10 19:00:32 AST STATEMENT:  do $$ unpack p,1x8 $$ language
plperlu;
2015-10-10 19:00:32 AST DEBUG:  StartTransaction
2015-10-10 19:00:32 AST STATEMENT:  do $$ unpack p,1x8 $$ language
plperlu;
2015-10-10 19:00:32 AST DEBUG:  name: unnamed;
blockState:       DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0,
nestlvl: 1, children:
2015-10-10 19:00:32 AST STATEMENT:  do $$ unpack p,1x8 $$ language
plperlu;
2015-10-10 19:00:32 AST DEBUG:  ProcessUtility
2015-10-10 19:00:32 AST STATEMENT:  do $$ unpack p,1x8 $$ language
plperlu;
2015-10-10 19:00:32 AST DEBUG:  reaping dead processes
2015-10-10 19:00:32 AST DEBUG:  server process (PID 5024) was
terminated by exception 0xC0000005
2015-10-10 19:00:32 AST DETAIL:  Failed process was running: do $$
unpack p,1x8 $$ language plperlu;
2015-10-10 19:00:32 AST HINT:  See C include file "ntstatus.h" for a
description of the hexadecimal value.
2015-10-10 19:00:32 AST LOG:  server process (PID 5024) was terminated
by exception 0xC0000005
2015-10-10 19:00:32 AST DETAIL:  Failed process was running: do $$
unpack p,1x8 $$ language plperlu;
2015-10-10 19:00:32 AST HINT:  See C include file "ntstatus.h" for a
description of the hexadecimal value.
2015-10-10 19:00:32 AST LOG:  terminating any other active server
processes
2015-10-10 19:00:32 AST DEBUG:  sending SIGQUIT to process 1848
2015-10-10 19:00:32 AST DEBUG:  sending SIGQUIT to process 968
2015-10-10 19:00:32 AST DEBUG:  shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG:  shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG:  sending SIGQUIT to process 1100
2015-10-10 19:00:32 AST DEBUG:  proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:32 AST DEBUG:  shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG:  shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG:  proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:32 AST DEBUG:  sending SIGQUIT to process 1856
2015-10-10 19:00:32 AST DEBUG:  shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG:  shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG:  proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:32 AST DEBUG:  sending SIGQUIT to process 1104
2015-10-10 19:00:32 AST WARNING:  terminating connection because of
crash of another server process
2015-10-10 19:00:32 AST DETAIL:  The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2015-10-10 19:00:32 AST HINT:  In a moment you should be able to
reconnect to the database and repeat your command.
2015-10-10 19:00:32 AST DEBUG:  shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG:  shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG:  proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:32 AST DEBUG:  writing stats file
"pg_stat/global.stat"
2015-10-10 19:00:32 AST DEBUG:  reaping dead processes
2015-10-10 19:00:32 AST DEBUG:  writing stats file
"pg_stat/db_12135.stat"
2015-10-10 19:00:32 AST DEBUG:  reaping dead processes
2015-10-10 19:00:32 AST DEBUG:  removing temporary stats file
"pg_stat_tmp/db_12135.stat"
2015-10-10 19:00:32 AST DEBUG:  writing stats file "pg_stat/db_0.stat"
2015-10-10 19:00:32 AST DEBUG:  reaping dead processes
2015-10-10 19:00:32 AST DEBUG:  removing temporary stats file
"pg_stat_tmp/db_0.stat"
2015-10-10 19:00:32 AST DEBUG:  reaping dead processes
2015-10-10 19:00:32 AST DEBUG:  reaping dead processes
2015-10-10 19:00:32 AST LOG:  all server processes terminated;
reinitializing
2015-10-10 19:00:32 AST DEBUG:  shmem_exit(1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG:  shmem_exit(1): 3 on_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG:  cleaning up dynamic shared memory
control segment with ID 851401618
2015-10-10 19:00:32 AST DEBUG:  invoking
IpcMemoryCreate(size=290095104)
2015-10-10 19:00:42 AST FATAL:  pre-existing shared memory block is
still in use
2015-10-10 19:00:42 AST HINT:  Check if there are any old server
processes still running, and terminate them.
2015-10-10 19:00:42 AST DEBUG:  shmem_exit(1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:42 AST DEBUG:  shmem_exit(1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:42 AST DEBUG:  proc_exit(1): 2 callbacks to make
2015-10-10 19:00:42 AST DEBUG:  exit(1)
2015-10-10 19:00:42 AST DEBUG:  shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:42 AST DEBUG:  shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:42 AST DEBUG:  proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:42 AST DEBUG:  logger shutting down
2015-10-10 19:00:42 AST DEBUG:  shmem_exit(0): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:42 AST DEBUG:  shmem_exit(0): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:42 AST DEBUG:  proc_exit(0): 0 callbacks to make
2015-10-10 19:00:42 AST DEBUG:  exit(0)
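
The log above is wrapped by the mail client, which makes it hard to search. The sketch below is an illustration only, not a PostgreSQL tool. It joins continuation lines back onto their entries, then looks at what follows the last "reinitializing" message to tell a completed restart from a postmaster that gave up. The function names are hypothetical, and the matching relies on the exact message texts seen in this log.

```python
import re

# A log entry starts with a timestamp such as "2015-10-10 19:00:13 AST LOG:".
_ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \S+ [A-Z]+:")

REINIT = "all server processes terminated; reinitializing"
READY = "database system is ready to accept connections"
SHMEM_BUSY = "pre-existing shared memory block is still in use"


def unwrap_log(text):
    """Join mail-wrapped continuation lines back onto the entry they belong to."""
    entries = []
    for line in text.splitlines():
        if _ENTRY_START.match(line) or not entries:
            entries.append(line.rstrip())
        elif line.strip():
            entries[-1] += " " + line.strip()
    return entries


def classify_crash_cycle(entries):
    """Classify what happened after the last reinitialization attempt.

    Returns 'no_crash', 'restarted', 'failed_reinit', or 'incomplete'.
    """
    reinit_at = None
    for i, entry in enumerate(entries):
        if REINIT in entry:
            reinit_at = i
    if reinit_at is None:
        return "no_crash"
    for entry in entries[reinit_at + 1:]:
        if READY in entry:
            return "restarted"
        if SHMEM_BUSY in entry:
            return "failed_reinit"
    return "incomplete"
```

The exception code 0xC0000005 reported for the crashed backend is STATUS_ACCESS_VIOLATION, the Windows counterpart of the segmentation fault that the perl one-liner produces on Linux.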


#9 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Dmitry Vasilyev (#8)
Re: Postgres service stops when I kill client backend on Windows

2015-10-10 18:04 GMT+02:00 Dmitry Vasilyev <d.vasilyev@postgrespro.ru>:

Hello Tom!

On Sat, 2015-10-10 at 10:55 -0500, Tom Lane wrote:

Dmitry Vasilyev <d.vasilyev@postgrespro.ru> writes:

I have written, what service stopped. This action is repeatable.
You can run command 'psql -c "do $$ unpack p,1x8 $$ language
plperlu;"'
and after this windows service will stop.

Well, (a) that probably means that your plperl installation is
broken,
and (b) you still haven't convinced me that you had an actual service
stop, and not just that the recovery time was longer than psql would
wait before retrying the connection. Can you start a fresh psql
session after waiting a few seconds?

regards, tom lane

This is knowned bug of perl:

perl -e ' unpack p,1x8'
Segmentation fault (core dumped)

So this is expected behavior. After any unexpected client failure, the
server is restarted.

Regards

Pavel

backend of postgres is crashed, and windows service is stopped:

C:\Users\vadv>sc query postgresql-X64-9.4 | findstr /i "STATE"
S
TATE : 1 STOPPED

The log you can see bellow:

2015-10-10 19:00:13 AST LOG: database system was interrupted; last
known up at 2015-10-10 18:54:47 AST
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 2 to 2
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 2 to 2
2015-10-10 19:00:13 AST DEBUG: checkpoint record is at 0/16A01C8
2015-10-10 19:00:13 AST DEBUG: redo record is at 0/16A01C8; shutdown
TRUE
2015-10-10 19:00:13 AST DEBUG: next transaction ID: 0/678; next OID:
16393
2015-10-10 19:00:13 AST DEBUG: next MultiXactId: 1; next
MultiXactOffset: 0
2015-10-10 19:00:13 AST DEBUG: oldest unfrozen transaction ID: 667, in
database 1
2015-10-10 19:00:13 AST DEBUG: oldest MultiXactId: 1, in database 1
2015-10-10 19:00:13 AST DEBUG: transaction ID wrap limit is
2147484314, limited by database with OID 1
2015-10-10 19:00:13 AST DEBUG: MultiXactId wrap limit is 2147483648,
limited by database with OID 1
2015-10-10 19:00:13 AST DEBUG: starting up replication slots
2015-10-10 19:00:13 AST LOG: database system was not properly shut
down; automatic recovery in progress
2015-10-10 19:00:13 AST DEBUG: resetting unlogged relations: cleanup 1
init 0
2015-10-10 19:00:13 AST LOG: redo starts at 0/16A0230
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT: xlog redo insert: rel
1663/12135/12057; tid 0/3
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT: xlog redo insert: rel
1663/12135/12059; tid 1/3
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT: xlog redo insert: rel
1663/12135/12060; tid 1/2
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT: xlog redo insert: rel
1663/12135/11979; tid 31/63
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT: xlog redo insert: rel
1663/12135/11984; tid 16/34
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT: xlog redo insert: rel
1663/12135/11889; tid 67/5
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT: xlog redo insert: rel
1663/12135/11894; tid 9/132
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT: xlog redo insert: rel
1663/12135/11895; tid 18/81
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT: xlog redo insert: rel
1663/12135/12003; tid 48/62
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT: xlog redo insert: rel
1663/12135/12005; tid 28/16
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT: xlog redo insert: rel
1663/12135/12006; tid 27/24
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT: xlog redo insert: rel
1663/12135/11950; tid 0/5
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT: xlog redo insert: rel
1663/12135/11952; tid 1/3
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT: xlog redo insert: rel
1663/12135/11953; tid 1/5
2015-10-10 19:00:13 AST LOG: record with zero length at 0/16AB308
2015-10-10 19:00:13 AST LOG: redo done at 0/16AB2D8
2015-10-10 19:00:13 AST LOG: last completed transaction was at log
time 2015-10-10 18:55:09.464+03
2015-10-10 19:00:13 AST DEBUG: resetting unlogged relations: cleanup 0
init 1
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG: performing replication slot checkpoint
2015-10-10 19:00:13 AST DEBUG: attempting to remove WAL segments older
than log file 000000000000000000000000
2015-10-10 19:00:13 AST DEBUG: SlruScanDirectory invoking callback on
pg_multixact/offsets/0000
2015-10-10 19:00:13 AST DEBUG: SlruScanDirectory invoking callback on
pg_multixact/members/0000
2015-10-10 19:00:13 AST DEBUG: SlruScanDirectory invoking callback on
pg_multixact/offsets/0000
2015-10-10 19:00:13 AST DEBUG: oldest MultiXactId member is at offset
0
2015-10-10 19:00:13 AST LOG: MultiXact member wraparound protections
are now enabled
2015-10-10 19:00:13 AST DEBUG: MultiXact member stop limit is now
4294914944 based on MultiXact 1
2015-10-10 19:00:13 AST DEBUG: shmem_exit(0): 1 before_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG: shmem_exit(0): 3 on_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG: proc_exit(0): 2 callbacks to make
2015-10-10 19:00:13 AST DEBUG: exit(0)
2015-10-10 19:00:13 AST DEBUG: shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG: shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG: proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:13 AST DEBUG: reaping dead processes
2015-10-10 19:00:13 AST LOG: database system is ready to accept
connections
2015-10-10 19:00:13 AST LOG: autovacuum launcher started
2015-10-10 19:00:13 AST DEBUG: InitPostgres
2015-10-10 19:00:13 AST DEBUG: my backend ID is 1
2015-10-10 19:00:13 AST DEBUG: checkpointer updated shared memory
configuration values
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 2 to 2
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 2 to 2
2015-10-10 19:00:13 AST DEBUG: forked new backend, pid=3432
socket=1288
2015-10-10 19:00:13 AST DEBUG: StartTransaction
2015-10-10 19:00:13 AST DEBUG: name: unnamed;
blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0,
nestlvl: 1, children:
2015-10-10 19:00:13 AST DEBUG: CommitTransaction
2015-10-10 19:00:13 AST DEBUG: name: unnamed;
blockState: STARTED; state: INPROGR, xid/subid/cid: 0/1/0,
nestlvl: 1, children:
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 2 to 2
2015-10-10 19:00:13 AST DEBUG: received inquiry for database 0
2015-10-10 19:00:13 AST DEBUG: writing stats file
"pg_stat_tmp/global.stat"
2015-10-10 19:00:13 AST DEBUG: postgres child[3432]: starting with (
2015-10-10 19:00:13 AST DEBUG: postgres
2015-10-10 19:00:13 AST DEBUG: )
2015-10-10 19:00:13 AST DEBUG: InitPostgres
2015-10-10 19:00:13 AST DEBUG: my backend ID is 2
2015-10-10 19:00:13 AST DEBUG: mapped win32 error code 2 to 2
2015-10-10 19:00:13 AST DEBUG: StartTransaction
2015-10-10 19:00:13 AST DEBUG: name: unnamed;
blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0,
nestlvl: 1, children:
2015-10-10 19:00:13 AST FATAL: role "WIN-TDLBFCTPHT0$" does not exist
2015-10-10 19:00:13 AST DEBUG: shmem_exit(1): 1 before_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG: shmem_exit(1): 6 on_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG: proc_exit(1): 3 callbacks to make
2015-10-10 19:00:13 AST DEBUG: exit(1)
2015-10-10 19:00:13 AST DEBUG: shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG: shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:13 AST DEBUG: proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:13 AST DEBUG: reaping dead processes
2015-10-10 19:00:13 AST DEBUG: server process (PID 3432) exited with
exit code 1
2015-10-10 19:00:16 AST DEBUG: forked new backend, pid=148 socket=1288
2015-10-10 19:00:16 AST DEBUG: postgres child[148]: starting with (
2015-10-10 19:00:16 AST DEBUG: postgres
2015-10-10 19:00:16 AST DEBUG: )
2015-10-10 19:00:16 AST DEBUG: InitPostgres
2015-10-10 19:00:16 AST DEBUG: my backend ID is 2
2015-10-10 19:00:16 AST DEBUG: mapped win32 error code 2 to 2
2015-10-10 19:00:16 AST DEBUG: StartTransaction
2015-10-10 19:00:16 AST DEBUG: name: unnamed;
blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0,
nestlvl: 1, children:
2015-10-10 19:00:16 AST FATAL: role "vadv" does not exist
2015-10-10 19:00:16 AST DEBUG: shmem_exit(1): 1 before_shmem_exit
callbacks to make
2015-10-10 19:00:16 AST DEBUG: shmem_exit(1): 6 on_shmem_exit
callbacks to make
2015-10-10 19:00:16 AST DEBUG: proc_exit(1): 3 callbacks to make
2015-10-10 19:00:16 AST DEBUG: exit(1)
2015-10-10 19:00:16 AST DEBUG: shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:16 AST DEBUG: shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:16 AST DEBUG: proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:16 AST DEBUG: reaping dead processes
2015-10-10 19:00:16 AST DEBUG: server process (PID 148) exited with
exit code 1
2015-10-10 19:00:20 AST DEBUG: forked new backend, pid=5024
socket=1288
2015-10-10 19:00:20 AST DEBUG: postgres child[5024]: starting with (
2015-10-10 19:00:20 AST DEBUG: postgres
2015-10-10 19:00:20 AST DEBUG: )
2015-10-10 19:00:20 AST DEBUG: InitPostgres
2015-10-10 19:00:20 AST DEBUG: my backend ID is 2
2015-10-10 19:00:20 AST DEBUG: mapped win32 error code 2 to 2
2015-10-10 19:00:20 AST DEBUG: StartTransaction
2015-10-10 19:00:20 AST DEBUG: name: unnamed;
blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0,
nestlvl: 1, children:
2015-10-10 19:00:20 AST DEBUG: mapped win32 error code 2 to 2
2015-10-10 19:00:20 AST DEBUG: CommitTransaction
2015-10-10 19:00:20 AST DEBUG: name: unnamed;
blockState: STARTED; state: INPROGR, xid/subid/cid: 0/1/0,
nestlvl: 1, children:
2015-10-10 19:00:32 AST DEBUG: StartTransactionCommand
2015-10-10 19:00:32 AST STATEMENT: do $$ unpack p,1x8 $$ language
plperlu;
2015-10-10 19:00:32 AST DEBUG: StartTransaction
2015-10-10 19:00:32 AST STATEMENT: do $$ unpack p,1x8 $$ language
plperlu;
2015-10-10 19:00:32 AST DEBUG: name: unnamed;
blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0,
nestlvl: 1, children:
2015-10-10 19:00:32 AST STATEMENT: do $$ unpack p,1x8 $$ language
plperlu;
2015-10-10 19:00:32 AST DEBUG: ProcessUtility
2015-10-10 19:00:32 AST STATEMENT: do $$ unpack p,1x8 $$ language
plperlu;
2015-10-10 19:00:32 AST DEBUG: reaping dead processes
2015-10-10 19:00:32 AST DEBUG: server process (PID 5024) was
terminated by exception 0xC0000005
2015-10-10 19:00:32 AST DETAIL: Failed process was running: do $$
unpack p,1x8 $$ language plperlu;
2015-10-10 19:00:32 AST HINT: See C include file "ntstatus.h" for a
description of the hexadecimal value.
2015-10-10 19:00:32 AST LOG: server process (PID 5024) was terminated
by exception 0xC0000005
2015-10-10 19:00:32 AST DETAIL: Failed process was running: do $$
unpack p,1x8 $$ language plperlu;
2015-10-10 19:00:32 AST HINT: See C include file "ntstatus.h" for a
description of the hexadecimal value.
2015-10-10 19:00:32 AST LOG: terminating any other active server
processes
2015-10-10 19:00:32 AST DEBUG: sending SIGQUIT to process 1848
2015-10-10 19:00:32 AST DEBUG: sending SIGQUIT to process 968
2015-10-10 19:00:32 AST DEBUG: shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG: shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG: sending SIGQUIT to process 1100
2015-10-10 19:00:32 AST DEBUG: proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:32 AST DEBUG: shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG: shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG: proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:32 AST DEBUG: sending SIGQUIT to process 1856
2015-10-10 19:00:32 AST DEBUG: shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG: shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG: proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:32 AST DEBUG: sending SIGQUIT to process 1104
2015-10-10 19:00:32 AST WARNING: terminating connection because of
crash of another server process
2015-10-10 19:00:32 AST DETAIL: The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2015-10-10 19:00:32 AST HINT: In a moment you should be able to
reconnect to the database and repeat your command.
2015-10-10 19:00:32 AST DEBUG: shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG: shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG: proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:32 AST DEBUG: writing stats file
"pg_stat/global.stat"
2015-10-10 19:00:32 AST DEBUG: reaping dead processes
2015-10-10 19:00:32 AST DEBUG: writing stats file
"pg_stat/db_12135.stat"
2015-10-10 19:00:32 AST DEBUG: reaping dead processes
2015-10-10 19:00:32 AST DEBUG: removing temporary stats file
"pg_stat_tmp/db_12135.stat"
2015-10-10 19:00:32 AST DEBUG: writing stats file "pg_stat/db_0.stat"
2015-10-10 19:00:32 AST DEBUG: reaping dead processes
2015-10-10 19:00:32 AST DEBUG: removing temporary stats file
"pg_stat_tmp/db_0.stat"
2015-10-10 19:00:32 AST DEBUG: reaping dead processes
2015-10-10 19:00:32 AST DEBUG: reaping dead processes
2015-10-10 19:00:32 AST LOG: all server processes terminated;
reinitializing
2015-10-10 19:00:32 AST DEBUG: shmem_exit(1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG: shmem_exit(1): 3 on_shmem_exit
callbacks to make
2015-10-10 19:00:32 AST DEBUG: cleaning up dynamic shared memory
control segment with ID 851401618
2015-10-10 19:00:32 AST DEBUG: invoking
IpcMemoryCreate(size=290095104)
2015-10-10 19:00:42 AST FATAL: pre-existing shared memory block is
still in use
2015-10-10 19:00:42 AST HINT: Check if there are any old server
processes still running, and terminate them.
2015-10-10 19:00:42 AST DEBUG: shmem_exit(1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:42 AST DEBUG: shmem_exit(1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:42 AST DEBUG: proc_exit(1): 2 callbacks to make
2015-10-10 19:00:42 AST DEBUG: exit(1)
2015-10-10 19:00:42 AST DEBUG: shmem_exit(-1): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:42 AST DEBUG: shmem_exit(-1): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:42 AST DEBUG: proc_exit(-1): 0 callbacks to make
2015-10-10 19:00:42 AST DEBUG: logger shutting down
2015-10-10 19:00:42 AST DEBUG: shmem_exit(0): 0 before_shmem_exit
callbacks to make
2015-10-10 19:00:42 AST DEBUG: shmem_exit(0): 0 on_shmem_exit
callbacks to make
2015-10-10 19:00:42 AST DEBUG: proc_exit(0): 0 callbacks to make
2015-10-10 19:00:42 AST DEBUG: exit(0)

#10Ali Akbar
the.apaan@gmail.com
In reply to: Pavel Stehule (#9)
Re: Postgres service stops when I kill client backend on Windows

Greetings,

2015-10-11 0:18 GMT+07:00 Pavel Stehule <pavel.stehule@gmail.com>:

2015-10-10 18:04 GMT+02:00 Dmitry Vasilyev <d.vasilyev@postgrespro.ru>:

On Sat, 2015-10-10 at 10:55 -0500, Tom Lane wrote:

Dmitry Vasilyev <d.vasilyev@postgrespro.ru> writes:

I have written, what service stopped. This action is repeatable.
You can run command 'psql -c "do $$ unpack p,1x8 $$ language
plperlu;"'
and after this windows service will stop.

so it is expected behave. After any unexpected client fails, the server is
restarted

I can confirm this too. On Linux (I use Fedora 22), this is what happens
when a server process is killed:

=== 1. before:
$ sudo systemctl status postgresql.service
postgresql.service - PostgreSQL database server
Loaded: loaded (/usr/lib/systemd/system/postgresql.service; enabled)
Active: active (running) since Jum 2015-10-09 16:25:43 WIB; 1 day 14h ago
Process: 778 ExecStart=/usr/bin/pg_ctl start -D ${PGDATA} -s -o -p
${PGPORT} -w -t 300 (code=exited, status=0/SUCCESS)
Process: 747 ExecStartPre=/usr/bin/postgresql-check-db-dir ${PGDATA}
(code=exited, status=0/SUCCESS)
Main PID: 783 (postgres)
CGroup: /system.slice/postgresql.service
├─ 783 /usr/bin/postgres -D /var/lib/pgsql/data -p 5432
├─ 812 postgres: logger process
├─ 821 postgres: checkpointer process
├─ 822 postgres: writer process
├─ 823 postgres: wal writer process
├─ 824 postgres: autovacuum launcher process
├─ 825 postgres: stats collector process
└─17181 postgres: postgres test [local] idle

=== 2. killing and attempt to reconnect:
$ sudo kill 17181

test=# select 1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

=== 3. service status after:
$ sudo systemctl status postgresql.service
postgresql.service - PostgreSQL database server
Loaded: loaded (/usr/lib/systemd/system/postgresql.service; enabled)
Active: active (running) since Jum 2015-10-09 16:25:43 WIB; 1 day 14h ago
Process: 778 ExecStart=/usr/bin/pg_ctl start -D ${PGDATA} -s -o -p
${PGPORT} -w -t 300 (code=exited, status=0/SUCCESS)
Process: 747 ExecStartPre=/usr/bin/postgresql-check-db-dir ${PGDATA}
(code=exited, status=0/SUCCESS)
Main PID: 783 (postgres)
CGroup: /system.slice/postgresql.service
├─ 783 /usr/bin/postgres -D /var/lib/pgsql/data -p 5432
├─ 812 postgres: logger process
├─ 821 postgres: checkpointer process
├─ 822 postgres: writer process
├─ 823 postgres: wal writer process
├─ 824 postgres: autovacuum launcher process
├─ 825 postgres: stats collector process
└─17422 postgres: postgres test [local] idle

===

The service status is still active (running), and a new process, 17422,
handles the client.

But this is what happens on Windows (Windows 7 32-bit, PostgreSQL 9.4):

=== 1. before:
C:\Windows\system32>sc queryex postgresql-9.4

SERVICE_NAME: postgresql-9.4
TYPE : 10 WIN32_OWN_PROCESS
STATE : 4 RUNNING
(STOPPABLE, PAUSABLE, ACCEPTS_SHUTDOWN)
WIN32_EXIT_CODE : 0 (0x0)
SERVICE_EXIT_CODE : 0 (0x0)
CHECKPOINT : 0x0
WAIT_HINT : 0x0
PID : 3716
FLAGS :

=== 2. killing & attempt to reconnect:
postgres=# select pg_backend_pid();
pg_backend_pid
----------------
2080
(1 row)

C:\Windows\system32>taskkill /F /PID 2080
SUCCESS: The process with PID 2080 has been terminated.

postgres=# select 1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

=== 3. service status after:
C:\Windows\system32>sc query postgresql-9.4

SERVICE_NAME: postgresql-9.4
TYPE : 10 WIN32_OWN_PROCESS
STATE : 1 STOPPED
WIN32_EXIT_CODE : 0 (0x0)
SERVICE_EXIT_CODE : 0 (0x0)
CHECKPOINT : 0x0
WAIT_HINT : 0x0

===

The client cannot reconnect. The service is dead. This is nasty, because
any client can exploit a segfault bug like the one in Perl that Dmitry
mentioned upthread and bring the PostgreSQL service down.
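
For anyone scripting this reproduction, the before/after check can be automated by reading the STATE line. A minimal sketch in Python, assuming the `sc query` output format shown above; `parse_service_state` is a hypothetical helper written for illustration, not part of PostgreSQL or Windows tooling:

```python
import re

# Output copied from step 3 above (leading whitespace is tolerated too).
SAMPLE_OUTPUT = """\
SERVICE_NAME: postgresql-9.4
TYPE : 10 WIN32_OWN_PROCESS
STATE : 1 STOPPED
WIN32_EXIT_CODE : 0 (0x0)
SERVICE_EXIT_CODE : 0 (0x0)
CHECKPOINT : 0x0
WAIT_HINT : 0x0
"""

# Matches a line such as "STATE : 4 RUNNING" and captures code and name.
STATE_LINE = re.compile(r"^\s*STATE\s*:\s*(\d+)\s+(\w+)", re.MULTILINE)


def parse_service_state(sc_output):
    """Return (state_code, state_name) from `sc query` text, or None if absent."""
    match = STATE_LINE.search(sc_output)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(parse_service_state(SAMPLE_OUTPUT))
```

Running the same function on the output captured before the kill would show whether the state changed from RUNNING to STOPPED.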

Note: killing the server process with pg_terminate_backend does not cause
this behavior. The client reconnects normally, and the service is
still running.

Regards,
Ali Akbar

#11Michael Paquier
michael.paquier@gmail.com
In reply to: Ali Akbar (#10)
Re: Postgres service stops when I kill client backend on Windows

On Sun, Oct 11, 2015 at 8:54 AM, Ali Akbar <the.apaan@gmail.com> wrote:

C:\Windows\system32>taskkill /F /PID 2080
SUCCESS: The process with PID 2080 has been terminated.

taskkill /f *forcefully* terminates the process targeted [1]. Isn't
that equivalent to a kill -9? If you headshot a backend process on
Linux with kill -9, an instance won't restart either.
[1]: http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskkill.mspx?mfr=true
--
Michael

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dmitry Vasilyev (#8)
Re: Postgres service stops when I kill client backend on Windows

Dmitry Vasilyev <d.vasilyev@postgrespro.ru> writes:

On Sat, 2015-10-10 at 10:55 -0500, Tom Lane wrote:

and (b) you still haven't convinced me that you had an actual service
stop, and not just that the recovery time was longer than psql would
wait before retrying the connection.

The log you can see bellow:
...
2015-10-10 19:00:32 AST DEBUG: cleaning up dynamic shared memory control segment with ID 851401618
2015-10-10 19:00:32 AST DEBUG: invoking IpcMemoryCreate(size=290095104)
2015-10-10 19:00:42 AST FATAL: pre-existing shared memory block is still in use
2015-10-10 19:00:42 AST HINT: Check if there are any old server processes still running, and terminate them.

Thanks for providing some detail! It's clear from the above log excerpt
that we're timing out after 10 seconds in win32_shmem.c's version of
PGSharedMemoryCreate, because CreateFileMapping is still reporting that
the old shared memory segment still exists. When we last discussed this
sort of problem in
/messages/by-id/49FA3B6F.6080906@dunslane.net
there was no evidence that such a failure could persist for longer than a
second or two. Now it seems that on your machine the failure state can
persist for at least 10 seconds, but I don't know why.
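
The 10-second figure can be read directly off the two timestamps in the excerpt. A minimal sketch in Python that does the subtraction, assuming the log line format shown above; the helper names `parse_timestamp` and `gap_seconds` are made up for illustration:

```python
from datetime import datetime

# Two lines copied from the log excerpt above.
LOG_LINES = [
    "2015-10-10 19:00:32 AST DEBUG: invoking IpcMemoryCreate(size=290095104)",
    "2015-10-10 19:00:42 AST FATAL: pre-existing shared memory block is still in use",
]


def parse_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' of a log line (zone label ignored)."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


def gap_seconds(first, second):
    """Seconds elapsed between two log lines."""
    return (parse_timestamp(second) - parse_timestamp(first)).total_seconds()


print(gap_seconds(LOG_LINES[0], LOG_LINES[1]))
```

The log has one-second resolution, so the computed gap is only accurate to about a second.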

If I had to guess, on the basis of no evidence, I'd wonder whether the
DSM code broke it; there is evidently at least one DSM segment in play
in your use-case. But that's only a guess.

regards, tom lane

#13Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#12)
Re: Postgres service stops when I kill client backend on Windows

On Sun, Oct 11, 2015 at 10:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dmitry Vasilyev <d.vasilyev@postgrespro.ru> writes:

The log you can see bellow:
...
2015-10-10 19:00:32 AST DEBUG: cleaning up dynamic shared memory control segment with ID 851401618
2015-10-10 19:00:32 AST DEBUG: invoking IpcMemoryCreate(size=290095104)
2015-10-10 19:00:42 AST FATAL: pre-existing shared memory block is still in use
2015-10-10 19:00:42 AST HINT: Check if there are any old server processes still running, and terminate them.

..

If I had to guess, on the basis of no evidence, I'd wonder whether the
DSM code broke it; there is evidently at least one DSM segment in play
in your use-case. But that's only a guess.

There is some possibility, based on the above DEBUG messages, that
DSM could cause this problem, but I think the last message ("pre-existing
shared memory block is still in use") would not be logged for DSM. We create
the new DSM segment in the code path dsm_postmaster_startup() ->
dsm_impl_op() -> dsm_impl_windows():

dsm_impl_windows()
{
..
if (op == DSM_OP_CREATE)
..
}

Basically, in this path we retry creating the DSM segment under a different
name if creation fails with an ERROR_ALREADY_EXISTS error.

To diagnose the cause of the problem, I think we can write a diagnostic
patch that does the following two things:

1. Increase the loop count below from 10 to 50 or 100 in win32_shmem.c;
alternatively, instead of raising the loop count, we can increase the sleep time.
PGSharedMemoryCreate()
{
..
for (i = 0; i < 10; i++)
..
if (GetLastError() == ERROR_ALREADY_EXISTS)
{
..
Sleep(1000);
continue;
}
..
}

2. Add more log messages in both win32_shmem.c and the DSM-related
code, which can help us narrow down the problem.

If you find this a reasonable approach to diagnosing the root cause
of the problem, I can work on writing a diagnostic patch.
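
The shape of the two proposed changes can be sketched outside PostgreSQL. What follows is a Python simulation, not the win32_shmem.c code: `try_create` stands in for the CreateFileMapping call, and `max_attempts`, `sleep_seconds`, and `make_fake_create` are hypothetical names introduced only for this illustration. As in the loop quoted above, it sleeps after every "already exists" result and logs each attempt.

```python
import time


def create_with_retry(try_create, max_attempts=10, sleep_seconds=1.0,
                      sleep=time.sleep, log=print):
    """Call try_create() until it stops reporting 'already exists'.

    try_create is a stand-in for CreateFileMapping: it returns True when the
    segment was created and False when the old one still exists.
    Returns the number of attempts used; raises RuntimeError on timeout.
    """
    for attempt in range(1, max_attempts + 1):
        if try_create():
            log(f"segment created on attempt {attempt}")
            return attempt
        # Extra logging, in the spirit of point 2 of the proposal.
        log(f"attempt {attempt}/{max_attempts}: old segment still exists")
        sleep(sleep_seconds)
    raise RuntimeError("pre-existing shared memory block is still in use")


def make_fake_create(busy_attempts):
    """Build a fake try_create that fails busy_attempts times, then succeeds."""
    state = {"calls": 0}

    def fake():
        state["calls"] += 1
        return state["calls"] > busy_attempts

    return fake


if __name__ == "__main__":
    no_wait = lambda seconds: None  # skip real sleeping in the demo
    try:
        # Old segment lingers for 12 attempts; 10 attempts are not enough.
        create_with_retry(make_fake_create(12), max_attempts=10, sleep=no_wait)
    except RuntimeError as exc:
        print("FATAL:", exc)
    # With the limit raised to 50, the same scenario succeeds.
    create_with_retry(make_fake_create(12), max_attempts=50, sleep=no_wait)
```

A larger limit only helps if the old segment eventually goes away; if something holds it open indefinitely, the loop still ends in the same FATAL, just later.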

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#14Magnus Hagander
magnus@hagander.net
In reply to: Michael Paquier (#11)
Re: Postgres service stops when I kill client backend on Windows

On Sun, Oct 11, 2015 at 5:55 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:

On Sun, Oct 11, 2015 at 8:54 AM, Ali Akbar <the.apaan@gmail.com> wrote:

C:\Windows\system32>taskkill /F /PID 2080
SUCCESS: The process with PID 2080 has been terminated.

taskkill /f *forcefully* terminates the process targeted [1]. Isn't
that equivalent to a kill -9? If you headshot a backend process on
Linux with kill -9, an instance won't restart either.
[1]:
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskkill.mspx?mfr=true

It does. If you want a "graceful kill" on Windows, you must use "pg_ctl
kill", which can send an "emulated term-signal".

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#15Andrew Dunstan
andrew@dunslane.net
In reply to: Magnus Hagander (#14)
Re: Postgres service stops when I kill client backend on Windows

On 10/11/2015 05:58 AM, Magnus Hagander wrote:

On Sun, Oct 11, 2015 at 5:55 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Sun, Oct 11, 2015 at 8:54 AM, Ali Akbar <the.apaan@gmail.com> wrote:

C:\Windows\system32>taskkill /F /PID 2080
SUCCESS: The process with PID 2080 has been terminated.

taskkill /f *forcefully* terminates the process targeted [1]. Isn't
that equivalent to a kill -9? If you headshot a backend process on
Linux with kill -9, an instance won't restart either.
[1]:
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskkill.mspx?mfr=true

It does. If you want a "gracefull kill" on Windows, you must use
"pg_ctl kill" which can send an "emulated term-signal".

Nevertheless, we'd like a hard crash of a backend other than the
postmaster not to have worse effects than on *nix, where killing a
backend even with SIGKILL doesn't halt the server:

andrew=# select pg_backend_pid();
pg_backend_pid
----------------
24359
(1 row)

andrew=# \! kill -9 24359
andrew=# select 1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
andrew=#

Amit's proposals elsewhere to increase the shmem timeout and increase
logging seem reasonable.

cheers

andrew

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#15)
Re: Postgres service stops when I kill client backend on Windows

Andrew Dunstan <andrew@dunslane.net> writes:

Amit's proposals elsewhere to increase the shmem timeout and increase
logging seem reasonable.

I'm back to the position I had in the previous thread, which is that
we don't really understand why any delay is needed here at all, and
we ought to try to remedy that lack rather than just hoping that more
and more delay will fix it. It may be that there's some proactive
measure we can take to improve matters.

I'm a bit suspicious that we may have leaked a handle to the shared
memory block someplace, for example. That would explain why this
symptom is visible now when it was not in 2009. Or maybe it's dependent
on some feature that we didn't test back then --- for instance, if
the logging collector is in use, could it have inherited a handle and
not closed it?

One thing I noticed in the CreateFileMapping docs is that Windows
apparently implements the sort of anonymous mapping we're doing as
a mapping of part of the "system paging file". I wonder if it's too
dumb (perhaps in only some releases) to realize that it doesn't
really need to flush dirty pages to disk when the last reference
to the mapping is abandoned. In that case maybe an explicit flush
request would move things along.

regards, tom lane

#17Magnus Hagander
magnus@hagander.net
In reply to: Andrew Dunstan (#15)
Re: Postgres service stops when I kill client backend on Windows

On Sun, Oct 11, 2015 at 4:32 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

On 10/11/2015 05:58 AM, Magnus Hagander wrote:

On Sun, Oct 11, 2015 at 5:55 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Sun, Oct 11, 2015 at 8:54 AM, Ali Akbar <the.apaan@gmail.com> wrote:

C:\Windows\system32>taskkill /F /PID 2080
SUCCESS: The process with PID 2080 has been terminated.

taskkill /f *forcefully* terminates the process targeted [1]. Isn't
that equivalent to a kill -9? If you headshot a backend process on
Linux with kill -9, an instance won't restart either.
[1]:

http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskkill.mspx?mfr=true

It does. If you want a "gracefull kill" on Windows, you must use "pg_ctl
kill" which can send an "emulated term-signal".

Nevertheless, we'd like a hard crash of a backend other than the
postmaster not to have worse effects than on *nix, where killing a backend
even with SIGKILL doesn't halt the server:

Oh, absolutely. I was just pointing out that something like taskkill
*should* result in a hard restart of *all* backends, and if you want to
kill off just the one you should never use it; you should instead use
pg_ctl kill. But of course, neither of those should lead to the scenario
explained here, where the server does not come back up again.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#18Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#16)
Re: Postgres service stops when I kill client backend on Windows

On Sun, Oct 11, 2015 at 5:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Amit's proposals elsewhere to increase the shmem timeout and increase
logging seem reasonable.

I'm back to the position I had in the previous thread, which is that
we don't really understand why any delay is needed here at all, and
we ought to try to remedy that lack rather than just hoping that more
and more delay will fix it. It may be that there's some proactive
measure we can take to improve matters.

I'm a bit suspicious that we may have leaked a handle to the shared
memory block someplace, for example. That would explain why this
symptom is visible now when it was not in 2009. Or maybe it's dependent
on some feature that we didn't test back then --- for instance, if
the logging collector is in use, could it have inherited a handle and
not closed it?

Even if we leaked it, it should go away when the other processes died.

What would be interesting to know is whether *any* postgres.exe process is
still running at this point, and if so, which one. It should
then be possible to use Process Explorer to figure out which process it is
(by looking at the "fake title"), and probably also which shared memory
handles it has open (even though they don't have a name, their info might
explain things).

So if someone with a reproducible case could check that as well, I think it
would be valuable information.

One thing I noticed in the CreateFileMapping docs is that Windows
apparently implements the sort of anonymous mapping we're doing as
a mapping of part of the "system paging file". I wonder if it's too
dumb (perhaps in only some releases) to realize that it doesn't
really need to flush dirty pages to disk when the last reference
to the mapping is abandoned. In that case maybe an explicit flush
request would move things along.

First of all, note that the "system paging file" is exactly the same thing
as a "swap file" or "swap partition" on Unix, in case that is unclear.

And I'm pretty sure it doesn't do that. Surely we would've seen performance
issues from that before in that case. But I don't really have any facts to
back that up :)

We do get, AIUI, the SEC_COMMIT behaviour which commits the pages initially
to make sure there is actually space for them. I don't believe that one
specifically says anything about when you close it.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#18)
Re: Postgres service stops when I kill client backend on Windows

Magnus Hagander <magnus@hagander.net> writes:

On Sun, Oct 11, 2015 at 5:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm a bit suspicious that we may have leaked a handle to the shared
memory block someplace, for example. That would explain why this
symptom is visible now when it was not in 2009. Or maybe it's dependent
on some feature that we didn't test back then --- for instance, if
the logging collector is in use, could it have inherited a handle and
not closed it?

Even if we leaked it, it should go away when the other processes died.

I'm fairly certain that we do not kill/restart the logging collector
during a database restart (because it's impossible to reproduce the
original stderr destination if we do). Not sure if any other postmaster
children are allowed to survive.
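
The worry in this exchange is that a kernel object stays alive for as long as any surviving process holds a handle to it. As a loose analogy only, the sketch below uses a POSIX pipe in Python rather than a Windows file mapping, and none of it is PostgreSQL code; `read_end_reports_eof` is a hypothetical name. The pipe's read end sees end-of-file only once every copy of the write handle is closed, so a single forgotten duplicate keeps the object "in use".

```python
import os


def read_end_reports_eof(close_leaked_handle):
    """Return True if the read end sees EOF after the owner closes its write handle."""
    read_fd, write_fd = os.pipe()
    leaked_fd = os.dup(write_fd)   # second handle to the same kernel object
    os.close(write_fd)             # the original owner lets go
    if close_leaked_handle:
        os.close(leaked_fd)
    os.set_blocking(read_fd, False)
    try:
        # b"" means EOF: no write handle is left anywhere.
        return os.read(read_fd, 1) == b""
    except BlockingIOError:
        # No data and no EOF: the leaked handle still holds the object open.
        return False
    finally:
        os.close(read_fd)
        if not close_leaked_handle:
            os.close(leaked_fd)


print(read_end_reports_eof(close_leaked_handle=True))
print(read_end_reports_eof(close_leaked_handle=False))
```

Whether the logging collector actually holds such a handle on Windows is exactly what remains to be checked; the sketch only illustrates why a surviving process would matter.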

regards, tom lane

#20Michael Paquier
michael.paquier@gmail.com
In reply to: Magnus Hagander (#14)
Re: Postgres service stops when I kill client backend on Windows

On Sun, Oct 11, 2015 at 5:55 AM, Michael Paquier wrote:

On Sun, Oct 11, 2015 at 8:54 AM, Ali Akbar <the.apaan@gmail.com> wrote:

C:\Windows\system32>taskkill /F /PID 2080
SUCCESS: The process with PID 2080 has been terminated.

taskkill /f *forcefully* terminates the process targeted [1]. Isn't
that equivalent to a kill -9? If you headshot a backend process on
Linux with kill -9, an instance won't restart either.
[1]:
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskkill.mspx?mfr=true

It does. If you want a "graceful kill" on Windows, you must use "pg_ctl
kill", which can send an emulated TERM signal.

Ah, yes. Sure. I had restart_after_crash = off on this instance...
--
Michael


#21Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#19)
1 attachment(s)
Re: Postgres service stops when I kill client backend on Windows

On Sun, Oct 11, 2015 at 9:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Sun, Oct 11, 2015 at 5:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm a bit suspicious that we may have leaked a handle to the shared
memory block someplace, for example. That would explain why this
symptom is visible now when it was not in 2009. Or maybe it's dependent
on some feature that we didn't test back then --- for instance, if
the logging collector is in use, could it have inherited a handle and
not closed it?

Even if we leaked it, it should go away when the other processes died.

I'm fairly certain that we do not kill/restart the logging collector
during a database restart (because it's impossible to reproduce the
original stderr destination if we do).

True, and it seems this is the reason for the issue we are discussing here.
The reason why this happens is that during creation of shared memory
(PGSharedMemoryCreate()), we duplicate the handle such that it
becomes inheritable by all child processes. Then during fork
(syslogger_forkexec()->postmaster_forkexec()->internal_forkexec) we
always inherit the handles, which causes the syslogger to get a copy of the
shared memory handle, which it neither uses nor closes.

I could easily reproduce the issue if the logging collector is on, and even if
we try to increase the loop count or sleep time in PGSharedMemoryCreate(),
it doesn't change the situation, as the syslogger has a valid handle to
shared memory. One way to fix this is to just close the shared memory handle
in the syslogger, as we are not going to need it; the attached patch does
this and fixes the issue for me. Another, more invasive fix, in case we want
to retain the shared memory handle for some purpose (future requirement),
could be to send some signal to the syslogger in the restart path so that it
can release the shared memory handle.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
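
The handle-inheritance problem described in this message can be modelled in a few lines of portable C. This is only a toy sketch, not PostgreSQL or Win32 code: `KernelObject`, `open_handle`, `close_handle` and `segment_released_after_restart` are hypothetical names, and the only behaviour modelled is that a kernel object such as a file mapping goes away when its last handle is closed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a Windows file-mapping object (hypothetical type). */
typedef struct KernelObject
{
	int		open_handles;	/* handles still open across all processes */
	bool	destroyed;		/* set once the last handle has been closed */
} KernelObject;

/* A process gets a handle by creating, duplicating or inheriting it. */
void
open_handle(KernelObject *obj)
{
	assert(!obj->destroyed);
	obj->open_handles++;
}

/* Closing the last handle is what allows the object to be released. */
void
close_handle(KernelObject *obj)
{
	assert(obj->open_handles > 0);
	if (--obj->open_handles == 0)
		obj->destroyed = true;
}

/*
 * Model a crash restart: the backend dies and the postmaster drops the old
 * segment, while the syslogger keeps running.  Returns true if the old
 * segment has actually been released afterwards.
 */
bool
segment_released_after_restart(bool syslogger_closes_handle)
{
	KernelObject seg = {0, false};

	open_handle(&seg);			/* postmaster */
	open_handle(&seg);			/* backend (inherited) */
	open_handle(&seg);			/* syslogger (inherited, never used) */

	if (syslogger_closes_handle)
		close_handle(&seg);		/* the fix proposed in this message */

	close_handle(&seg);			/* backend is gone */
	close_handle(&seg);			/* postmaster lets go of the old segment */

	return seg.destroyed;
}
```

In this model, calling `segment_released_after_restart(false)` leaves one handle open, which corresponds to the surviving syslogger keeping the old segment alive.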

Attachments:

fix_syslogger_dangling_shmhandle_v1.patch (application/octet-stream)
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index 34c7acf..dca5428 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -231,6 +231,17 @@ SysLoggerMain(int argc, char *argv[])
 #endif
 
 	/*
+	 * Close the shared memory handle as the syslogger doesn't need to
+	 * attach to it.  For EXEC_BACKEND case, the shared memory handle
+	 * is inherited by all postmaster child processes irrespective of
+	 * whether they need it or not.
+	 */
+#ifdef EXEC_BACKEND
+	if (!CloseHandle(UsedShmemSegID))
+		elog(LOG, "could not close handle to shared memory: error code %lu", GetLastError());
+#endif
+
+	/*
 	 * Properly accept or ignore signals the postmaster might send us
 	 *
 	 * Note: we ignore all termination signals, and instead exit only when all
#22Michael Paquier
michael.paquier@gmail.com
In reply to: Amit Kapila (#21)
Re: Postgres service stops when I kill client backend on Windows

On Mon, Oct 12, 2015 at 2:55 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sun, Oct 11, 2015 at 9:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I could easily reproduce the issue if the logging collector is on, and even if
we try to increase the loop count or sleep time in PGSharedMemoryCreate(),
it doesn't change the situation, as the syslogger has a valid handle to
shared memory. One way to fix this is to just close the shared memory handle
in the syslogger, as we are not going to need it; the attached patch does
this and fixes the issue for me. Another, more invasive fix, in case we want
to retain the shared memory handle for some purpose (future requirement),
could be to send some signal to the syslogger in the restart path so that it
can release the shared memory handle.

+#ifdef EXEC_BACKEND
+    if (!CloseHandle(UsedShmemSegID))
+        elog(LOG, "could not close handle to shared memory: error code %lu", GetLastError());
+#endif

I am pretty sure that you would want a WIN32 block here, not
EXEC_BACKEND as the latter can be used on non-Windows platforms as
well to emulate Windows behavior.
-- 
Michael
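
The point about the guard can be illustrated with a small sketch. It assumes a hypothetical wrapper named `close_inherited_handle`, which is not part of PostgreSQL, and it tests both `WIN32` and `_WIN32` only so that it builds outside the PostgreSQL build system. `CloseHandle()` exists only in the Windows API, so the call has to sit behind a Windows test; an EXEC_BACKEND build on Unix would not have it.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(WIN32) || defined(_WIN32)
#include <windows.h>
#endif

/*
 * Close an inherited handle where handles exist at all.  On non-Windows
 * platforms there is nothing to close, even in an EXEC_BACKEND build, so
 * the function reports success without doing anything.
 */
bool
close_inherited_handle(void *handle)
{
#if defined(WIN32) || defined(_WIN32)
	assert(handle != NULL);
	return CloseHandle((HANDLE) handle) != 0;
#else
	(void) handle;				/* unused outside Windows */
	return true;
#endif
}
```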


#23Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#21)
Re: Postgres service stops when I kill client backend on Windows

On 2015-10-12 11:25:35 +0530, Amit Kapila wrote:

/*
+	 * Close the shared memory handle as the syslogger doesn't need to
+	 * attach to it.  For EXEC_BACKEND case, the shared memory handle
+	 * is inherited by all postmaster child processes irrespective of
+	 * whether they need it or not.
+	 */
+#ifdef EXEC_BACKEND
+	if (!CloseHandle(UsedShmemSegID))
+		elog(LOG, "could not close handle to shared memory: error code %lu", GetLastError());
+#endif
+

It feels wrong to do this in syslogger.c - I mean it's not the only
process that's not attached to shared memory. Sure, the others get
killed, but nonetheless...

Greetings,

Andres Freund


#24Magnus Hagander
magnus@hagander.net
In reply to: Andres Freund (#23)
Re: Postgres service stops when I kill client backend on Windows

On Mon, Oct 12, 2015 at 12:25 PM, Andres Freund <andres@anarazel.de> wrote:

On 2015-10-12 11:25:35 +0530, Amit Kapila wrote:

/*
+      * Close the shared memory handle as the syslogger doesn't need to
+      * attach to it.  For EXEC_BACKEND case, the shared memory handle
+      * is inherited by all postmaster child processes irrespective of
+      * whether they need it or not.
+      */
+#ifdef EXEC_BACKEND
+     if (!CloseHandle(UsedShmemSegID))
+             elog(LOG, "could not close handle to shared memory: error code %lu", GetLastError());
+#endif
+

It feels wrong to do this in syslogger.c - I mean it's not the only
process that's not attached to shared memory. Sure, the others get
killed, but nonetheless...

+1. It feels like we're setting ourselves up for repeating this mistake at
some later time :)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#25Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#22)
Re: Postgres service stops when I kill client backend on Windows

On Mon, Oct 12, 2015 at 3:45 PM, Michael Paquier <michael.paquier@gmail.com> wrote:

On Mon, Oct 12, 2015 at 2:55 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sun, Oct 11, 2015 at 9:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I could easily reproduce the issue if the logging collector is on, and even if
we try to increase the loop count or sleep time in PGSharedMemoryCreate(),
it doesn't change the situation, as the syslogger has a valid handle to
shared memory. One way to fix this is to just close the shared memory handle
in the syslogger, as we are not going to need it; the attached patch does
this and fixes the issue for me. Another, more invasive fix, in case we want
to retain the shared memory handle for some purpose (future requirement),
could be to send some signal to the syslogger in the restart path so that it
can release the shared memory handle.

+#ifdef EXEC_BACKEND
+    if (!CloseHandle(UsedShmemSegID))
+        elog(LOG, "could not close handle to shared memory: error code %lu", GetLastError());
+#endif

I am pretty sure that you would want a WIN32 block here, not
EXEC_BACKEND as the latter can be used on non-Windows platforms as
well to emulate Windows behavior.

Agreed, I can change the patch to use WIN32, but it seems not everyone
wants to follow this approach. So let's first see what the best way to
fix this is.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#26Michael Paquier
michael.paquier@gmail.com
In reply to: Magnus Hagander (#24)
1 attachment(s)
Re: Postgres service stops when I kill client backend on Windows

On Mon, Oct 12, 2015 at 7:26 PM, Magnus Hagander <magnus@hagander.net> wrote:

On Mon, Oct 12, 2015 at 12:25 PM, Andres Freund <andres@anarazel.de> wrote:

On 2015-10-12 11:25:35 +0530, Amit Kapila wrote:

/*
+      * Close the shared memory handle as the syslogger doesn't need to
+      * attach to it.  For EXEC_BACKEND case, the shared memory handle
+      * is inherited by all postmaster child processes irrespective of
+      * whether they need it or not.
+      */
+#ifdef EXEC_BACKEND
+     if (!CloseHandle(UsedShmemSegID))
+             elog(LOG, "could not close handle to shared memory: error code %lu", GetLastError());
+#endif
+

It feels wrong to do this in syslogger.c - I mean it's not the only
process that's not attached to shared memory. Sure, the others get
killed, but nonetheless...

+1. It feels like we're setting ourselves up for repeating this mistake at
some later time :)

Actually, doesn't this apply as well to the archiver and the pgstat
collector? So perhaps we may want to do that in SubPostmasterMain with
PGSharedMemoryDetach. See for example the attached as an idea (patch
completely untested).
--
Michael

Attachments:

20151012_detach_shmem.patch (application/x-patch)
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 24e8404..2076d96 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -4637,6 +4637,16 @@ SubPostmasterMain(int argc, char *argv[])
 		strncmp(argv[1], "--forkbgworker=", 15) == 0)
 		PGSharedMemoryReAttach();
 
+	/*
+	 * Close any existing shared memory segment as those processes do not
+	 * need to have an access to it. This state is inherited from the
+	 * postmaster whether they need it or not.
+	 */
+	if (strcmp(argv[1], "--forkarch") == 0 ||
+		strcmp(argv[1], "--forkcol") == 0 ||
+		strcmp(argv[1], "--forklog") == 0)
+		PGSharedMemoryDetach();
+
 	/* autovacuum needs this set before calling InitProcess */
 	if (strcmp(argv[1], "--forkavlauncher") == 0)
 		AutovacuumLauncherIAm();
#27Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#26)
Re: Postgres service stops when I kill client backend on Windows

On 2015-10-12 21:38:12 +0900, Michael Paquier wrote:

It feels wrong to do this in syslogger.c - I mean it's not the only
process that's not attached to shared memory. Sure, the others get
killed, but nonetheless...

+1. It feels like we're setting ourselves up for repeating this mistake at
some later time :)

Actually, doesn't this apply as well to the archiver and the pgstat
collector?

As mentioned above? The difference is that the archiver et al get killed
by postmaster during a PANIC restart thus don't present the problem
discussed here.

So perhaps we may want to do that in SubPostmasterMain with
PGSharedMemoryDetach. See for example the attached as an idea (patch
completely untested).

+	/*
+	 * Close any existing shared memory segment as those processes do not
+	 * need to have an access to it. This state is inherited from the
+	 * postmaster whether they need it or not.
+	 */
+	if (strcmp(argv[1], "--forkarch") == 0 ||
+		strcmp(argv[1], "--forkcol") == 0 ||
+		strcmp(argv[1], "--forklog") == 0)
+		PGSharedMemoryDetach();
+

Well, in those cases we won't have attached to shared memory, so I'm not
convinced that this is the right solution. In fact, won't this lead to
hitting the elog in

void
PGSharedMemoryDetach(void)
{
	if (UsedShmemSegAddr != NULL)
	{
		if (!UnmapViewOfFile(UsedShmemSegAddr))
			elog(LOG, "could not unmap view of shared memory: error code %lu", GetLastError());

		UsedShmemSegAddr = NULL;
	}
}

UsedShmemSegAddr will have been set up by read_backend_variables(), but
the process won't have anything mapped at this point?

Greetings,

Andres Freund


#28Dmitry Vasilyev
d.vasilyev@postgrespro.ru
In reply to: Amit Kapila (#21)
Re: Postgres service stops when I kill client backend on Windows

Hello, Amit!
On Mon, 2015-10-12 at 11:25 +0530, Amit Kapila wrote:

On Sun, Oct 11, 2015 at 9:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Sun, Oct 11, 2015 at 5:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm a bit suspicious that we may have leaked a handle to the shared
memory block someplace, for example. That would explain why this
symptom is visible now when it was not in 2009. Or maybe it's dependent
on some feature that we didn't test back then --- for instance, if
the logging collector is in use, could it have inherited a handle and
not closed it?

Even if we leaked it, it should go away when the other processes died.

I'm fairly certain that we do not kill/restart the logging collector
during a database restart (because it's impossible to reproduce the
original stderr destination if we do).

True, and it seems this is the reason for the issue we are discussing here.
The reason why this happens is that during creation of shared memory
(PGSharedMemoryCreate()), we duplicate the handle such that it
becomes inheritable by all child processes. Then during fork
(syslogger_forkexec()->postmaster_forkexec()->internal_forkexec) we
always inherit the handles, which causes the syslogger to get a copy of the
shared memory handle, which it neither uses nor closes.

I could easily reproduce the issue if the logging collector is on, and even if
we try to increase the loop count or sleep time in PGSharedMemoryCreate(),
it doesn't change the situation, as the syslogger has a valid handle to
shared memory. One way to fix this is to just close the shared memory handle
in the syslogger, as we are not going to need it; the attached patch does
this and fixes the issue for me. Another, more invasive fix, in case we want
to retain the shared memory handle for some purpose (future requirement),
could be to send some signal to the syslogger in the restart path so that it
can release the shared memory handle.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

The specified patch, with "ifdef WIN32", is working for me. Maybe it's also
necessary to check for open handles from replication, for example?
--------------
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#29Oleg Bartunov
obartunov@gmail.com
In reply to: Dmitry Vasilyev (#28)
Re: Postgres service stops when I kill client backend on Windows

On Mon, Oct 12, 2015 at 4:42 PM, Dmitry Vasilyev <d.vasilyev@postgrespro.ru>
wrote:

Hello, Amit!

On Mon, 2015-10-12 at 11:25 +0530, Amit Kapila wrote:

On Sun, Oct 11, 2015 at 9:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Sun, Oct 11, 2015 at 5:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm a bit suspicious that we may have leaked a handle to the shared
memory block someplace, for example. That would explain why this
symptom is visible now when it was not in 2009. Or maybe it's dependent
on some feature that we didn't test back then --- for instance, if
the logging collector is in use, could it have inherited a handle and
not closed it?

Even if we leaked it, it should go away when the other processes died.

I'm fairly certain that we do not kill/restart the logging collector
during a database restart (because it's impossible to reproduce the
original stderr destination if we do).

True, and it seems this is the reason for the issue we are discussing here.
The reason why this happens is that during creation of shared memory
(PGSharedMemoryCreate()), we duplicate the handle such that it
becomes inheritable by all child processes. Then during fork
(syslogger_forkexec()->postmaster_forkexec()->internal_forkexec) we
always inherit the handles, which causes the syslogger to get a copy of the
shared memory handle, which it neither uses nor closes.

I could easily reproduce the issue if the logging collector is on, and even if
we try to increase the loop count or sleep time in PGSharedMemoryCreate(),
it doesn't change the situation, as the syslogger has a valid handle to
shared memory. One way to fix this is to just close the shared memory handle
in the syslogger, as we are not going to need it; the attached patch does
this and fixes the issue for me. Another, more invasive fix, in case we want
to retain the shared memory handle for some purpose (future requirement),
could be to send some signal to the syslogger in the restart path so that it
can release the shared memory handle.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

The specified patch, with "ifdef WIN32", is working for me. Maybe it's also
necessary to check for open handles from replication, for example?

Assuming the problem will be fixed, should we release Beta2 soon?


--------------
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#27)
Re: Postgres service stops when I kill client backend on Windows

Andres Freund <andres@anarazel.de> writes:

On 2015-10-12 21:38:12 +0900, Michael Paquier wrote:

Actually, doesn't this apply as well to the archiver and the pgstat
collector?

As mentioned above? The difference is that the archiver et al get killed
by postmaster during a PANIC restart thus don't present the problem
discussed here.

I thought your objection to the original patch was exactly that we should
not treat syslogger as a special case for this purpose.

Well, in those cases we won't have attached to shared memory, so I'm not
convinced that this is the right solution.

No, you're missing the point. In Windows builds, child processes inherit
a "handle" reference to the shared memory mapping, whether or not they
make any use of the handle to re-attach to that shared memory. The point
here is that we need to close that handle if we're not going to use it.

I think the right thing is something close to Michael's proposed patch,
though not duplicating and reversing the previous if-test like that.
In other words, something like this in SubPostmasterMain:

	/*
	 * If appropriate, physically re-attach to shared memory segment. We want
	 * to do this before going any further to ensure that we can attach at the
	 * same address the postmaster used.
+	 * If we're not re-attaching, close the inherited handle to avoid leaks.
	 */
	if (strcmp(argv[1], "--forkbackend") == 0 ||
		strcmp(argv[1], "--forkavlauncher") == 0 ||
		strcmp(argv[1], "--forkavworker") == 0 ||
		strcmp(argv[1], "--forkboot") == 0 ||
		strncmp(argv[1], "--forkbgworker=", 15) == 0)
		PGSharedMemoryReAttach();
+#ifdef WIN32
+	else
+		close the handle;
+#endif

regards, tom lane
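
The if-test in the snippet above amounts to classifying child process types by their first argument. A minimal sketch in portable C, assuming a hypothetical helper name `child_reattaches_to_shmem` (it is not a PostgreSQL function); the argument strings are the ones quoted in the message. Any child for which it returns false is one that, under this proposal, would close the inherited handle on Windows.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if a child started with this argv[1] value re-attaches to
 * the shared memory segment.  The list mirrors the if-test quoted above;
 * the function itself is made up for this sketch.
 */
bool
child_reattaches_to_shmem(const char *forkarg)
{
	assert(forkarg != NULL);

	return strcmp(forkarg, "--forkbackend") == 0 ||
		strcmp(forkarg, "--forkavlauncher") == 0 ||
		strcmp(forkarg, "--forkavworker") == 0 ||
		strcmp(forkarg, "--forkboot") == 0 ||
		/* bgworker arguments carry a suffix, so compare the prefix only */
		strncmp(forkarg, "--forkbgworker=", 15) == 0;
}
```

With this helper, `--forklog`, `--forkarch` and `--forkcol` all fall on the "close the handle" side without being listed explicitly, which is the advantage of using an else branch over repeating and reversing the test.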


#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Oleg Bartunov (#29)
Re: Postgres service stops when I kill client backend on Windows

Oleg Bartunov <obartunov@gmail.com> writes:

Assuming the problem will be fixed, should we release Beta2 soon?

This bug has existed since we had native Windows support. It's entirely
immaterial for beta purposes, and I have a hard time thinking it's
critical enough to justify a short release cycle for the back branches
either.

regards, tom lane


#32Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#30)
Re: Postgres service stops when I kill client backend on Windows

On 2015-10-12 10:04:55 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2015-10-12 21:38:12 +0900, Michael Paquier wrote:

Actually, doesn't this apply as well to the archiver and the pgstat
collector?

As mentioned above? The difference is that the archiver et al get killed
by postmaster during a PANIC restart thus don't present the problem
discussed here.

I thought your objection to the original patch was exactly that we should
not treat syslogger as a special case for this purpose.

Yes. The above was just about this not being actively broken - I'd
mentioned the other processes before and to me it sounded like Michael
thought there might be an active problem.

Well, in those cases we won't have attached to shared memory, so I'm not
convinced that this is the right solution.

No, you're missing the point.

Don't think so.

In Windows builds, child processes inherit
a "handle" reference to the shared memory mapping, whether or not they
make any use of the handle to re-attach to that shared memory. The point
here is that we need to close that handle if we're not going to use it.

Right. But that doesn't mean it's right to call PGSharedMemoryDetach()
without other changes as done in Michael's proposed patch? That'll do an
UnmapViewOfFile() which'll fail because nothing is mapped, but still not
close UsedShmemSegID?

Greetings,

Andres Freund


#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#32)
Re: Postgres service stops when I kill client backend on Windows

Andres Freund <andres@anarazel.de> writes:

Right. But that doesn't mean it's right to call PGSharedMemoryDetach()
without other changes as done in Michael's proposed patch? That'll do an
UnmapViewOfFile() which'll fail because nothing is mapped, but still not
close UsedShmemSegID?

Ah, right, I'd not noticed that he proposed changing
CloseHandle(UsedShmemSegID) to PGSharedMemoryDetach(). The latter is
clearly the wrong thing.

I'm not sure whether we should just put the CloseHandle call in
postmaster.c, or invent a function in win32_shmem.c to provide a
layer of abstraction.

regards, tom lane


#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#33)
Re: Postgres service stops when I kill client backend on Windows

I wrote:

Andres Freund <andres@anarazel.de> writes:

Right. But that doesn't mean it's right to call PGSharedMemoryDetach()
without other changes as done in Michael's proposed patch? That'll do an
UnmapViewOfFile() which'll fail because nothing is mapped, but still not
close UsedShmemSegID?

Ah, right, I'd not noticed that he proposed changing
CloseHandle(UsedShmemSegID) to PGSharedMemoryDetach(). The latter is
clearly the wrong thing.

Actually, now that I look at it, it's even more obvious that this is the
wrong thing because *all the subprocess types in question already call
PGSharedMemoryDetach*. That's necessary on Unix, but I should think that
on Windows all it will do is provoke the log message:

elog(LOG, "could not unmap view of shared memory: error code %lu", GetLastError());

Could someone confirm whether syslogger, archiver, stats collector
processes reliably produce that log message at startup on Windows?

regards, tom lane


#35Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#34)
Re: Postgres service stops when I kill client backend on Windows

On Mon, Oct 12, 2015 at 8:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

Andres Freund <andres@anarazel.de> writes:

Right. But that doesn't mean it's right to call PGSharedMemoryDetach()
without other changes as done in Michael's proposed patch? That'll do an
UnmapViewOfFile() which'll fail because nothing is mapped, but still not
close UsedShmemSegID?

Ah, right, I'd not noticed that he proposed changing
CloseHandle(UsedShmemSegID) to PGSharedMemoryDetach(). The latter is
clearly the wrong thing.

Actually, now that I look at it, it's even more obvious that this is the
wrong thing because *all the subprocess types in question already call
PGSharedMemoryDetach*. That's necessary on Unix, but I should think that
on Windows all it will do is provoke the log message:

elog(LOG, "could not unmap view of shared memory: error code %lu", GetLastError());

Could someone confirm whether syslogger, archiver, stats collector
processes reliably produce that log message at startup on Windows?

Before writing the CloseHandle() patch, I tried this approach of calling
PGSharedMemoryDetach() in the syslogger; I saw that message and
understood that it was not going to work.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#36Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#34)
Re: Postgres service stops when I kill client backend on Windows

I wrote:

Actually, now that I look at it, it's even more obvious that this is the
wrong thing because *all the subprocess types in question already call
PGSharedMemoryDetach*.

Ah, scratch that: in most of them, the call is in #ifndef EXEC_BACKEND
stanzas. The exception is bgworker start for a non-attached-to-shmem
worker, and in that case there's no log message because in fact
SubPostmasterMain did reattach.

This is kind of a mess :-(. But it does look like what we want is
for SubPostmasterMain to do more than nothing when it chooses not to
reattach. Probably that should include resetting UsedShmemSegAddr to
NULL, as well as closing the handle.

regards, tom lane


#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#36)
1 attachment(s)
Re: Postgres service stops when I kill client backend on Windows

I wrote:

This is kind of a mess :-(. But it does look like what we want is
for SubPostmasterMain to do more than nothing when it chooses not to
reattach. Probably that should include resetting UsedShmemSegAddr to
NULL, as well as closing the handle.

After poking around a bit more, I propose the attached patch. I've
checked that this is happy with an EXEC_BACKEND Unix build, but I'm not
able to test it on Windows ... would somebody do that?

BTW, it appears from this that Cygwin builds have been broken right along
in a different way: according to the code in sysv_shmem's
PGSharedMemoryReAttach, Cygwin does cause a re-attach to occur, which we
were not undoing for putatively-not-connected-to-shmem child processes.
That's a robustness problem because it breaks the postmaster's expectation
that it's safe to not reinitialize shmem after a crash of one of those
processes. I believe this patch fixes that problem as well, though if
anyone can test it on Cygwin that wouldn't be a bad thing either.

regards, tom lane
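
The cleanup discussed in this message (reset the inherited address that was never mapped, then close the inherited handle) can be sketched as a small state model in portable C. Everything here is hypothetical and for illustration only: `ShmemState`, `NO_HANDLE`, `shmem_detach` and `shmem_no_reattach` are made-up names, and the counters stand in for the real `UnmapViewOfFile()` and `CloseHandle()` calls so the logic can be followed without Windows.

```c
#include <assert.h>
#include <stddef.h>

#define NO_HANDLE (-1)			/* stand-in for INVALID_HANDLE_VALUE */

/* Inherited shared memory state of one child process (toy model). */
typedef struct ShmemState
{
	void   *addr;				/* address the postmaster mapped at, or NULL */
	int		handle;				/* inherited handle, or NO_HANDLE */
	int		unmap_calls;		/* how often we would call UnmapViewOfFile() */
	int		close_calls;		/* how often we would call CloseHandle() */
} ShmemState;

/* Detach: unmap if mapped, close the handle if we hold one.  Idempotent. */
void
shmem_detach(ShmemState *st)
{
	if (st->addr != NULL)
	{
		st->unmap_calls++;
		st->addr = NULL;
	}

	if (st->handle != NO_HANDLE)
	{
		st->close_calls++;
		st->handle = NO_HANDLE;
	}
}

/*
 * Chosen instead of re-attaching: the address was inherited as a plain
 * value but nothing is mapped in this process, so forget it first and
 * then let shmem_detach() close the handle.
 */
void
shmem_no_reattach(ShmemState *st)
{
	assert(st->addr != NULL);

	st->addr = NULL;
	shmem_detach(st);
}
```

Resetting the address before detaching is what avoids the failed unmap and its log message, and resetting the handle to the sentinel makes a later second detach a no-op.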

Attachments:

windows-shmem-detach-fix.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 8be5bbe..c7a3a91 100644
*** a/src/backend/port/sysv_shmem.c
--- b/src/backend/port/sysv_shmem.c
*************** PGSharedMemoryReAttach(void)
*** 619,624 ****
--- 619,652 ----
  
  	UsedShmemSegAddr = hdr;		/* probably redundant */
  }
+ 
+ /*
+  * PGSharedMemoryNoReAttach
+  *
+  * Clean up if we choose *not* to re-attach to an already existing shared
+  * memory segment.  This is not used in the non EXEC_BACKEND case, either.
+  *
+  * UsedShmemSegID and UsedShmemSegAddr are implicit parameters to this
+  * routine.  The caller must have already restored them to the postmaster's
+  * values.
+  */
+ void
+ PGSharedMemoryNoReAttach(void)
+ {
+ 	Assert(UsedShmemSegAddr != NULL);
+ 	Assert(IsUnderPostmaster);
+ 
+ #ifdef __CYGWIN__
+ 	/* cygipc (currently) appears to not detach on exec. */
+ 	PGSharedMemoryDetach();
+ #endif
+ 
+ 	/* For cleanliness, reset UsedShmemSegAddr to show we're not attached. */
+ 	UsedShmemSegAddr = NULL;
+ 	/* And the same for UsedShmemSegID. */
+ 	UsedShmemSegID = 0;
+ }
+ 
  #endif   /* EXEC_BACKEND */
  
  /*
*************** PGSharedMemoryReAttach(void)
*** 629,634 ****
--- 657,665 ----
   * (it will have an on_shmem_exit callback registered to do that).  Rather,
   * this is for subprocesses that have inherited an attachment and want to
   * get rid of it.
+  *
+  * UsedShmemSegID and UsedShmemSegAddr are implicit parameters to this
+  * routine.
   */
  void
  PGSharedMemoryDetach(void)
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index db67627..8152522 100644
*** a/src/backend/port/win32_shmem.c
--- b/src/backend/port/win32_shmem.c
***************
*** 17,23 ****
  #include "storage/ipc.h"
  #include "storage/pg_shmem.h"
  
! HANDLE		UsedShmemSegID = 0;
  void	   *UsedShmemSegAddr = NULL;
  static Size UsedShmemSegSize = 0;
  
--- 17,23 ----
  #include "storage/ipc.h"
  #include "storage/pg_shmem.h"
  
! HANDLE		UsedShmemSegID = INVALID_HANDLE_VALUE;
  void	   *UsedShmemSegAddr = NULL;
  static Size UsedShmemSegSize = 0;
  
*************** PGSharedMemoryCreate(Size size, bool mak
*** 218,226 ****
  		elog(LOG, "could not close handle to shared memory: error code %lu", GetLastError());
  
  
- 	/* Register on-exit routine to delete the new segment */
- 	on_shmem_exit(pgwin32_SharedMemoryDelete, PointerGetDatum(hmap2));
- 
  	/*
  	 * Get a pointer to the new shared memory segment. Map the whole segment
  	 * at once, and let the system decide on the initial address.
--- 218,223 ----
*************** PGSharedMemoryCreate(Size size, bool mak
*** 254,259 ****
--- 251,259 ----
  	UsedShmemSegSize = size;
  	UsedShmemSegID = hmap2;
  
+ 	/* Register on-exit routine to delete the new segment */
+ 	on_shmem_exit(pgwin32_SharedMemoryDelete, PointerGetDatum(hmap2));
+ 
  	*shim = hdr;
  	return hdr;
  }
*************** PGSharedMemoryReAttach(void)
*** 299,321 ****
  }
  
  /*
   * PGSharedMemoryDetach
   *
   * Detach from the shared memory segment, if still attached.  This is not
!  * intended for use by the process that originally created the segment. Rather,
   * this is for subprocesses that have inherited an attachment and want to
   * get rid of it.
   */
  void
  PGSharedMemoryDetach(void)
  {
  	if (UsedShmemSegAddr != NULL)
  	{
  		if (!UnmapViewOfFile(UsedShmemSegAddr))
! 			elog(LOG, "could not unmap view of shared memory: error code %lu", GetLastError());
  
  		UsedShmemSegAddr = NULL;
  	}
  }
  
  
--- 299,368 ----
  }
  
  /*
+  * PGSharedMemoryNoReAttach
+  *
+  * Clean up if we choose *not* to re-attach to an already existing shared
+  * memory segment.
+  *
+  * UsedShmemSegID and UsedShmemSegAddr are implicit parameters to this
+  * routine.  The caller must have already restored them to the postmaster's
+  * values.
+  */
+ void
+ PGSharedMemoryNoReAttach(void)
+ {
+ 	Assert(UsedShmemSegAddr != NULL);
+ 	Assert(IsUnderPostmaster);
+ 
+ 	/*
+ 	 * Under Windows we will not have mapped the segment, so we don't need to
+ 	 * un-map it.  Just reset UsedShmemSegAddr to show we're not attached
+ 	 * (this is important in case somebody calls PGSharedMemoryDetach later).
+ 	 */
+ 	UsedShmemSegAddr = NULL;
+ 
+ 	/*
+ 	 * We *must* close the inherited shmem segment handle, else Windows will
+ 	 * consider the existence of this process to mean it can't release the
+ 	 * shmem segment yet.  We can now use PGSharedMemoryDetach to do that.
+ 	 */
+ 	PGSharedMemoryDetach();
+ }
+ 
+ /*
   * PGSharedMemoryDetach
   *
   * Detach from the shared memory segment, if still attached.  This is not
!  * intended for use by the process that originally created the segment
!  * (it will have an on_shmem_exit callback registered to do that).  Rather,
   * this is for subprocesses that have inherited an attachment and want to
   * get rid of it.
+  *
+  * UsedShmemSegID and UsedShmemSegAddr are implicit parameters to this
+  * routine.
   */
  void
  PGSharedMemoryDetach(void)
  {
+ 	/* Unmap the view, if it's mapped */
  	if (UsedShmemSegAddr != NULL)
  	{
  		if (!UnmapViewOfFile(UsedShmemSegAddr))
! 			elog(LOG, "could not unmap view of shared memory: error code %lu",
! 				 GetLastError());
  
  		UsedShmemSegAddr = NULL;
  	}
+ 
+ 	/* And close the shmem handle, if we have one */
+ 	if (UsedShmemSegID != INVALID_HANDLE_VALUE)
+ 	{
+ 		if (!CloseHandle(UsedShmemSegID))
+ 			elog(LOG, "could not close handle to shared memory: error code %lu",
+ 				 GetLastError());
+ 
+ 		UsedShmemSegID = INVALID_HANDLE_VALUE;
+ 	}
  }
  
  
*************** PGSharedMemoryDetach(void)
*** 326,334 ****
  static void
  pgwin32_SharedMemoryDelete(int status, Datum shmId)
  {
  	PGSharedMemoryDetach();
- 	if (!CloseHandle(DatumGetPointer(shmId)))
- 		elog(LOG, "could not close handle to shared memory: error code %lu", GetLastError());
  }
  
  /*
--- 373,380 ----
  static void
  pgwin32_SharedMemoryDelete(int status, Datum shmId)
  {
+ 	Assert(DatumGetPointer(shmId) == UsedShmemSegID);
  	PGSharedMemoryDetach();
  }
  
  /*
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 24e8404..90c2f4a 100644
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
*************** SubPostmasterMain(int argc, char *argv[]
*** 4628,4634 ****
  	/*
  	 * If appropriate, physically re-attach to shared memory segment. We want
  	 * to do this before going any further to ensure that we can attach at the
! 	 * same address the postmaster used.
  	 */
  	if (strcmp(argv[1], "--forkbackend") == 0 ||
  		strcmp(argv[1], "--forkavlauncher") == 0 ||
--- 4628,4635 ----
  	/*
  	 * If appropriate, physically re-attach to shared memory segment. We want
  	 * to do this before going any further to ensure that we can attach at the
! 	 * same address the postmaster used.  On the other hand, if we choose not
! 	 * to re-attach, we may have other cleanup to do.
  	 */
  	if (strcmp(argv[1], "--forkbackend") == 0 ||
  		strcmp(argv[1], "--forkavlauncher") == 0 ||
*************** SubPostmasterMain(int argc, char *argv[]
*** 4636,4641 ****
--- 4637,4644 ----
  		strcmp(argv[1], "--forkboot") == 0 ||
  		strncmp(argv[1], "--forkbgworker=", 15) == 0)
  		PGSharedMemoryReAttach();
+ 	else
+ 		PGSharedMemoryNoReAttach();
  
  	/* autovacuum needs this set before calling InitProcess */
  	if (strcmp(argv[1], "--forkavlauncher") == 0)
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index 0b169af..9dbcbce 100644
*** a/src/include/storage/pg_shmem.h
--- b/src/include/storage/pg_shmem.h
*************** extern void *UsedShmemSegAddr;
*** 61,66 ****
--- 61,67 ----
  
  #ifdef EXEC_BACKEND
  extern void PGSharedMemoryReAttach(void);
+ extern void PGSharedMemoryNoReAttach(void);
  #endif
  
  extern PGShmemHeader *PGSharedMemoryCreate(Size size, bool makePrivate,
#38Dmitry Vasilyev
d.vasilyev@postgrespro.ru
In reply to: Tom Lane (#37)
Re: Postgres service stops when I kill client backend on Windows

Hello Tom!
On Mon, 2015-10-12 at 16:35 -0400, Tom Lane wrote:

I wrote:

This is kind of a mess :-(.  But it does look like what we want is
for SubPostmasterMain to do more than nothing when it chooses not to
reattach.  Probably that should include resetting UsedShmemSegAddr to
NULL, as well as closing the handle.

After poking around a bit more, I propose the attached patch.  I've
checked that this is happy with an EXEC_BACKEND Unix build, but I'm not
able to test it on Windows ... would somebody do that?

BTW, it appears from this that Cygwin builds have been broken right along
in a different way: according to the code in sysv_shmem's
PGSharedMemoryReAttach, Cygwin does cause a re-attach to occur, which we
were not undoing for putatively-not-connected-to-shmem child processes.
That's a robustness problem because it breaks the postmaster's expectation
that it's safe to not reinitialize shmem after a crash of one of those
processes.  I believe this patch fixes that problem as well, though if
anyone can test it on Cygwin that wouldn't be a bad thing either.

regards, tom lane

This patch works for me.
Binaries: https://goo.gl/32j7QE (MSVC 2010, build script here: https://github.com/postgrespro/pgwininstall).

------
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#39Michael Paquier
michael.paquier@gmail.com
In reply to: Tom Lane (#37)
Re: Postgres service stops when I kill client backend on Windows

On Tue, Oct 13, 2015 at 5:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

This is kind of a mess :-(. But it does look like what we want is
for SubPostmasterMain to do more than nothing when it chooses not to
reattach. Probably that should include resetting UsedShmemSegAddr to
NULL, as well as closing the handle.

After poking around a bit more, I propose the attached patch. I've
checked that this is happy with an EXEC_BACKEND Unix build, but I'm not
able to test it on Windows ... would somebody do that?

BTW, it appears from this that Cygwin builds have been broken right along
in a different way: according to the code in sysv_shmem's
PGSharedMemoryReAttach, Cygwin does cause a re-attach to occur, which we
were not undoing for putatively-not-connected-to-shmem child processes.
That's a robustness problem because it breaks the postmaster's expectation
that it's safe to not reinitialize shmem after a crash of one of those
processes. I believe this patch fixes that problem as well, though if
anyone can test it on Cygwin that wouldn't be a bad thing either.

I don't have a Cygwin environment at hand. That's unfortunate.

Looking at the patch, clearly +1 for the additional routine in both
win32_shmem.c and sysv_shmem.c to clean up the shmem state at backend
level. I have also tested the patch on Windows and it behaves as
expected: without the patch, killing a process with taskkill /f stops
the server outright even if restart_after_crash is on. With the patch,
the server restarts correctly.
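
For anyone following along, here is a toy model of why the unpatched
server cannot come back up, as far as I understand the failure. It relies
only on the patch's own comment that Windows will not release the segment
while some process still holds a handle to it. The ToySegment type and the
toy_* functions are made up for illustration; they are not PostgreSQL or
Windows APIs, and the real bookkeeping is done by the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a kernel object such as a file mapping. */
typedef struct
{
    int         open_handles;   /* handles held across all processes */
} ToySegment;

/* A process creates the object or inherits a handle to it. */
void
toy_open_handle(ToySegment *seg)
{
    seg->open_handles++;
}

/* A process closes its handle, or dies and the OS closes it. */
void
toy_close_handle(ToySegment *seg)
{
    assert(seg->open_handles > 0);
    seg->open_handles--;
}

/* The object persists until the last handle is closed. */
bool
toy_segment_exists(const ToySegment *seg)
{
    return seg->open_handles > 0;
}

/*
 * Simulate a backend being killed, followed by the postmaster dropping
 * its own handle in order to reinitialize.  Returns true if the old
 * segment is gone, so that a fresh one could be created.
 */
bool
toy_can_recreate_after_crash(bool syslogger_closes_handle)
{
    ToySegment  seg = {0};

    toy_open_handle(&seg);      /* postmaster creates the segment */
    toy_open_handle(&seg);      /* logging collector inherits a handle */
    toy_open_handle(&seg);      /* backend inherits a handle */

    if (syslogger_closes_handle)
        toy_close_handle(&seg); /* patched: closed at child startup */

    toy_close_handle(&seg);     /* backend is killed; OS closes its handle */
    toy_close_handle(&seg);     /* postmaster releases its own handle */

    return !toy_segment_exists(&seg);
}
```

With the flag false the count never reaches zero, which matches the
unpatched behavior described above; with it true, the last close removes
the segment.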

(Sorry, I should have mentioned that my last patch was untested and
*surely broken*, that was the result of a 3-min guess to make the
cleanup more generic for child processes that do not need to be
attached to shmem).
Regards,
--
Michael

#40Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#37)
Re: Postgres service stops when I kill client backend on Windows

On 10/12/2015 04:35 PM, Tom Lane wrote:

I wrote:

This is kind of a mess :-(. But it does look like what we want is
for SubPostmasterMain to do more than nothing when it chooses not to
reattach. Probably that should include resetting UsedShmemSegAddr to
NULL, as well as closing the handle.

After poking around a bit more, I propose the attached patch. I've
checked that this is happy with an EXEC_BACKEND Unix build, but I'm not
able to test it on Windows ... would somebody do that?

BTW, it appears from this that Cygwin builds have been broken right along
in a different way: according to the code in sysv_shmem's
PGSharedMemoryReAttach, Cygwin does cause a re-attach to occur, which we
were not undoing for putatively-not-connected-to-shmem child processes.
That's a robustness problem because it breaks the postmaster's expectation
that it's safe to not reinitialize shmem after a crash of one of those
processes. I believe this patch fixes that problem as well, though if
anyone can test it on Cygwin that wouldn't be a bad thing either.

OK, I can test it. But it's not quite clear to me from your description
how I should test Cygwin.

cheers

andrew

#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#40)
Re: Postgres service stops when I kill client backend on Windows

Andrew Dunstan <andrew@dunslane.net> writes:

On 10/12/2015 04:35 PM, Tom Lane wrote:

BTW, it appears from this that Cygwin builds have been broken right along
in a different way: according to the code in sysv_shmem's
PGSharedMemoryReAttach, Cygwin does cause a re-attach to occur, which we
were not undoing for putatively-not-connected-to-shmem child processes.
That's a robustness problem because it breaks the postmaster's expectation
that it's safe to not reinitialize shmem after a crash of one of those
processes. I believe this patch fixes that problem as well, though if
anyone can test it on Cygwin that wouldn't be a bad thing either.

OK, I can test it. But it's not quite clear to me from your description
how I should test Cygwin.

The point is that I think that right now, the logging collector subprocess
remains connected to shared memory, which it should not (and won't, if my
patch is doing the right thing). I do not know if there's an easy way to
inspect the process state to verify that on Windows.

If nothing else, you could put a bogus access to some shared-memory data
structure into the syslogger loop, and check that it succeeds now and
crashes after applying the patch ...
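
A rough, portable sketch of that kind of check, using plain POSIX
mmap()/fork() rather than PostgreSQL's shmem code: a child either keeps or
drops an inherited shared mapping and then performs the bogus access.
probe_child_access() is a hypothetical name, and the anonymous mapping
merely stands in for the real segment. This is meant for a POSIX system
such as Linux, not for native Windows.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not PostgreSQL code.  Forks a child that optionally
 * detaches from a shared mapping and then reads from it.  Returns 1 if the
 * child read the expected value and exited cleanly, 0 if it was killed by
 * a signal or exited with an error, and -1 if setup failed.
 */
int
probe_child_access(int detach_first)
{
    size_t      len = 4096;
    volatile int *shared;
    pid_t       pid;
    int         status;

    shared = mmap(NULL, len, PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    shared[0] = 42;

    pid = fork();
    if (pid < 0)
    {
        munmap((void *) shared, len);
        return -1;
    }
    if (pid == 0)
    {
        /* Child: stands in for a subprocess such as the syslogger */
        if (detach_first)
            munmap((void *) shared, len);
        /* The bogus access: faults if the mapping is gone */
        _exit(shared[0] == 42 ? 0 : 1);
    }

    /* Parent: stands in for the postmaster watching its child */
    if (waitpid(pid, &status, 0) < 0)
    {
        munmap((void *) shared, len);
        return -1;
    }

    /* The child's munmap() must not affect the parent's own mapping */
    assert(shared[0] == 42);
    munmap((void *) shared, len);

    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : 0;
}
```

probe_child_access(0) corresponds to the still-attached case, where the
access succeeds; probe_child_access(1) corresponds to the detached case,
where the child is expected to die on the access.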

regards, tom lane

#42Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#39)
Re: Postgres service stops when I kill client backend on Windows

Michael Paquier <michael.paquier@gmail.com> writes:

On Tue, Oct 13, 2015 at 5:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

After poking around a bit more, I propose the attached patch. I've
checked that this is happy with an EXEC_BACKEND Unix build, but I'm not
able to test it on Windows ... would somebody do that?

Looking at the patch, clearly +1 for the additional routine in both
win32_shmem.c and sysv_shmem.c to clean up the shmem state at backend
level. I have also tested the patch on Windows and it behaves as
expected: without the patch, killing a process with taskkill /f stops
the server outright even if restart_after_crash is on. With the patch,
the server restarts correctly.

OK, pushed with some additional comment-smithing.

I noticed while looking at this that for subprocesses that aren't supposed
to be attached to shared memory, we do pgwin32_ReserveSharedMemoryRegion()
anyway in internal_forkexec(), and then that's never undone anywhere,
so that that segment of the subprocess's memory space remains reserved.
I'm not sure if this is worth changing, but if it is, we could do so now
by calling VirtualFree() in PGSharedMemoryNoReAttach().
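
A minimal sketch of the reserve/release pairing involved, assuming the
reservation was made with MEM_RESERVE. reserve_region() and
release_region() are hypothetical helpers, not existing PostgreSQL
functions; the non-Windows branch uses mmap()/munmap() with PROT_NONE only
so that the sketch can be compiled and tried elsewhere.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#endif

/*
 * Reserve a range of address space without committing any memory to it.
 * Returns the base address, or NULL on failure.
 */
void *
reserve_region(size_t size)
{
#ifdef _WIN32
    return VirtualAlloc(NULL, size, MEM_RESERVE, PAGE_NOACCESS);
#else
    void       *p = mmap(NULL, size, PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return (p == MAP_FAILED) ? NULL : p;
#endif
}

/*
 * Give the reserved range back.  Returns 1 on success, 0 on failure.
 */
int
release_region(void *addr, size_t size)
{
    assert(addr != NULL);
#ifdef _WIN32
    (void) size;                /* MEM_RELEASE requires a size of 0 */
    return VirtualFree(addr, 0, MEM_RELEASE) ? 1 : 0;
#else
    return (munmap(addr, size) == 0) ? 1 : 0;
#endif
}
```

In PGSharedMemoryNoReAttach() the address to release would presumably be
UsedShmemSegAddr, and the release would have to happen before that
variable is reset to NULL.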

BTW, I am suspicious that the DSM stuff may have related issues --- do
we use inheritable mapping handles for DSM segments on Windows?

regards, tom lane

#43Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#42)
Re: Postgres service stops when I kill client backend on Windows

On Tue, Oct 13, 2015 at 8:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Michael Paquier <michael.paquier@gmail.com> writes:

On Tue, Oct 13, 2015 at 5:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

After poking around a bit more, I propose the attached patch. I've
checked that this is happy with an EXEC_BACKEND Unix build, but I'm not
able to test it on Windows ... would somebody do that?

Looking at the patch, clearly +1 for the additional routine in both
win32_shmem.c and sysv_shmem.c to clean up the shmem state at backend
level. I have also tested the patch on Windows and it behaves as
expected: without the patch, killing a process with taskkill /f stops
the server outright even if restart_after_crash is on. With the patch,
the server restarts correctly.

OK, pushed with some additional comment-smithing.

I noticed while looking at this that for subprocesses that aren't supposed
to be attached to shared memory, we do pgwin32_ReserveSharedMemoryRegion()
anyway in internal_forkexec(), and then that's never undone anywhere,
so that that segment of the subprocess's memory space remains reserved.
I'm not sure if this is worth changing, but if it is, we could do so now
by calling VirtualFree() in PGSharedMemoryNoReAttach().

I think it is worth doing, as it can save memory for processes that
don't attach to shared memory. Another thing is that we allocate
handles (by duplicating them) in save_backend_variables(), and I am not
sure they are required for all processes. Anyway, that doesn't seem
worth the trouble.

BTW, I am suspicious that the DSM stuff may have related issues --- do
we use inheritable mapping handles for DSM segments on Windows?

Not by default. There is an API, dsm_pin_segment(), which duplicates the
handle for the postmaster process so that the shared memory segment is
retained until postmaster shutdown. In general, I don't see such issues
for DSM, but please point them out if you see anything problematic.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com