ERROR: too many dynamic shared memory segments

Started by Jakub Glapa · about 8 years ago · 17 messages
#1 Jakub Glapa
jakub.glapa@gmail.com

Hi,
I started seeing these errors on PostgreSQL 10.1 running on an Ubuntu 16.04.3
server (64GB RAM / 8 CPUs).
The DB enters recovery mode after that.

2017-11-23 07:20:39 CET::@:[24823]: ERROR: could not attach to dynamic shared area
2017-11-23 07:20:39 CET::@:[24822]: ERROR: could not map dynamic shared memory segment
2017-11-23 07:20:40 CET:192.168.xx,xx(33974):u(at)db:[24209]: ERROR: too many dynamic shared memory segments

The errors happen when parallel execution is enabled and multiple
queries are executed simultaneously.
If I set max_parallel_workers_per_gather = 0 the error doesn't occur.
The queries are rather big; each runs for over 10 seconds.
I see the error when 4 or more queries are started at the same time.
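
For completeness, the same switch can also be toggled per session (a sketch):

SET max_parallel_workers_per_gather = 0;  -- plans lose their Gather nodes and the errors stop
SET max_parallel_workers_per_gather = 2;  -- back to the default (2) to reproduce the problem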

my postgresql.conf:

max_connections = 100
shared_buffers = 12GB
effective_cache_size = 51GB
work_mem = 83886kB
maintenance_work_mem = 2GB
max_wal_size = 8GB
min_wal_size = 4GB
checkpoint_completion_target = 0.9
wal_buffers = 32MB
default_statistics_target = 1000
max_locks_per_transaction = 128
#max_parallel_workers_per_gather = 0

PS. I've posted this question on the pgsql-admin list but didn't get any
feedback.

--
regards,
Jakub Glapa

#2 Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Jakub Glapa (#1)
Re: ERROR: too many dynamic shared memory segments

On Mon, Nov 27, 2017 at 10:54 PM, Jakub Glapa <jakub.glapa@gmail.com> wrote:

The DB enters recovery mode after that.

That's not good. So it actually crashes? Can you please show the
full error messages?

2017-11-23 07:20:39 CET::@:[24823]: ERROR: could not attach to dynamic shared area

From src/backend/utils/mmgr/dsa.c. The same message is used for
dsa_attach() and for attach_internal(), but in this case it must be
the latter because we use in-place DSA areas for parallel query. This
means that when the worker tried to attach it found that
control->refcnt == 0, meaning 'too late, every other backend including
the leader has already detached'.

2017-11-23 07:20:39 CET::@:[24822]: ERROR: could not map dynamic shared memory segment

From src/backend/access/transam/parallel.c when dsm_attach returns
null. I think this also means 'too late'.

So those are both workers that have started up and found that the
leader has abandoned the parallel query already, but they discovered
it at different stages. PID 24823 didn't manage to attach to the DSM
segment, while PID 24822 did but found that the other(s) had already
detached from the per-query DSA area inside it.

2017-11-23 07:20:40 CET:192.168.xx,xx(33974):u(at)db:[24209]: ERROR: too many dynamic shared memory segments

The errors happen when parallel execution is enabled and multiple
queries are executed simultaneously.
If I set max_parallel_workers_per_gather = 0 the error doesn't occur.
The queries are rather big; each runs for over 10 seconds.
I see the error when 4 or more queries are started at the same time.

my postgresql.conf:

max_connections = 100

So the two questions we need to figure out are: (1) how do we manage
to use up all 64 + 2 * 100 DSM slots (or somehow corrupt things so it
looks that way) by running only 4 queries, and (2) why would we be
crashing rather than failing and reporting an error? I'm not able to
reproduce the problem from your description by running lots of parallel
queries at the same time. Can you provide reproduction steps?
Does your query plan happen to include a Parallel Bitmap Heap Scan?
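
For a rough sense of the budget involved, something like this estimates the
limit from the current settings (just a sketch: the 64 and 2 are compile-time
constants, and the real formula uses max_backends, which also counts autovacuum
and background workers, so the true number is a little higher):

SELECT 64 + 2 * current_setting('max_connections')::int AS approx_dsm_slots;
-- with max_connections = 100 that is roughly 264 slots shared by every
-- parallel query running at the same time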

--
Thomas Munro
http://www.enterprisedb.com

#3 Jakub Glapa
jakub.glapa@gmail.com
In reply to: Thomas Munro (#2)
Re: ERROR: too many dynamic shared memory segments

Hi Thomas,
log excerpt:

...
2017-11-27 12:21:14 CET:192.168.10.83(33424):user@db:[27291]: ERROR: too many dynamic shared memory segments
2017-11-27 12:21:14 CET:192.168.10.83(33424):user@db:[27291]: STATEMENT: SELECT <<REMOVED>>
2017-11-27 12:21:14 CET:192.168.10.83(35182):user@db:[28281]: ERROR: too many dynamic shared memory segments
2017-11-27 12:21:14 CET:192.168.10.83(35182):user@db:[28281]: STATEMENT: SELECT <<REMOVED>>
2017-11-27 12:21:14 CET::@:[28618]: ERROR: could not map dynamic shared memory segment
2017-11-27 12:21:14 CET::@:[28619]: ERROR: could not map dynamic shared memory segment
2017-11-27 12:21:14 CET::@:[25645]: LOG: worker process: parallel worker for PID 27291 (PID 28618) exited with exit code 1
2017-11-27 12:21:14 CET::@:[25645]: LOG: worker process: parallel worker for PID 27291 (PID 28619) exited with exit code 1
...

This time the db didn't crash but the queries failed to execute.

The queries are somewhat special.
We are still using the old-style partitioning (list type) but we abuse it a
bit when querying.
When querying a set of partitions, instead of going through the parent table we
stitch the required tables together with UNION ALL; this requires less
locking and was more performant in our benchmarks (the new native
partitioning might improve this, but we haven't researched that yet).

The queries are of the form:

SELECT col1, col2, col3 FROM
  (SELECT *
     FROM par1
    WHERE rangestart >= '2017-01-01 00:00:00.0' AND rangeend <= '2017-11-26 23:59:59.999'
   UNION ALL
   SELECT *
     FROM par2
    WHERE rangestart >= '2017-01-01 00:00:00.0' AND rangeend <= '2017-11-26 23:59:59.999'
   UNION ALL
   ...
  ) unionalias
WHERE ....

and indeed the query plan shows the use of Bitmap Heap Scans:
....
->  Subquery Scan on "*SELECT* 2"  (cost=3068.58..19793.94 rows=1 width=1284)
      ->  Gather  (cost=3068.58..19793.93 rows=1 width=5014)
            Workers Planned: 2
            ->  Parallel Bitmap Heap Scan on par_6  (cost=2068.58..18793.83 rows=1 width=5014)
                  Recheck Cond: <<CONDITION>>
                  Filter: <<CONDITION>>
                  ->  BitmapAnd  (cost=2068.58..2068.58 rows=30955 width=0)
                        ->  BitmapOr  (cost=999.30..999.30 rows=42989 width=0)
                              ->  Bitmap Index Scan on par_6_datasource  (cost=0.00..990.21 rows=42922 width=0)
                                    Index Cond: ((datasource)::text = 'one'::text)
                              ->  Bitmap Index Scan on par_6_datasource  (cost=0.00..4.30 rows=1 width=0)
                                    Index Cond: ((datasource)::text = 'two'::text)
                              ->  Bitmap Index Scan on par_6_datasource  (cost=0.00..4.79 rows=67 width=0)
                                    Index Cond: ((datasource)::text = 'three'::text)
                        ->  Bitmap Index Scan on par_6_rangestart  (cost=0.00..1069.02 rows=47564 width=0)
                              Index Cond: (rangestart >= '2017-01-01 00:00:00+01'::timestamp with time zone)
->  Subquery Scan on "*SELECT* 3"  (cost=761.33..7944.99 rows=1 width=1086)
      ->  Bitmap Heap Scan on par_7  (cost=761.33..7944.98 rows=1 width=4816)
            Recheck Cond: <<CONDITION>>
            Filter: <<CONDITION>>
            ->  BitmapAnd  (cost=761.33..761.33 rows=7045 width=0)
                  ->  Bitmap Index Scan on par_7_rangestart  (cost=0.00..380.35 rows=14942 width=0)
                        Index Cond: (rangestart >= '2017-01-01 00:00:00+01'::timestamp with time zone)
                  ->  BitmapOr  (cost=380.72..380.72 rows=12248 width=0)
                        ->  Bitmap Index Scan on par_7_datasource  (cost=0.00..372.00 rows=12228 width=0)
                              Index Cond: ((datasource)::text = 'one'::text)
                        ->  Bitmap Index Scan on par_7_datasource  (cost=0.00..4.36 rows=10 width=0)
                              Index Cond: ((datasource)::text = 'two'::text)
                        ->  Bitmap Index Scan on par_7_datasource  (cost=0.00..4.36 rows=10 width=0)
                              Index Cond: ((datasource)::text = 'three'::text)
....

In this particular query there were over _100_ partitions connected with
the UNION ALL operator.

--
regards,
pozdrawiam,
Jakub Glapa


#4 Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Jakub Glapa (#3)
Re: ERROR: too many dynamic shared memory segments

On Tue, Nov 28, 2017 at 1:13 AM, Jakub Glapa <jakub.glapa@gmail.com> wrote:

The queries are somewhat special.
We are still using the old-style partitioning (list type) but we abuse it a
bit when querying.
When querying a set of partitions, instead of going through the parent table we
stitch the required tables together with UNION ALL; this requires less
locking and was more performant in our benchmarks (the new native
partitioning might improve this, but we haven't researched that yet).

In this particular query there were over _100_ partitions connected with the
UNION ALL operator.

Ah, so you have many Gather nodes under Append? That's one way to eat
arbitrarily many DSM slots. We allow for 64 + 2 * max_backends. Does
it help if you increase max_connections? I am concerned about the
crash failure mode you mentioned in the first email though: we should
always be able to handle that condition gracefully.
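
To see where a given server stands, the relevant knobs can be checked in one
query (a sketch):

SELECT name, setting
FROM pg_settings
WHERE name IN ('max_connections',
               'max_worker_processes',
               'max_parallel_workers_per_gather');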

--
Thomas Munro
http://www.enterprisedb.com

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#4)
Re: ERROR: too many dynamic shared memory segments

Thomas Munro <thomas.munro@enterprisedb.com> writes:

Ah, so you have many Gather nodes under Append? That's one way to eat
arbitrarily many DSM slots. We allow for 64 + 2 * max_backends. Does
it help if you increase max_connections? I am concerned about the
crash failure mode you mentioned in the first email though: we should
always be able to handle that condition gracefully.

I suspect this is an instance of the issue I complained about before [1],
that parallel query is unable to cope with worker start failure.

regards, tom lane

[1]: /messages/by-id/4905.1492813727@sss.pgh.pa.us

#6 Jakub Glapa
jakub.glapa@gmail.com
In reply to: Thomas Munro (#4)
Re: ERROR: too many dynamic shared memory segments

Hi Thomas,
doubling max_connections has made the problem go away for now! Yay!

As for the crash. I dug up the initial log and it looks like a segmentation
fault...

2017-11-23 07:26:53 CET:192.168.10.83(35238):user@db:[30003]: ERROR: too many dynamic shared memory segments
2017-11-23 07:26:53 CET:192.168.10.83(35238):user@db:[30003]: STATEMENT: SELECT ....
2017-11-23 07:26:55 CET::@:[29398]: LOG: server process (PID 26992) was terminated by signal 11: Segmentation fault
2017-11-23 07:26:55 CET::@:[29398]: DETAIL: Failed process was running: SELECT .....
2017-11-23 07:26:55 CET::@:[29398]: LOG: terminating any other active server processes
2017-11-23 07:26:55 CET:192.168.10.83(35242):user@db:[30009]: WARNING: terminating connection because of crash of another server process
2017-11-23 07:26:55 CET:192.168.10.83(35242):user@db:[30009]: DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-11-23 07:26:55 CET:192.168.10.83(35242):user@db:[30009]: HINT: In a moment you should be able to reconnect to the database and repeat your command.
2017-11-23 07:26:55 CET:192.168.10.83(35268):user@db:[30097]: WARNING: terminating connection because of crash of another server process
2017-11-23 07:26:55 CET:192.168.10.83(35268):user@db:[30097]: DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-11-23 07:26:55 CET:192.168.10.83(35268):user@db:[30097]: HINT: In a moment you should be able to reconnect to the database and repeat your command.
2017-11-23 07:26:55 CET:192.168.10.83(46614):user@db:[31080]: WARNING: terminating connection because of crash of another server process
2017-11-23 07:26:55 CET:192.168.10.83(46614):user@db:[31080]: DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-11-23 07:26:55 CET:192.168.10.83(46614):user@db: HINT: In a moment you should be able to reconnect to the database and repeat your command.
2017-11-23 07:26:55 CET:192.168.10.83(35238):user@db:[30003]: WARNING: terminating connection because of crash of another server process
2017-11-23 07:26:55 CET:192.168.10.83(35238):user@db:[30003]: DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory
2017-11-23 07:26:55 CET:192.168.10.83(35300):user@db:[30152]: FATAL: the database system is in recovery mode

--
regards,
pozdrawiam,
Jakub Glapa


#7 Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Jakub Glapa (#6)
Re: ERROR: too many dynamic shared memory segments

On Tue, Nov 28, 2017 at 10:05 AM, Jakub Glapa <jakub.glapa@gmail.com> wrote:

As for the crash. I dug up the initial log and it looks like a segmentation
fault...

2017-11-23 07:26:53 CET:192.168.10.83(35238):user@db:[30003]: ERROR: too many dynamic shared memory segments

Hmm. Well this error can only occur in dsm_create() called without
DSM_CREATE_NULL_IF_MAXSEGMENTS. parallel.c calls it with that flag
and dsa.c doesn't (perhaps it should, not sure, but that'd just change
the error message), so that means the error arose from dsa.c
trying to get more segments. That would be when Parallel Bitmap Heap
Scan tried to allocate memory.
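
The plan shape in question can be coaxed out of a toy table like this (a
sketch with invented names; whether the planner actually goes parallel
depends on costs and settings):

CREATE TABLE toy AS SELECT g AS id, g % 100 AS k FROM generate_series(1, 1000000) g;
CREATE INDEX ON toy (k);
ANALYZE toy;
SET enable_seqscan = off;    -- nudge the planner towards the bitmap path
SET enable_indexscan = off;  -- leaves the bitmap path as the cheapest option
EXPLAIN SELECT count(*) FROM toy WHERE k = 1;
-- expect something along the lines of:
--   Finalize Aggregate -> Gather -> Partial Aggregate
--     -> Parallel Bitmap Heap Scan on toy -> Bitmap Index Scan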

I hacked my copy of PostgreSQL so that it allows only 5 DSM slots and
managed to reproduce a segv crash by trying to run concurrent Parallel
Bitmap Heap Scans. The stack looks like this:

* frame #0: 0x00000001083ace29
postgres`alloc_object(area=0x0000000000000000, size_class=10) + 25 at
dsa.c:1433
frame #1: 0x00000001083acd14
postgres`dsa_allocate_extended(area=0x0000000000000000, size=72,
flags=4) + 1076 at dsa.c:785
frame #2: 0x0000000108059c33
postgres`tbm_prepare_shared_iterate(tbm=0x00007f9743027660) + 67 at
tidbitmap.c:780
frame #3: 0x0000000108000d57
postgres`BitmapHeapNext(node=0x00007f9743019c88) + 503 at
nodeBitmapHeapscan.c:156
frame #4: 0x0000000107fefc5b
postgres`ExecScanFetch(node=0x00007f9743019c88,
accessMtd=(postgres`BitmapHeapNext at nodeBitmapHeapscan.c:77),
recheckMtd=(postgres`BitmapHeapRecheck at nodeBitmapHeapscan.c:710)) +
459 at execScan.c:95
frame #5: 0x0000000107fef983
postgres`ExecScan(node=0x00007f9743019c88,
accessMtd=(postgres`BitmapHeapNext at nodeBitmapHeapscan.c:77),
recheckMtd=(postgres`BitmapHeapRecheck at nodeBitmapHeapscan.c:710)) +
147 at execScan.c:162
frame #6: 0x00000001080008d1
postgres`ExecBitmapHeapScan(pstate=0x00007f9743019c88) + 49 at
nodeBitmapHeapscan.c:735

(lldb) f 3
frame #3: 0x0000000108000d57
postgres`BitmapHeapNext(node=0x00007f9743019c88) + 503 at
nodeBitmapHeapscan.c:156
153 * dsa_pointer of the iterator state which will be used by
154 * multiple processes to iterate jointly.
155 */
-> 156 pstate->tbmiterator = tbm_prepare_shared_iterate(tbm);
157 #ifdef USE_PREFETCH
158 if (node->prefetch_maximum > 0)
159
(lldb) print tbm->dsa
(dsa_area *) $3 = 0x0000000000000000
(lldb) print node->ss.ps.state->es_query_dsa
(dsa_area *) $5 = 0x0000000000000000
(lldb) f 17
frame #17: 0x000000010800363b
postgres`ExecGather(pstate=0x00007f9743019320) + 635 at
nodeGather.c:220
217 * Get next tuple, either from one of our workers, or by running the plan
218 * ourselves.
219 */
-> 220 slot = gather_getnext(node);
221 if (TupIsNull(slot))
222 return NULL;
223
(lldb) print *node->pei
(ParallelExecutorInfo) $8 = {
planstate = 0x00007f9743019640
pcxt = 0x00007f97450001b8
buffer_usage = 0x0000000108b7e218
instrumentation = 0x0000000108b7da38
area = 0x0000000000000000
param_exec = 0
finished = '\0'
tqueue = 0x0000000000000000
reader = 0x0000000000000000
}
(lldb) print *node->pei->pcxt
warning: could not load any Objective-C class information. This will
significantly reduce the quality of type information available.
(ParallelContext) $9 = {
node = {
prev = 0x000000010855fb60
next = 0x000000010855fb60
}
subid = 1
nworkers = 0
nworkers_launched = 0
library_name = 0x00007f9745000248 "postgres"
function_name = 0x00007f9745000268 "ParallelQueryMain"
error_context_stack = 0x0000000000000000
estimator = (space_for_chunks = 180352, number_of_keys = 19)
seg = 0x0000000000000000
private_memory = 0x0000000108b53038
toc = 0x0000000108b53038
worker = 0x0000000000000000
}

I think there are two failure modes: one of your sessions showed the
"too many ..." error (that's good, ran out of slots and said so and
our error machinery worked as it should), and another crashed with a
segfault, because it tried to use a NULL "area" pointer (bad). I
think this is a degenerate case where we completely failed to launch
parallel query, but we ran the parallel query plan anyway and this
code thinks that the DSA is available. Oops.

--
Thomas Munro
http://www.enterprisedb.com

#8 Dilip Kumar
dilipbalaut@gmail.com
In reply to: Thomas Munro (#7)
1 attachment(s)
Re: ERROR: too many dynamic shared memory segments

On Tue, Nov 28, 2017 at 4:18 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:

On Tue, Nov 28, 2017 at 10:05 AM, Jakub Glapa <jakub.glapa@gmail.com> wrote:

As for the crash. I dug up the initial log and it looks like a segmentation fault...

2017-11-23 07:26:53 CET:192.168.10.83(35238):user@db:[30003]: ERROR: too many dynamic shared memory segments

I think there are two failure modes: one of your sessions showed the
"too many ..." error (that's good, ran out of slots and said so and
our error machinery worked as it should), and another crashed with a
segfault, because it tried to use a NULL "area" pointer (bad). I
think this is a degenerate case where we completely failed to launch
parallel query, but we ran the parallel query plan anyway and this
code thinks that the DSA is available. Oops.

I think BitmapHeapScan should check whether the DSA is valid or not; if the
DSA is not valid then it should assume it's a non-parallel plan.

The attached patch should fix the issue.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

bug_fix_in_pbhs_when_dsa_not_initialized.patch (application/octet-stream)
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index b885f2a..b03e44e 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -83,9 +83,13 @@ BitmapHeapNext(BitmapHeapScanState *node)
 	TBMIterateResult *tbmres;
 	OffsetNumber targoffset;
 	TupleTableSlot *slot;
-	ParallelBitmapHeapState *pstate = node->pstate;
+	ParallelBitmapHeapState *pstate = NULL;
 	dsa_area   *dsa = node->ss.ps.state->es_query_dsa;
 
+	/* if DSA is NULL then go for non-parallel mode */
+	if (dsa != NULL)
+		pstate = node->pstate;
+
 	/*
 	 * extract necessary information from index scan node
 	 */
#9 Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#8)
1 attachment(s)
Re: ERROR: too many dynamic shared memory segments

On Tue, Nov 28, 2017 at 2:32 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

I think BitmapHeapScan should check whether the DSA is valid or not; if the
DSA is not valid then it should assume it's a non-parallel plan.

The attached patch should fix the issue.

So, create the pstate and then pretend we didn't? Why not just avoid
creating it in the first place, like this?

I haven't checked whether this fixes the bug, but if it does, we can
avoid introducing an extra branch in BitmapHeapNext.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

no-pstate.patch (application/octet-stream)
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index 221391908c..eb5bbb57ef 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -1051,6 +1051,11 @@ ExecBitmapHeapInitializeDSM(BitmapHeapScanState *node,
 {
 	ParallelBitmapHeapState *pstate;
 	EState	   *estate = node->ss.ps.state;
+	dsa_area   *dsa = node->ss.ps.state->es_query_dsa;
+
+	/* If there's no DSA, there are no workers; initialize nothing. */
+	if (dsa == NULL)
+		return;
 
 	pstate = shm_toc_allocate(pcxt->toc, node->pscan_len);
 
@@ -1083,6 +1088,10 @@ ExecBitmapHeapReInitializeDSM(BitmapHeapScanState *node,
 	ParallelBitmapHeapState *pstate = node->pstate;
 	dsa_area   *dsa = node->ss.ps.state->es_query_dsa;
 
+	/* If there's no DSA, there are no workers; do nothing. */
+	if (dsa == NULL)
+		return;
+
 	pstate->state = BM_INITIAL;
 
 	if (DsaPointerIsValid(pstate->tbmiterator))
@@ -1108,6 +1117,8 @@ ExecBitmapHeapInitializeWorker(BitmapHeapScanState *node,
 	ParallelBitmapHeapState *pstate;
 	Snapshot	snapshot;
 
+	Assert(node->ss.ps.state->es_query_dsa != NULL);
+
 	pstate = shm_toc_lookup(pwcxt->toc, node->ss.ps.plan->plan_node_id, false);
 	node->pstate = pstate;
 
#10 Dilip Kumar
dilipbalaut@gmail.com
In reply to: Robert Haas (#9)
Re: ERROR: too many dynamic shared memory segments

On Tue, Nov 28, 2017 at 7:13 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Nov 28, 2017 at 2:32 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

I think BitmapHeapScan should check whether the DSA is valid or not; if the
DSA is not valid then it should assume it's a non-parallel plan.

The attached patch should fix the issue.

So, create the pstate and then pretend we didn't? Why not just avoid
creating it in the first place, like this?

This is a better way to fix it.

I haven't checked whether this fixes the bug, but if it does, we can
avoid introducing an extra branch in BitmapHeapNext.

In my test it fixes the problem.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#11 Robert Haas
robertmhaas@gmail.com
In reply to: Dilip Kumar (#10)
Re: ERROR: too many dynamic shared memory segments

On Tue, Nov 28, 2017 at 9:45 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

I haven't checked whether this fixes the bug, but if it does, we can
avoid introducing an extra branch in BitmapHeapNext.

In my test it fixes the problem.

I tested it some more and found that, for me, it PARTIALLY fixes the
problem. I tested like this:

--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -279,7 +279,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
      * parallelism than to fail outright.
      */
     segsize = shm_toc_estimate(&pcxt->estimator);
-    if (pcxt->nworkers > 0)
+    if (pcxt->nworkers > 0 && false)
         pcxt->seg = dsm_create(segsize, DSM_CREATE_NULL_IF_MAXSEGMENTS);
     if (pcxt->seg != NULL)
         pcxt->toc = shm_toc_create(PARALLEL_MAGIC,

That turned out to produce more than one problem. I find that the
select_parallel test then fails like this:

ERROR: could not find key 18446744073709486082 in shm TOC at 0x10be98040

The fix for that problem seems to be:

--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -430,7 +430,8 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
     /* Recreate error queues. */
     error_queue_space =
-        shm_toc_lookup(pcxt->toc, PARALLEL_KEY_ERROR_QUEUE, false);
+        shm_toc_lookup(pcxt->toc, PARALLEL_KEY_ERROR_QUEUE, true);
+    Assert(pcxt->nworkers == 0 || error_queue_space != NULL);
     for (i = 0; i < pcxt->nworkers; ++i)
     {
         char       *start;

With that fix in place, I then hit a crash in parallel bitmap heap
scan. After applying no-pstate.patch, which I just committed and
back-patched to v10, things look OK. I'm going to apply the fix
for the error_queue_space problem also.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#12 Jakub Glapa
jakub.glapa@gmail.com
In reply to: Thomas Munro (#7)
Re: ERROR: too many dynamic shared memory segments

I see that the segfault is under active discussion but just wanted to ask
if increasing max_connections to mitigate the DSM slot shortage is the
way to go?

--
regards,
Jakub Glapa


#13 Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Jakub Glapa (#12)
Re: ERROR: too many dynamic shared memory segments

On Tue, Dec 5, 2017 at 1:18 AM, Jakub Glapa <jakub.glapa@gmail.com> wrote:

I see that the segfault is under active discussion but just wanted to ask if
increasing max_connections to mitigate the DSM slot shortage is the way
to go?

Hi Jakub,

Yes. In future releases this situation will improve (maybe we'll
figure out how to use one DSM segment for all the gather nodes in your
query plan, and maybe it'll be moot anyway because maybe we'll be able
to use a Parallel Append for queries like yours so that it uses the
same set of workers over all the child plans instead of the
fork()-fest you're presumably seeing). For now your only choice, if
you want that plan to run, is to crank up max_connections so that the
total number of concurrently executing Gather nodes is less than about
64 + 2 * max_connections. There is also a crash bug right now in the
out-of-slots case as discussed, fixed in the next point release, but
even with that fix in place you'll still need a high enough
max_connections setting to be sure to be able to complete the query
without an error.
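
As a back-of-envelope illustration of that sizing rule (the numbers are
assumptions taken from earlier in this thread, not measurements):

-- ~100 Gather nodes per statement (one per UNION ALL branch) times ~4
-- concurrent statements needs on the order of 400+ DSM segments, before
-- counting the extra segments DSA creates for each Parallel Bitmap Heap Scan,
-- so 64 + 2 * max_connections should sit comfortably above that:
ALTER SYSTEM SET max_connections = 300;  -- takes effect after a server restart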

Thanks for the report!

--
Thomas Munro
http://www.enterprisedb.com

#14 Jakub Glapa
jakub.glapa@gmail.com
In reply to: Thomas Munro (#13)
Re: ERROR: too many dynamic shared memory segments

Thank You Thomas!

--
regards,
Jakub Glapa


#15 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#11)
Re: ERROR: too many dynamic shared memory segments

Robert Haas <robertmhaas@gmail.com> writes:

That turned out to produce more than one problem. I find that the
select_parallel test then fails like this:
ERROR: could not find key 18446744073709486082 in shm TOC at 0x10be98040
The fix for that problem seems to be:

/* Recreate error queues. */
error_queue_space =
-        shm_toc_lookup(pcxt->toc, PARALLEL_KEY_ERROR_QUEUE, false);
+        shm_toc_lookup(pcxt->toc, PARALLEL_KEY_ERROR_QUEUE, true);
+    Assert(pcxt->nworkers == 0 || error_queue_space != NULL);

I happened across this patch while preparing the release notes, and
I'm quite distressed by it, because it completely breaks the point
of what I'd done in commit d46633506 (to wit, to not just blindly
assume, nor just Assert, that shm_toc_lookup always succeeds).
Do you mind if I change that Assert to a run-time test?

regards, tom lane

#16 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#15)
Re: ERROR: too many dynamic shared memory segments

On Fri, Feb 2, 2018 at 3:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

That turned out to produce more than one problem. I find that the
select_parallel test then fails like this:
ERROR: could not find key 18446744073709486082 in shm TOC at 0x10be98040
The fix for that problem seems to be:

/* Recreate error queues. */
error_queue_space =
-        shm_toc_lookup(pcxt->toc, PARALLEL_KEY_ERROR_QUEUE, false);
+        shm_toc_lookup(pcxt->toc, PARALLEL_KEY_ERROR_QUEUE, true);
+    Assert(pcxt->nworkers == 0 || error_queue_space != NULL);

I happened across this patch while preparing the release notes, and
I'm quite distressed by it, because it completely breaks the point
of what I'd done in commit d46633506 (to wit, to not just blindly
assume, nor just Assert, that shm_toc_lookup always succeeds).
Do you mind if I change that Assert to a run-time test?

Hrm, I guess I could have done something like
shm_toc_lookup(pcxt->toc, PARALLEL_KEY_ERROR_QUEUE, (pcxt->nworkers ==
0)).

I don't mind much if you change it, but I will note that for the
record, before d46633506, we had a theoretical source of bugs, whereas
after that commit, we had a bug. 445dbd82a fixed that; if you change
this around again, please take care not to make it buggy again.
Otherwise, I'll be the one who is quite distressed. :-)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#17 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#16)
Re: ERROR: too many dynamic shared memory segments

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Feb 2, 2018 at 3:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Do you mind if I change that Assert to a run-time test?

Hrm, I guess I could have done something like
shm_toc_lookup(pcxt->toc, PARALLEL_KEY_ERROR_QUEUE, (pcxt->nworkers ==
0)).

OK, that'd work too. I'll make it so.

regards, tom lane