BUG #14159: PostgreSQL 9.6 parallel scan consume very high mutex lock
The following bug has been logged on the website:
Bug reference: 14159
Logged by: Zhou Digoal
Email address: digoal@126.com
PostgreSQL version: 9.6beta1
Operating system: CentOS 6.x x64
Description:
My test environment:
CentOS 6
64-core CPU
This test does not use a parallel scan:
postgres=# alter table t_bit2 set (parallel_degree=0);
ALTER TABLE
Time: 0.335 ms
postgres=# select count(*) from t_bit2 ;
count
------------
1600000000
(1 row)
Time: 141377.100 ms
and the perf top output is:
```
PerfTop:    1075 irqs/sec  kernel:21.5%  exact: 0.0% [1000Hz cycles], (all, 64 CPUs)
-------------------------------------------------------------------------------------

  samples  pcnt  function                         DSO
  _______  _____ _______________________________  __________________________________

  2493.00  15.2% heap_getnext                     /home/digoal/pgsql9.6/bin/postgres
  1988.00  12.1% copy_user_generic_string         [kernel.kallsyms]
  1265.00   7.7% advance_aggregates               /home/digoal/pgsql9.6/bin/postgres
   931.00   5.7% ExecProject                      /home/digoal/pgsql9.6/bin/postgres
   891.00   5.4% heapgetpage                      /home/digoal/pgsql9.6/bin/postgres
   864.00   5.3% HeapTupleSatisfiesMVCC           /home/digoal/pgsql9.6/bin/postgres
   744.00   4.5% ExecProcNode                     /home/digoal/pgsql9.6/bin/postgres
   717.00   4.4% advance_transition_function      /home/digoal/pgsql9.6/bin/postgres
   697.00   4.2% ExecScan                         /home/digoal/pgsql9.6/bin/postgres
   581.00   3.5% SeqNext                          /home/digoal/pgsql9.6/bin/postgres
   576.00   3.5% ExecStoreTuple                   /home/digoal/pgsql9.6/bin/postgres
   446.00   2.7% ExecClearTuple                   /home/digoal/pgsql9.6/bin/postgres
   378.00   2.3% MemoryContextReset               /home/digoal/pgsql9.6/bin/postgres
   364.00   2.2% hash_search_with_hash_value      /home/digoal/pgsql9.6/bin/postgres
   364.00   2.2% ExecAgg                          /home/digoal/pgsql9.6/bin/postgres
   307.00   1.9% CheckForSerializableConflictOut  /home/digoal/pgsql9.6/bin/postgres
   296.00   1.8% fetch_input_tuple                /home/digoal/pgsql9.6/bin/postgres
   217.00   1.3% int8inc                          /home/digoal/pgsql9.6/bin/postgres
   217.00   1.3% XidInMVCCSnapshot                /home/digoal/pgsql9.6/bin/postgres
   169.00   1.0% _mdfd_getseg                     /home/digoal/pgsql9.6/bin/postgres
```
And when using 64 parallel workers:
postgres=# alter table t_bit2 set (parallel_degree=64);
ALTER TABLE
Time: 0.180 ms
postgres=# explain select count(*) from t_bit2 ;
                                           QUERY PLAN
 -----------------------------------------------------------------------------------------------
  Finalize Aggregate  (cost=12108923.92..12108923.93 rows=1 width=8)
    ->  Gather  (cost=12108917.35..12108923.76 rows=64 width=8)
          Workers Planned: 64
          ->  Partial Aggregate  (cost=12107917.35..12107917.36 rows=1 width=8)
                ->  Parallel Seq Scan on t_bit2  (cost=0.00..12046131.88 rows=24714188 width=0)
 (5 rows)
Time: 0.338 ms
postgres=# select count(*) from t_bit2 ;
count
------------
1600000000
(1 row)
Time: 37181.297 ms
the perf top output is:
```
PerfTop:   47649 irqs/sec  kernel:83.5%  exact: 0.0% [1000Hz cycles], (all, 64 CPUs)
-------------------------------------------------------------------------------------

   samples  pcnt  function                     DSO
 _________  _____ ___________________________  __________________________________

 436271.00  76.5% __mutex_lock_slowpath        [kernel.kallsyms]
  13897.00   2.4% _spin_lock                   [kernel.kallsyms]
  11159.00   2.0% heap_getnext                 /home/digoal/pgsql9.6/bin/postgres
   6929.00   1.2% copy_user_generic_string     [kernel.kallsyms]
   6497.00   1.1% advance_aggregates           /home/digoal/pgsql9.6/bin/postgres
   5786.00   1.0% mutex_lock                   [kernel.kallsyms]
   5133.00   0.9% mutex_unlock                 [kernel.kallsyms]
   4862.00   0.9% ExecProject                  /home/digoal/pgsql9.6/bin/postgres
   4351.00   0.8% advance_transition_function  /home/digoal/pgsql9.6/bin/postgres
   4130.00   0.7% LWLockAcquire                /home/digoal/pgsql9.6/bin/postgres
   4119.00   0.7% heapgetpage                  /home/digoal/pgsql9.6/bin/postgres
   3751.00   0.7% ExecScan                     /home/digoal/pgsql9.6/bin/postgres
   3275.00   0.6% ExecProcNode                 /home/digoal/pgsql9.6/bin/postgres
   3159.00   0.6% HeapTupleSatisfiesMVCC       /home/digoal/pgsql9.6/bin/postgres
   3123.00   0.5% mutex_spin_on_owner          [kernel.kallsyms]
```
When I set parallel_degree to 16:
postgres=# alter table t_bit2 set (parallel_degree=16);
ALTER TABLE
Time: 0.433 ms
postgres=# select count(*) from t_bit2 ;
count
------------
1600000000
(1 row)
Time: 9534.304 ms
the perf top is
```
PerfTop:   16436 irqs/sec  kernel:21.9%  exact: 0.0% [1000Hz cycles], (all, 64 CPUs)
-------------------------------------------------------------------------------------

  samples  pcnt  function                     DSO
 ________  _____ ___________________________  ____________________________________

 10020.00  11.0% heap_getnext                 /home/dege.zzz/pgsql9.6/bin/postgres
  9804.00  10.8% copy_user_generic_string     [kernel.kallsyms]
  6239.00   6.9% advance_aggregates           /home/dege.zzz/pgsql9.6/bin/postgres
  4666.00   5.1% ExecProject                  /home/dege.zzz/pgsql9.6/bin/postgres
  4496.00   5.0% heapgetpage                  /home/dege.zzz/pgsql9.6/bin/postgres
  4419.00   4.9% HeapTupleSatisfiesMVCC       /home/dege.zzz/pgsql9.6/bin/postgres
  3891.00   4.3% ExecClearTuple               /home/dege.zzz/pgsql9.6/bin/postgres
  3703.00   4.1% ExecProcNode                 /home/dege.zzz/pgsql9.6/bin/postgres
  3620.00   4.0% advance_transition_function  /home/dege.zzz/pgsql9.6/bin/postgres
  3385.00   3.7% ExecScan                     /home/dege.zzz/pgsql9.6/bin/postgres
  2855.00   3.1% ExecStoreTuple               /home/dege.zzz/pgsql9.6/bin/postgres
  2645.00   2.9% SeqNext                      /home/dege.zzz/pgsql9.6/bin/postgres
```
why?
best regards,
digoal
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Wed, May 25, 2016 at 5:21 PM, <digoal@126.com> wrote:
Can we get the detailed call stack so that we can see where the mutex call is initiated? One guess is that it could be contention on the parallel_scan->phs_mutex, which is acquired to move to the next heap page, but it is difficult to say from the information provided. In general, I think using more workers than required can slow the system down, and this seems to be one example of that.

By the way, what value does the system select for parallel workers if you don't set the parallel_degree parameter for the relation? You can check that by leaving parallel_degree at its default value (-1) for the relation and setting max_parallel_degree = 64 and max_worker_processes = 64.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Thu, May 26, 2016 at 11:03 AM, 德哥 <digoal@126.com> wrote:
This is a worker process's stack when I test with a high parallel degree:
[<ffffffffa00b8ff0>] ext4_llseek+0x60/0x110 [ext4]
[<ffffffff81186eda>] vfs_llseek+0x3a/0x40
[<ffffffff81188b96>] sys_lseek+0x66/0x80
[<ffffffff8100c072>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
The above call stack indicates that the file-seek cost increases when a high number of workers is employed. I think the reason is that using many workers to scan the same file in parallel does not interact well with the OS read-ahead mechanism. The best gain is possible only with a parallel degree appropriate for the workload, so unless you are sure that a higher number of workers is required for your workload, use the system's default behaviour: set max_parallel_degree and let the system decide how many workers are required. Right now we don't have a very sophisticated way of determining how many workers are appropriate for a given workload, but I think it will work better than directly setting a relation's parallel degree to a very high number.
Note: always keep the appropriate PostgreSQL mailing list in Cc, which in this case is pgsql-bugs@postgresql.org.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 2016-05-27 05:43:00 +0530, Amit Kapila wrote:
On Thu, May 26, 2016 at 11:03 AM, 德哥 <digoal@126.com> wrote:
This is worker process's stack, when i test the hign parallel degree.
[<ffffffffa00b8ff0>] ext4_llseek+0x60/0x110 [ext4]
[<ffffffff81186eda>] vfs_llseek+0x3a/0x40
[<ffffffff81188b96>] sys_lseek+0x66/0x80
[<ffffffff8100c072>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

The above call stack indicates that the file-seek cost increases when a high number of workers is employed. I think the reason is that using many workers to scan the same file in parallel does not interact well with the OS read-ahead mechanism.
I don't think that's the correct conclusion. The report and profiles rather seem to suggest that we're hitting lock contention rather than IO-related cost.
The kernel used here is quite old (heavily patched 2.6.32 IIRC). The
kernel guys have since made lseek not take locks in the common case. I
suspect that upgrading to a newer kernel will change the profile
significantly.
Greetings,
Andres Freund
On Fri, May 27, 2016 at 5:51 AM, Andres Freund <andres@anarazel.de> wrote:
On 2016-05-27 05:43:00 +0530, Amit Kapila wrote:
On Thu, May 26, 2016 at 11:03 AM, 德哥 <digoal@126.com> wrote:
This is a worker process's stack when I test with a high parallel degree:
[<ffffffffa00b8ff0>] ext4_llseek+0x60/0x110 [ext4]
[<ffffffff81186eda>] vfs_llseek+0x3a/0x40
[<ffffffff81188b96>] sys_lseek+0x66/0x80
[<ffffffff8100c072>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

The above call stack indicates that the file-seek cost increases when a high number of workers is employed. I think the reason is that using many workers to scan the same file in parallel does not interact well with the OS read-ahead mechanism.

I don't think that's the correct conclusion. The report and profiles rather seem to suggest that we're hitting lock contention rather than IO-related cost. The kernel used here is quite old (heavily patched 2.6.32 IIRC). The kernel guys have since made lseek not take locks in the common case; I suspect that upgrading to a newer kernel will change the profile significantly.
Worth trying; however, I don't think we can discount the fact that using such a large number of workers can saturate the I/O channel.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 2016-05-27 11:27:59 +0530, Amit Kapila wrote:
Worth trying, however I don't think we can discount the fact that using such a large number of workers can saturate the I/O channel.
Sure, that can be an issue. But if it were the bottleneck we wouldn't
see contention in a kernel spinlock, below lseek() (which in a postgres
workload pretty much never has to do IO).
Greetings,
Andres Freund