BUG #14159: PostgreSQL 9.6 parallel scan consume very high mutex lock
The following bug has been logged on the website:
Bug reference: 14159
Logged by: Zhou Digoal
Email address: digoal@126.com
PostgreSQL version: 9.6beta1
Operating system: CentOS 6.x x64
Description:
My test environment:
CentOS 6
64-core CPU
This test does not use a parallel scan:
postgres=# alter table t_bit2 set (parallel_degree=0);
ALTER TABLE
Time: 0.335 ms
postgres=# select count(*) from t_bit2 ;
count
------------
1600000000
(1 row)
Time: 141377.100 ms
and the perf top output is:
```
PerfTop:    1075 irqs/sec  kernel:21.5%  exact: 0.0% [1000Hz cycles], (all, 64 CPUs)
-------------------------------------------------------------------------------------

  samples  pcnt  function                         DSO
  _______  _____ _______________________________  __________________________________

  2493.00  15.2% heap_getnext                     /home/digoal/pgsql9.6/bin/postgres
  1988.00  12.1% copy_user_generic_string         [kernel.kallsyms]
  1265.00   7.7% advance_aggregates               /home/digoal/pgsql9.6/bin/postgres
   931.00   5.7% ExecProject                      /home/digoal/pgsql9.6/bin/postgres
   891.00   5.4% heapgetpage                      /home/digoal/pgsql9.6/bin/postgres
   864.00   5.3% HeapTupleSatisfiesMVCC           /home/digoal/pgsql9.6/bin/postgres
   744.00   4.5% ExecProcNode                     /home/digoal/pgsql9.6/bin/postgres
   717.00   4.4% advance_transition_function      /home/digoal/pgsql9.6/bin/postgres
   697.00   4.2% ExecScan                         /home/digoal/pgsql9.6/bin/postgres
   581.00   3.5% SeqNext                          /home/digoal/pgsql9.6/bin/postgres
   576.00   3.5% ExecStoreTuple                   /home/digoal/pgsql9.6/bin/postgres
   446.00   2.7% ExecClearTuple                   /home/digoal/pgsql9.6/bin/postgres
   378.00   2.3% MemoryContextReset               /home/digoal/pgsql9.6/bin/postgres
   364.00   2.2% hash_search_with_hash_value      /home/digoal/pgsql9.6/bin/postgres
   364.00   2.2% ExecAgg                          /home/digoal/pgsql9.6/bin/postgres
   307.00   1.9% CheckForSerializableConflictOut  /home/digoal/pgsql9.6/bin/postgres
   296.00   1.8% fetch_input_tuple                /home/digoal/pgsql9.6/bin/postgres
   217.00   1.3% int8inc                          /home/digoal/pgsql9.6/bin/postgres
   217.00   1.3% XidInMVCCSnapshot                /home/digoal/pgsql9.6/bin/postgres
   169.00   1.0% _mdfd_getseg                     /home/digoal/pgsql9.6/bin/postgres
```
And when using 64 parallel workers:
postgres=# alter table t_bit2 set (parallel_degree=64);
ALTER TABLE
Time: 0.180 ms
postgres=# explain select count(*) from t_bit2 ;
                                           QUERY PLAN
 -----------------------------------------------------------------------------------------------
  Finalize Aggregate  (cost=12108923.92..12108923.93 rows=1 width=8)
    ->  Gather  (cost=12108917.35..12108923.76 rows=64 width=8)
          Workers Planned: 64
          ->  Partial Aggregate  (cost=12107917.35..12107917.36 rows=1 width=8)
                ->  Parallel Seq Scan on t_bit2  (cost=0.00..12046131.88 rows=24714188 width=0)
 (5 rows)
Time: 0.338 ms
postgres=# select count(*) from t_bit2 ;
count
------------
1600000000
(1 row)
Time: 37181.297 ms
the perf top output is:
```
PerfTop:   47649 irqs/sec  kernel:83.5%  exact: 0.0% [1000Hz cycles], (all, 64 CPUs)
-------------------------------------------------------------------------------------

   samples  pcnt  function                     DSO
 _________  _____ ___________________________  __________________________________

 436271.00  76.5% __mutex_lock_slowpath        [kernel.kallsyms]
  13897.00   2.4% _spin_lock                   [kernel.kallsyms]
  11159.00   2.0% heap_getnext                 /home/digoal/pgsql9.6/bin/postgres
   6929.00   1.2% copy_user_generic_string     [kernel.kallsyms]
   6497.00   1.1% advance_aggregates           /home/digoal/pgsql9.6/bin/postgres
   5786.00   1.0% mutex_lock                   [kernel.kallsyms]
   5133.00   0.9% mutex_unlock                 [kernel.kallsyms]
   4862.00   0.9% ExecProject                  /home/digoal/pgsql9.6/bin/postgres
   4351.00   0.8% advance_transition_function  /home/digoal/pgsql9.6/bin/postgres
   4130.00   0.7% LWLockAcquire                /home/digoal/pgsql9.6/bin/postgres
   4119.00   0.7% heapgetpage                  /home/digoal/pgsql9.6/bin/postgres
   3751.00   0.7% ExecScan                     /home/digoal/pgsql9.6/bin/postgres
   3275.00   0.6% ExecProcNode                 /home/digoal/pgsql9.6/bin/postgres
   3159.00   0.6% HeapTupleSatisfiesMVCC       /home/digoal/pgsql9.6/bin/postgres
   3123.00   0.5% mutex_spin_on_owner          [kernel.kallsyms]
```
When I set parallel_degree to 16:
postgres=# alter table t_bit2 set (parallel_degree=16);
ALTER TABLE
Time: 0.433 ms
postgres=# select count(*) from t_bit2 ;
count
------------
1600000000
(1 row)
Time: 9534.304 ms
the perf top is
```
PerfTop:   16436 irqs/sec  kernel:21.9%  exact: 0.0% [1000Hz cycles], (all, 64 CPUs)
-------------------------------------------------------------------------------------

  samples  pcnt  function                     DSO
 ________  _____ ___________________________  ____________________________________

 10020.00  11.0% heap_getnext                 /home/dege.zzz/pgsql9.6/bin/postgres
  9804.00  10.8% copy_user_generic_string     [kernel.kallsyms]
  6239.00   6.9% advance_aggregates           /home/dege.zzz/pgsql9.6/bin/postgres
  4666.00   5.1% ExecProject                  /home/dege.zzz/pgsql9.6/bin/postgres
  4496.00   5.0% heapgetpage                  /home/dege.zzz/pgsql9.6/bin/postgres
  4419.00   4.9% HeapTupleSatisfiesMVCC       /home/dege.zzz/pgsql9.6/bin/postgres
  3891.00   4.3% ExecClearTuple               /home/dege.zzz/pgsql9.6/bin/postgres
  3703.00   4.1% ExecProcNode                 /home/dege.zzz/pgsql9.6/bin/postgres
  3620.00   4.0% advance_transition_function  /home/dege.zzz/pgsql9.6/bin/postgres
  3385.00   3.7% ExecScan                     /home/dege.zzz/pgsql9.6/bin/postgres
  2855.00   3.1% ExecStoreTuple               /home/dege.zzz/pgsql9.6/bin/postgres
  2645.00   2.9% SeqNext                      /home/dege.zzz/pgsql9.6/bin/postgres
```
why?
best regards,
digoal
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Wed, May 25, 2016 at 5:21 PM, <digoal@126.com> wrote:
Can we get the detailed call stack so that we can see where the mutex call is initiated? One guess is that it could be contention on the parallel_scan->phs_mutex, which is acquired to move to the next heap page, but it is difficult to say from the information provided. In general, I think using more workers than required can slow the system down, and this seems to be one example of that.

By the way, what value does the system select for parallel workers if you don't set the parallel_degree parameter for the relation? You can check that by leaving parallel_degree at its default value (-1) for the relation and setting max_parallel_degree = 64 and max_worker_processes = 64.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Thu, May 26, 2016 at 11:03 AM, 德哥 <digoal@126.com> wrote:
This is a worker process's stack when I test with a high parallel degree:
[<ffffffffa00b8ff0>] ext4_llseek+0x60/0x110 [ext4]
[<ffffffff81186eda>] vfs_llseek+0x3a/0x40
[<ffffffff81188b96>] sys_lseek+0x66/0x80
[<ffffffff8100c072>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
The above call stack indicates that the file-seek cost increases when a high number of workers is employed. I think the reason is that using many workers to scan the same file in parallel does not interact well with the OS read-ahead mechanism. The best gain is possible only with a parallel degree appropriate for the workload, so unless you are sure that a higher number of workers is required for your workload, use the system's default behaviour: set max_parallel_degree and let the system decide how many workers are required. Right now we don't have a very sophisticated way of determining how many workers are appropriate for a given workload, but I think it will work better than directly setting a relation's parallel degree to a very high number.
Note: always keep the appropriate PostgreSQL mailing list in Cc, which in this case is pgsql-bugs@postgresql.org.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 2016-05-27 05:43:00 +0530, Amit Kapila wrote:
On Thu, May 26, 2016 at 11:03 AM, 德哥 <digoal@126.com> wrote:
This is worker process's stack, when i test the hign parallel degree.
[<ffffffffa00b8ff0>] ext4_llseek+0x60/0x110 [ext4]
[<ffffffff81186eda>] vfs_llseek+0x3a/0x40
[<ffffffff81188b96>] sys_lseek+0x66/0x80
[<ffffffff8100c072>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

The above call stack indicates that the file-seek cost increases when a high number of workers is employed. I think the reason is that using many workers to scan the same file in parallel does not interact well with the OS read-ahead mechanism.
I don't think that's the correct conclusion. The report and profiles rather seem to suggest that we're hitting lock contention rather than IO-related cost.
The kernel used here is quite old (heavily patched 2.6.32 IIRC). The
kernel guys have since made lseek not take locks in the common case. I
suspect that upgrading to a newer kernel will change the profile
significantly.
Greetings,
Andres Freund
On Fri, May 27, 2016 at 5:51 AM, Andres Freund <andres@anarazel.de> wrote:
On 2016-05-27 05:43:00 +0530, Amit Kapila wrote:
On Thu, May 26, 2016 at 11:03 AM, 德哥 <digoal@126.com> wrote:
This is a worker process's stack when I test with a high parallel degree:
[<ffffffffa00b8ff0>] ext4_llseek+0x60/0x110 [ext4]
[<ffffffff81186eda>] vfs_llseek+0x3a/0x40
[<ffffffff81188b96>] sys_lseek+0x66/0x80
[<ffffffff8100c072>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

The above call stack indicates that the file-seek cost increases when a high number of workers is employed. I think the reason is that using many workers to scan the same file in parallel does not interact well with the OS read-ahead mechanism.

I don't think that's the correct conclusion. The report and profiles rather seem to suggest that we're hitting lock contention rather than IO-related cost. The kernel used here is quite old (heavily patched 2.6.32 IIRC). The kernel guys have since made lseek not take locks in the common case; I suspect that upgrading to a newer kernel will change the profile significantly.
Worth trying; however, I don't think we can discount the fact that using such a large number of workers can saturate the I/O channel.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 2016-05-27 11:27:59 +0530, Amit Kapila wrote:
Worth trying, however I don't think we can discount the fact that using such a large number of workers can saturate the I/O channel.
Sure, that can be an issue. But if it were the bottleneck we wouldn't
see contention in a kernel spinlock, below lseek() (which in a postgres
workload pretty much never has to do IO).
Greetings,
Andres Freund