BUG #14378: when drop schema, standby will do many more than primary, and very slow, it's a bug?

Started by 德哥over 9 years ago5 messagesbugs
Jump to latest
#1德哥
digoal@126.com

The following bug has been logged on the website:

Bug reference: 14378
Logged by: Zhou Digoal
Email address: digoal@126.com
PostgreSQL version: 9.6.0
Operating system: CentOS 6.x x64
Description:

when i use postgresql 9.6.0 for primary-standby, i drop a schema on primary,
but it's apply very slow on standby.

the wal send very fast, but apply very slow, i think the syscall more than
it on primary?

this is my test:

on primary
```

create table test(
c0 serial unique check(c0>0),
c1 serial unique check(c1>0),
c2 serial unique check(c2>0),
c3 serial unique check(c3>0),
c4 serial unique check(c4>0),
c5 serial unique check(c5>0),
c6 serial unique check(c6>0),
c7 serial unique check(c7>0),
c8 serial unique check(c8>0),
c9 serial unique check(c9>0),
c10 serial unique check(c10>0),
c11 serial unique check(c11>0),
c12 serial unique check(c12>0),
c13 serial unique check(c13>0),
c14 serial unique check(c14>0),
c15 serial unique check(c15>0),
c16 serial unique check(c16>0),
c17 serial unique check(c17>0),
c18 serial unique check(c18>0),
c19 serial unique check(c19>0),
c20 serial unique check(c20>0),
c21 serial unique check(c21>0),
c22 serial unique check(c22>0),
c23 serial unique check(c23>0),
c24 serial unique check(c24>0),
c25 serial unique check(c25>0),
c26 serial unique check(c26>0),
c27 serial unique check(c27>0),
c28 serial unique check(c28>0),
c29 serial unique check(c29>0),
c30 serial unique check(c30>0),
c31 serial unique check(c31>0),
c32 serial unique check(c32>0),
c33 serial unique check(c33>0),
c34 serial unique check(c34>0),
c35 serial unique check(c35>0),
c36 serial unique check(c36>0),
c37 serial unique check(c37>0),
c38 serial unique check(c38>0),
c39 serial unique check(c39>0),
c40 serial unique check(c40>0),
c41 serial unique check(c41>0),
c42 serial unique check(c42>0),
c43 serial unique check(c43>0),
c44 serial unique check(c44>0),
c45 serial unique check(c45>0),
c46 serial unique check(c46>0),
c47 serial unique check(c47>0),
c48 serial unique check(c48>0),
c49 serial unique check(c49>0),
c50 serial unique check(c50>0),
c51 serial unique check(c51>0),
c52 serial unique check(c52>0),
c53 serial unique check(c53>0),
c54 serial unique check(c54>0),
c55 serial unique check(c55>0),
c56 serial unique check(c56>0),
c57 serial unique check(c57>0),
c58 serial unique check(c58>0),
c59 serial unique check(c59>0),
c60 serial unique check(c60>0),
c61 serial unique check(c61>0),
c62 serial unique check(c62>0),
c63 serial unique check(c63>0),
c64 serial unique check(c64>0),
c65 serial unique check(c65>0),
c66 serial unique check(c66>0),
c67 serial unique check(c67>0),
c68 serial unique check(c68>0),
c69 serial unique check(c69>0),
c70 serial unique check(c70>0),
c71 serial unique check(c71>0),
c72 serial unique check(c72>0),
c73 serial unique check(c73>0),
c74 serial unique check(c74>0),
c75 serial unique check(c75>0),
c76 serial unique check(c76>0),
c77 serial unique check(c77>0),
c78 serial unique check(c78>0),
c79 serial unique check(c79>0),
c80 serial unique check(c80>0),
c81 serial unique check(c81>0),
c82 serial unique check(c82>0),
c83 serial unique check(c83>0),
c84 serial unique check(c84>0),
c85 serial unique check(c85>0),
c86 serial unique check(c86>0),
c87 serial unique check(c87>0),
c88 serial unique check(c88>0),
c89 serial unique check(c89>0),
c90 serial unique check(c90>0),
c91 serial unique check(c91>0),
c92 serial unique check(c92>0),
c93 serial unique check(c93>0),
c94 serial unique check(c94>0),
c95 serial unique check(c95>0),
c96 serial unique check(c96>0),
c97 serial unique check(c97>0),
c98 serial unique check(c98>0),
c99 serial unique check(c99>0)
);

do language plpgsql $$
declare
i int ;
begin
for i in 1..500 loop
execute 'create table test'||i||' (like test including all)';
end loop;
end;
$$;

postgres=# \l+
List of
databases
Name | Owner | Encoding | Collate | Ctype | Access privileges |
Size | Tablespace | Description
-----------+----------+----------+---------+-------+-----------------------+---------+------------+--------------------------------------------
postgres | postgres | UTF8 | C | C | |
578 MB | pg_default | default administrative connection database
template0 | postgres | UTF8 | C | C | =c/postgres +|
7225 kB | pg_default | unmodifiable empty database
| | | | | postgres=CTc/postgres |
| |
template1 | postgres | UTF8 | C | C | =c/postgres +|
7225 kB | pg_default | default template for new databases
| | | | | postgres=CTc/postgres |
| |
(3 rows)

\c template1
template1=# create database db0 with template postgres;

template1=# select pg_current_xlog_location();
pg_current_xlog_location
--------------------------
0/15EF55E0
(1 row)

\c db0

drop schema public cascade;

db0=# select pg_current_xlog_location();
-[ RECORD 1 ]------------+-----------
pg_current_xlog_location | 1/168E6EA8

```

on primary , monitory the send, write, flush, apply xlog delay
```

\x

select
pg_size_pretty(pg_xlog_location_diff(pg_current_xlog_location(),sent_location)),
pg_size_pretty(pg_xlog_location_diff(pg_current_xlog_location(),write_location)),
pg_size_pretty(pg_xlog_location_diff(pg_current_xlog_location(),flush_location)),
pg_size_pretty(pg_xlog_location_diff(pg_current_xlog_location(),replay_location)),

* from pg_stat_replication ;

\watch 1

-[ RECORD 1 ]----+------------------------------
pg_size_pretty | 0 bytes
pg_size_pretty | 0 bytes
pg_size_pretty | 0 bytes
pg_size_pretty | 56 MB
pid | 88042
usesysid | 10
usename | postgres
application_name | walreceiver
client_addr | 127.0.0.1
client_hostname |
client_port | 19224
backend_start | 2016-10-18 15:35:49.460366+08
backend_xmin | 1699
state | streaming
sent_location | 0/24A97948
write_location | 0/24A97948
flush_location | 0/24A97948
replay_location | 0/212B1790
sync_priority | 0
sync_state | async

```

on standby
```
recovery process:

88040 87828 27 15:35 ? 00:02:56 postgres: startup process
recovering 000000010000000000000022

what to do?

strace -p 88040

unlink("base/218492/68038") = 0
unlink("base/218492/68038.1") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68038_fsm") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68038_vm") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68038_init") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68036") = 0
unlink("base/218492/68036.1") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68036_fsm") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68036_vm") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68036_init") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68034") = 0
unlink("base/218492/68034.1") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68034_fsm") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68034_vm") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68034_init") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68032") = 0
unlink("base/218492/68032.1") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68032_fsm") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68032_vm") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68032_init") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68030") = 0
unlink("base/218492/68030.1") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68030_fsm") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68030_vm") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68030_init") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68028") = 0
unlink("base/218492/68028.1") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68028_fsm") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68028_vm") = -1 ENOENT (No such file or
directory)
unlink("base/218492/68028_init") = -1 ENOENT (No such file or
directory)

```
so this xlog record apply very slow, but why it's quick on primary?

the xlog record analyze :
```

pg_xlogdump -b 000000010000000000000021 000000010000000000000022 2>&1
|less
search 0/212B1790
this redo size 17680840 bytes, very large

rmgr: Transaction len (rec/tot): 17680840/17680869, tx: 1698, lsn:
0/212B1790, prev 0/212B0AE0, desc: COMMIT 2016-10-18 15:44:06.563838 CST;
rels:
base/218492/111694 base/218492/112095 base/218492/112093 base/218492/112091
base/218492/112089 base/218492/112087 base/218492/112085 base/218492/112083

base/218492/112081 base/218492/112079 base/218492/112077 base/218492/112075
base/218492/112073 base/218492/112071 base/218492/112069 base/218492/112067

base/218492/112065 base/218492/112063 base/218492/112061 base/218492/112059
base/218492/112057 base/218492/112055 base/218492/112053 base/218492/112051

base/218492/112049 base/218492/112047 base/218492/112045 base/218492/112043
base/218492/112041 base/218492/112039 base/218492/112037 base/218492/112035

base/218492/112033 base/218492/112031 base/218492/112029
............................
relcache 218305 relcache 218086 snapshot 2608 relcache 218086 snapshot 2608
relcache 218086 snapshot 2608 snapshot 2608 relcache 218303 relcache 218086

snapshot 2608 relcache 218086 snapshot 2608 relcache 218086 snapshot 2608
snapshot 26
08 relcache 218301 relcache 218086 snapshot 2608 relcache 218086 snapshot
2608 relcache 218086 snapshot 2608 snapshot 2608 relcache 218299 relcache
218086
snapshot 2608 relcache 218086 snapshot 2608 relcache 218086 snapshot 2608
snapshot
2608 relcache 218297 relcache 218086 snapshot 2608 relcache 218086 snapshot
2608 relcache 218086 snapshot 2608 snapshot 2608 relcache 218295 relcache
218086
snapshot 2608 relcache 218086 snapshot 2608 relcache 218086 snapshot 2608
snaps
hot 2608 relcache 218293 relcache 218086 snapshot 2608 relcache 218086
snapshot 2608 relcache 218086 snapshot 2608 snapshot 2608 relcache 218291
relcache 218086
snapshot 2608 relcache 218086 snapshot 2608 relcache 218086 snapshot 2608
sn
apshot 2608 relcache 218289 relcache 218086 snapshot 2608 snapshot 2608
snapshot 2608
..............
rmgr: Heap2 len (rec/tot): 8/ 62, tx: 0, lsn:
0/2239AC80, prev 0/212B1790, desc: CLEAN remxid 1698
blkref #0: rel 1663/218492/1259 fork main blk 3

```

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2John R Pierce
pierce@hogranch.com
In reply to: 德哥 (#1)
Re: BUG #14378: when drop schema, standby will do many more than primary, and very slow, it's a bug?

On 10/18/2016 12:57 AM, digoal@126.com wrote:

when i use postgresql 9.6.0 for primary-standby, i drop a schema on primary,
but it's apply very slow on standby.

the wal send very fast, but apply very slow, i think the syscall more than
it on primary?

your table has 100 associated sequences, and 100 associated indexes,
along with 100 expression constraints?, and you're creating 500 of
these? thats quite a lot of metadata.

--
john r pierce, recycling bits in santa cruz

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#3Andres Freund
andres@anarazel.de
In reply to: 德哥 (#1)
Re: BUG #14378: when drop schema, standby will do many more than primary, and very slow, it's a bug?

On 2016-10-18 07:57:33 +0000, digoal@126.com wrote:

The following bug has been logged on the website:

Bug reference: 14378
Logged by: Zhou Digoal
Email address: digoal@126.com
PostgreSQL version: 9.6.0
Operating system: CentOS 6.x x64
Description:

when i use postgresql 9.6.0 for primary-standby, i drop a schema on primary,
but it's apply very slow on standby.

the wal send very fast, but apply very slow, i think the syscall more than
it on primary?

this is my test:

There's some O(n^2) behaviour around lock release and acquire during
replay IIRC. See for example the thread around
http://archives.postgresql.org/message-id/20140630231741.GA30918%40alap3.anarazel.de

It's not entirely trivial to fix that though. E.g. the patch of Simon's
later in that thread wouldn't help in your situation.

Greetings,

Andres Freund

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#4德哥
digoal@126.com
In reply to: Andres Freund (#3)
Re: BUG #14378: when drop schema, standby will do many more than primary, and very slow, it's a bug?

HI,
Thanks Andres Freund's replay.
Btw, schema not like database in file struct, it has no directory entry below the database directory.
database has a dir below tablespace dir. but schema no dir below the database dir .
so drop schema must deal files one-by-one?
but drop database can drop dir only?

best regards.

--
公益是一辈子的事,I'm Digoal,Just Do It.

At 2016-10-19 02:32:08, "Andres Freund" <andres@anarazel.de> wrote:

Show quoted text

On 2016-10-18 07:57:33 +0000, digoal@126.com wrote:

The following bug has been logged on the website:

Bug reference: 14378
Logged by: Zhou Digoal
Email address: digoal@126.com
PostgreSQL version: 9.6.0
Operating system: CentOS 6.x x64
Description:

when i use postgresql 9.6.0 for primary-standby, i drop a schema on primary,
but it's apply very slow on standby.

the wal send very fast, but apply very slow, i think the syscall more than
it on primary?

this is my test:

There's some O(n^2) behaviour around lock release and acquire during
replay IIRC. See for example the thread around
http://archives.postgresql.org/message-id/20140630231741.GA30918%40alap3.anarazel.de

It's not entirely trivial to fix that though. E.g. the patch of Simon's
later in that thread wouldn't help in your situation.

Greetings,

Andres Freund

#5德哥
digoal@126.com
In reply to: Andres Freund (#3)
Re: BUG #14378: when drop schema, standby will do many more than primary, and very slow, it's a bug?

HI,
Add some info when i use perf top.

1\. when drop schema on primary (search depend)
```
findDependentObjects /home/digoal/pgsql9.6/bin/postgres
3745.00 11.3% get_tabstat_entry /home/digoal/pgsql9.6/bin/postgres
```

2\. when drop schema start (drop files)
```
5862.00 17.7% DropRelFileNodesAllBuffers /home/digoal/pgsql9.6/bin/postgres
4058.00 12.2% smgrclose /home/digoal/pgsql9.6/bin/postgres
```

3\. when apply on standby
```
12464.00 92.5% DropRelFileNodesAllBuffers /home/digoal/pgsql9.6/bin/postgres
56.00 0.6% DropRelFileNodeBuffers /home/digoal/pgsql9.6/bin/postgres
```

best regards.

--
公益是一辈子的事,I'm Digoal,Just Do It.

At 2016-10-19 02:32:08, "Andres Freund" <andres@anarazel.de> wrote:

Show quoted text

On 2016-10-18 07:57:33 +0000, digoal@126.com wrote:

The following bug has been logged on the website:

Bug reference: 14378
Logged by: Zhou Digoal
Email address: digoal@126.com
PostgreSQL version: 9.6.0
Operating system: CentOS 6.x x64
Description:

when i use postgresql 9.6.0 for primary-standby, i drop a schema on primary,
but it's apply very slow on standby.

the wal send very fast, but apply very slow, i think the syscall more than
it on primary?

this is my test:

There's some O(n^2) behaviour around lock release and acquire during
replay IIRC. See for example the thread around
http://archives.postgresql.org/message-id/20140630231741.GA30918%40alap3.anarazel.de

It's not entirely trivial to fix that though. E.g. the patch of Simon's
later in that thread wouldn't help in your situation.

Greetings,

Andres Freund