Refresh Publication takes hours and doesn´t finish
*We use logical replication from a PG version 10.6 to a 11.2. Both are Ubuntu
16.04.We have a hundred schemas with more or less a hundred tables, so
number of tables is about 10.000. All replication is ok but when we try to
do a REFRESH SUBSCRIPTION because we added a new schema, it takes hours and
doesn´t finish. Then, if I go to our master server and do a select * from
pg_publication_tables it doesn´t respond too. Then, analysing the source of
view pg_publication_tables ...*
create view pg_publication_tables as SELECT p.pubname, n.nspname AS
schemaname, c.relname AS tablename FROM pg_publication p, (pg_class c JOIN
pg_namespace n ON ((n.oid = c.relnamespace))) WHERE (c.oid IN (SELECT
pg_get_publication_tables.relid FROM pg_get_publication_tables((p.pubname)
:: text) pg_get_publication_tables (relid)));
If we run both statements of that view separately
SELECT string_agg(pg_get_publication_tables.relid::text,',') FROM
pg_get_publication_tables(('MyPublication')::text) pg_get_publication_tables
(relid);
*put all those oids retrieved on that IN of the view*
select * from pg_Class c JOIN pg_namespace n ON n.oid = c.relnamespace WHERE
c.oid IN (
*OIDs List*
);
*Then it responds immediatelly*
So, the question is .. can we change this view to select faster ? Just
rewriting that view to a better select will solve ?Is this view used by
REFRESH SUBSCRIPTION ? We think yes because if we run refresh subscription
or select from view it doesn´t respond, so ...
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
I tried sometime ago ... but with no responses, I ask you again.
pg_publication_tables is a view that is used to refresh publication, but as
we have 15.000 tables, it takes hours and doesn´t complete. If I change that
view I can have an immediate result. The question is: Can I change that view
? There is some trouble changing those system views ?
Original View is ...
create view pg_catalog.pg_publication_tables as
SELECT p.pubname, n.nspname AS schemaname, c.relname AS tablename FROM
pg_publication p,
(pg_class c JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
WHERE (c.oid IN (SELECT pg_get_publication_tables.relid FROM
pg_get_publication_tables((p.pubname)::text)
pg_get_publication_tables(relid)));
This way it takes 45 minutes to respond.
I changed it to ...
create or replace pg_catalog.view pg_publication_tables as SELECT p.pubname,
n.nspname AS schemaname, c.relname AS tablename from pg_publication p inner
join pg_get_publication_tables(p.pubname) pt on true inner join pg_class c
on pt.relid = c.oid inner join pg_namespace n ON (n.oid = c.relnamespace);
This one takes just one or two seconds.
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
Em seg, 20 de mai de 2019 às 17:18, PegoraroF10 <marcos@f10.com.br>
escreveu:
I tried sometime ago ... but with no responses, I ask you again.
pg_publication_tables is a view that is used to refresh publication, but
as
we have 15.000 tables, it takes hours and doesn´t complete. If I change
that
view I can have an immediate result. The question is: Can I change that
view
? There is some trouble changing those system views ?
You really need a publication with a lot of relations??? If you can split
it in several publications your life should be easy.
Original View is ...
create view pg_catalog.pg_publication_tables as
SELECT p.pubname, n.nspname AS schemaname, c.relname AS tablename FROM
pg_publication p,
(pg_class c JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
WHERE (c.oid IN (SELECT pg_get_publication_tables.relid FROM
pg_get_publication_tables((p.pubname)::text)
pg_get_publication_tables(relid)));
This way it takes 45 minutes to respond.
I really don't know why we did it... because pg_get_publication_tables
doesn't have any special behavior different than get relations assigned to
publications.
I changed it to ...
create or replace pg_catalog.view pg_publication_tables as SELECT
p.pubname,
n.nspname AS schemaname, c.relname AS tablename from pg_publication p
inner
join pg_get_publication_tables(p.pubname) pt on true inner join pg_class c
on pt.relid = c.oid inner join pg_namespace n ON (n.oid = c.relnamespace);
This one takes just one or two seconds.
Even better, you can go direct by system catalogs:
SELECT p.pubname,
n.nspname AS schemaname,
c.relname AS tablename
FROM pg_publication p
JOIN pg_publication_rel pr ON pr.prpubid = p.oid
JOIN pg_class c ON c.oid = pr.prrelid
JOIN pg_namespace n ON n.oid = c.relnamespace;
To change it, before you'll need to set "allow_system_table_mods=on" and
restart PostgreSQL.
Regards,
--
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
PegoraroF10 <marcos@f10.com.br> writes:
I tried sometime ago ... but with no responses, I ask you again.
pg_publication_tables is a view that is used to refresh publication, but as
we have 15.000 tables, it takes hours and doesn´t complete. If I change that
view I can have an immediate result. The question is: Can I change that view
? There is some trouble changing those system views ?
Original View is ...
create view pg_catalog.pg_publication_tables as
SELECT p.pubname, n.nspname AS schemaname, c.relname AS tablename FROM
pg_publication p,
(pg_class c JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
WHERE (c.oid IN (SELECT pg_get_publication_tables.relid FROM
pg_get_publication_tables((p.pubname)::text)
pg_get_publication_tables(relid)));
This way it takes 45 minutes to respond.
I changed it to ...
create or replace pg_catalog.view pg_publication_tables as SELECT p.pubname,
n.nspname AS schemaname, c.relname AS tablename from pg_publication p inner
join pg_get_publication_tables(p.pubname) pt on true inner join pg_class c
on pt.relid = c.oid inner join pg_namespace n ON (n.oid = c.relnamespace);
This one takes just one or two seconds.
Hmm ... given that pg_get_publication_tables() shouldn't return any
duplicate OIDs, it does seem unnecessarily inefficient to put it in
an IN-subselect condition. Peter, is there a reason why this isn't
a straight lateral join? I get a much saner-looking plan from
FROM pg_publication P, pg_class C
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
- WHERE C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname));
+ JOIN pg_namespace N ON (N.oid = C.relnamespace),
+ LATERAL pg_get_publication_tables(P.pubname)
+ WHERE C.oid = pg_get_publication_tables.relid;
regards, tom lane
Em seg, 20 de mai de 2019 às 18:30, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
Hmm ... given that pg_get_publication_tables() shouldn't return any
duplicate OIDs, it does seem unnecessarily inefficient to put it in
an IN-subselect condition. Peter, is there a reason why this isn't
a straight lateral join? I get a much saner-looking plan fromFROM pg_publication P, pg_class C
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
- WHERE C.oid IN (SELECT relid FROM
pg_get_publication_tables(P.pubname));
+ JOIN pg_namespace N ON (N.oid = C.relnamespace), + LATERAL pg_get_publication_tables(P.pubname) + WHERE C.oid = pg_get_publication_tables.relid;
And why not just JOIN direct with pg_publication_rel ?
Regards,
--
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
I cannot because we created a replication for ALL TABLES
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
Restart Postgres means exactly what ? We tried just restart the service but
we tried to refresh publication the old view was used because it took 2hours
and gave us a timeout.
I found some people talking that I need to initdb, but initdb means recreate
entirely my database or just reinstall my postgres server ?
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
Em ter, 21 de mai de 2019 às 14:17, PegoraroF10 <marcos@f10.com.br>
escreveu:
Restart Postgres means exactly what ? We tried just restart the service
but
we tried to refresh publication the old view was used because it took
2hours
and gave us a timeout.
As I said before to change system catalog you should set
"allow_system_table_mods=on" and restart PostgreSQL service.
After that you'll able to recreate the "pg_catalog.pg_publication_tables"
system view. (You can use the Tom's suggestion using LATERAL)
Regards,
--
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabrizio@timbira.com.br> writes:
As I said before to change system catalog you should set
"allow_system_table_mods=on" and restart PostgreSQL service.
After that you'll able to recreate the "pg_catalog.pg_publication_tables"
system view. (You can use the Tom's suggestion using LATERAL)
It's a view, not a table, so I don't think you need
allow_system_table_mods. A quick test here says that being
superuser is enough to do a CREATE OR REPLACE VIEW on it.
regards, tom lane
Em ter, 21 de mai de 2019 às 14:41, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabrizio@timbira.com.br> writes:
As I said before to change system catalog you should set
"allow_system_table_mods=on" and restart PostgreSQL service.
After that you'll able to recreate the
"pg_catalog.pg_publication_tables"
system view. (You can use the Tom's suggestion using LATERAL)
It's a view, not a table, so I don't think you need
allow_system_table_mods. A quick test here says that being
superuser is enough to do a CREATE OR REPLACE VIEW on it.
Interesting, I tried the following commands and got error:
postgres=# SELECT version();
version
----------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 11.3 (Debian 11.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled
by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
(1 row)
postgres=# SELECT session_user;
session_user
--------------
postgres
(1 row)
postgres=# SHOW allow_system_table_mods ;
allow_system_table_mods
-------------------------
off
(1 row)
postgres=# CREATE OR REPLACE VIEW pg_catalog.pg_publication_tables AS
postgres-# SELECT
postgres-# P.pubname AS pubname,
postgres-# N.nspname AS schemaname,
postgres-# C.relname AS tablename
postgres-# FROM pg_publication P, pg_class C
postgres-# JOIN pg_namespace N ON (N.oid = C.relnamespace),
postgres-# LATERAL pg_get_publication_tables(P.pubname)
postgres-# WHERE C.oid = pg_get_publication_tables.relid;
ERROR: permission denied: "pg_publication_tables" is a system catalog
But changing "allow_system_table_mods=on" works as expected:
postgres=# SHOW allow_system_table_mods ;
allow_system_table_mods
-------------------------
on
(1 row)
postgres=# CREATE OR REPLACE VIEW pg_catalog.pg_publication_tables AS
SELECT
P.pubname AS pubname,
N.nspname AS schemaname,
C.relname AS tablename
FROM pg_publication P, pg_class C
JOIN pg_namespace N ON (N.oid = C.relnamespace),
LATERAL pg_get_publication_tables(P.pubname)
WHERE C.oid = pg_get_publication_tables.relid;
CREATE VIEW
Regards,
--
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabrizio@timbira.com.br> writes:
Em ter, 21 de mai de 2019 às 14:41, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
It's a view, not a table, so I don't think you need
allow_system_table_mods. A quick test here says that being
superuser is enough to do a CREATE OR REPLACE VIEW on it.
Interesting, I tried the following commands and got error:
Oh, huh, this is something that changed recently in HEAD ---
since commit 2d7d946cd, stuff created by system_views.sql
is not protected as though it were a system catalog.
So in released versions, yes you need allow_system_table_mods=on.
Sorry for the misinformation.
regards, tom lane
[ redirecting to pgsql-hackers as the more relevant list ]
I wrote:
PegoraroF10 <marcos@f10.com.br> writes:
I tried sometime ago ... but with no responses, I ask you again.
pg_publication_tables is a view that is used to refresh publication, but as
we have 15.000 tables, it takes hours and doesn't complete. If I change that
view I can have an immediate result. The question is: Can I change that view
? There is some trouble changing those system views ?
Hmm ... given that pg_get_publication_tables() shouldn't return any
duplicate OIDs, it does seem unnecessarily inefficient to put it in
an IN-subselect condition. Peter, is there a reason why this isn't
a straight lateral join? I get a much saner-looking plan from
FROM pg_publication P, pg_class C - JOIN pg_namespace N ON (N.oid = C.relnamespace) - WHERE C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname)); + JOIN pg_namespace N ON (N.oid = C.relnamespace), + LATERAL pg_get_publication_tables(P.pubname) + WHERE C.oid = pg_get_publication_tables.relid;
For the record, the attached seems like what to do here. It's easy
to show that there's a big performance gain even for normal numbers
of tables, eg if you do
CREATE PUBLICATION mypub FOR ALL TABLES;
SELECT * FROM pg_publication_tables;
in the regression database, the time for the select drops from ~360ms
to ~6ms on my machine. The existing view's performance will drop as
O(N^2) the more publishable tables you have ...
Given that this change impacts the regression test results, project
rules say that it should come with a catversion bump. Since we are
certainly going to have a catversion bump before beta2 because of
the pg_statistic_ext permissions business, that doesn't seem like
a reason not to push it into v12 --- any objections?
regards, tom lane
Attachments:
fix-pg_publication_tables-performance.patchtext/x-diff; charset=us-ascii; name=fix-pg_publication_tables-performance.patchDownload
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 566100d..52a6c31 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -258,9 +258,10 @@ CREATE VIEW pg_publication_tables AS
P.pubname AS pubname,
N.nspname AS schemaname,
C.relname AS tablename
- FROM pg_publication P, pg_class C
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
- WHERE C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname));
+ FROM pg_publication P,
+ LATERAL pg_get_publication_tables(P.pubname) GPT,
+ pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)
+ WHERE C.oid = GPT.relid;
CREATE VIEW pg_locks AS
SELECT * FROM pg_lock_status() AS L;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 0c392e5..4363ca1 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1441,10 +1441,10 @@ pg_publication_tables| SELECT p.pubname,
n.nspname AS schemaname,
c.relname AS tablename
FROM pg_publication p,
+ LATERAL pg_get_publication_tables((p.pubname)::text) gpt(relid),
(pg_class c
JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
- WHERE (c.oid IN ( SELECT pg_get_publication_tables.relid
- FROM pg_get_publication_tables((p.pubname)::text) pg_get_publication_tables(relid)));
+ WHERE (c.oid = gpt.relid);
pg_replication_origin_status| SELECT pg_show_replication_origin_status.local_id,
pg_show_replication_origin_status.external_id,
pg_show_replication_origin_status.remote_lsn,
On Tue, May 21, 2019 at 4:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
[ redirecting to pgsql-hackers as the more relevant list ]
I wrote:
PegoraroF10 <marcos@f10.com.br> writes:
I tried sometime ago ... but with no responses, I ask you again.
pg_publication_tables is a view that is used to refresh publication,
but as
we have 15.000 tables, it takes hours and doesn't complete. If I
change that
view I can have an immediate result. The question is: Can I change
that view
? There is some trouble changing those system views ?
Hmm ... given that pg_get_publication_tables() shouldn't return any
duplicate OIDs, it does seem unnecessarily inefficient to put it in
an IN-subselect condition. Peter, is there a reason why this isn't
a straight lateral join? I get a much saner-looking plan fromFROM pg_publication P, pg_class C
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
- WHERE C.oid IN (SELECT relid FROM
pg_get_publication_tables(P.pubname));
+ JOIN pg_namespace N ON (N.oid = C.relnamespace), + LATERAL pg_get_publication_tables(P.pubname) + WHERE C.oid = pg_get_publication_tables.relid;For the record, the attached seems like what to do here. It's easy
to show that there's a big performance gain even for normal numbers
of tables, eg if you doCREATE PUBLICATION mypub FOR ALL TABLES;
SELECT * FROM pg_publication_tables;in the regression database, the time for the select drops from ~360ms
to ~6ms on my machine. The existing view's performance will drop as
O(N^2) the more publishable tables you have ...Given that this change impacts the regression test results, project
rules say that it should come with a catversion bump. Since we are
certainly going to have a catversion bump before beta2 because of
the pg_statistic_ext permissions business, that doesn't seem like
a reason not to push it into v12 --- any objections?
I completely agree to push it into v12.
Regards,
--
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
On 2019-05-20 23:30, Tom Lane wrote:
Hmm ... given that pg_get_publication_tables() shouldn't return any
duplicate OIDs, it does seem unnecessarily inefficient to put it in
an IN-subselect condition. Peter, is there a reason why this isn't
a straight lateral join? I get a much saner-looking plan fromFROM pg_publication P, pg_class C - JOIN pg_namespace N ON (N.oid = C.relnamespace) - WHERE C.oid IN (SELECT relid FROM pg_get_publication_tables(P.pubname)); + JOIN pg_namespace N ON (N.oid = C.relnamespace), + LATERAL pg_get_publication_tables(P.pubname) + WHERE C.oid = pg_get_publication_tables.relid;
No reason I think, just didn't quite manage to recognize the possibility
of using LATERAL at the time.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services