logical replication worker and statistics

Started by Fujii Masaoalmost 9 years ago11 messages
#1Fujii Masao
masao.fujii@gmail.com

Hi,

Logical replication worker should call pgstat_report_stat()?
Currently it doesn't seem to do that and no statistics about
table accesses by logical replication workers are collected.
For example, this can prevent autovacuum from working on
those tables properly.

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#1)
Re: logical replication worker and statistics

On Mon, Mar 27, 2017 at 11:14 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

Logical replication worker should call pgstat_report_stat()?
Currently it doesn't seem to do that and no statistics about
table accesses by logical replication workers are collected.
For example, this can prevent autovacuum from working on
those tables properly.

Yeah, that doesn't sound good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Stas Kelvich
s.kelvich@postgrespro.ru
In reply to: Robert Haas (#2)
Re: logical replication worker and statistics

On 27 Mar 2017, at 18:59, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Mar 27, 2017 at 11:14 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

Logical replication worker should call pgstat_report_stat()?
Currently it doesn't seem to do that and no statistics about
table accesses by logical replication workers are collected.
For example, this can prevent autovacuum from working on
those tables properly.

Yeah, that doesn't sound good.

Seems that nobody is working on this, so i’m going to create the patch.

Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Noah Misch
noah@leadboat.com
In reply to: Stas Kelvich (#3)
Re: logical replication worker and statistics

On Wed, Apr 05, 2017 at 05:02:18PM +0300, Stas Kelvich wrote:

On 27 Mar 2017, at 18:59, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Mar 27, 2017 at 11:14 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

Logical replication worker should call pgstat_report_stat()?
Currently it doesn't seem to do that and no statistics about
table accesses by logical replication workers are collected.
For example, this can prevent autovacuum from working on
those tables properly.

Yeah, that doesn't sound good.

Seems that nobody is working on this, so i’m going to create the patch.

[Action required within three days. This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item. Peter,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1]/messages/by-id/20170404140717.GA2675809@tornado.leadboat.com and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.

[1]: /messages/by-id/20170404140717.GA2675809@tornado.leadboat.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Stas Kelvich
s.kelvich@postgrespro.ru
In reply to: Noah Misch (#4)
2 attachment(s)
Re: logical replication worker and statistics

On 10 Apr 2017, at 05:20, Noah Misch <noah@leadboat.com> wrote:

On Wed, Apr 05, 2017 at 05:02:18PM +0300, Stas Kelvich wrote:

On 27 Mar 2017, at 18:59, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Mar 27, 2017 at 11:14 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

Logical replication worker should call pgstat_report_stat()?
Currently it doesn't seem to do that and no statistics about
table accesses by logical replication workers are collected.
For example, this can prevent autovacuum from working on
those tables properly.

Yeah, that doesn't sound good.

Seems that nobody is working on this, so i’m going to create the patch.

[Action required within three days. This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item. Peter,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.

[1] /messages/by-id/20170404140717.GA2675809@tornado.leadboat.com

Here is small patch to call statistics in logical worker. Originally i thought that stat
collection during logical replication should manually account amounts of changed tuples,
but seems that it is already smoothly handled on relation level. So call to
pgstat_report_stat() is enough.

Also i’ve added statistics checks to logrep tap tests, but that is probably quite fragile
without something like wait_for_stats() from regression test stats.sql.

Attachments:

call_pgstat_report_stat.diffapplication/octet-stream; name=call_pgstat_report_stat.diff; x-unix-mode=0644Download
commit 1f55c068ed102fbcf1a2e18cbac4c128b212c583
Author: Stas Kelvich <stas.kelvich@gmail.com>
Date:   Mon Apr 10 12:30:43 2017 +0300

    call pgstat_report_stat in logical worker

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 3fb57f0..82a2046 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -769,10 +769,11 @@ allow_immediate_pgstat_restart(void)
 /* ----------
  * pgstat_report_stat() -
  *
- *	Called from tcop/postgres.c to send the so far collected per-table
- *	and function usage statistics to the collector.  Note that this is
- *	called only when not within a transaction, so it is fair to use
- *	transaction stop time as an approximation of current time.
+ *	Must be called by processes that performs DML: tcop/postgres.c, logical
+ *	receiver processes, SPI worker, etc to send the so far collected per-table
+ *	and function usage statistics to the collector. Note that this is called
+ *	only when not within a transaction, so it is fair to use transaction stop
+ *	time as an approximation of current time.
  * ----------
  */
 void
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 8a984a4..23a1fbb 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -462,6 +462,7 @@ apply_handle_commit(StringInfo s)
 	/* Process any tables that are being synchronized in parallel. */
 	process_syncing_tables(commit_data.end_lsn);
 
+	pgstat_report_stat(false);
 	pgstat_report_activity(STATE_IDLE, NULL);
 }
 
logical_worker_stats_test.diffapplication/octet-stream; name=logical_worker_stats_test.diff; x-unix-mode=0644Download
commit 96a870f1f0b01df9b57482281f277903cacf707f
Author: Stas Kelvich <stas.kelvich@gmail.com>
Date:   Mon Apr 10 12:31:10 2017 +0300

    stats test

diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index d1817f5..257e680 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 14;
+use Test::More tests => 15;
 
 # Initialize publisher node
 my $node_publisher = get_new_node('publisher');
@@ -87,6 +87,11 @@ $node_publisher->safe_psql('postgres',
 $node_publisher->poll_query_until('postgres', $caughtup_query)
   or die "Timed out while waiting for subscriber to catch up";
 
+# check stats on subscriber side
+my $subscriber_stats = $node_subscriber->safe_psql('postgres',
+	"select n_tup_ins, n_live_tup from pg_catalog.pg_stat_user_tables where relname='tab_full'");
+is($subscriber_stats, qq(10|10), 'check proper statistics on subscriber');
+
 $result =
   $node_subscriber->safe_psql('postgres', "SELECT count(*), min(a), max(a) FROM tab_ins");
 is($result, qq(1052|1|1002), 'check replicated inserts on subscriber');
#6Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Stas Kelvich (#5)
Re: logical replication worker and statistics

On 4/10/17 05:49, Stas Kelvich wrote:

Here is small patch to call statistics in logical worker. Originally i thought that stat
collection during logical replication should manually account amounts of changed tuples,
but seems that it is already smoothly handled on relation level. So call to
pgstat_report_stat() is enough.

I wonder whether we need a similar call somewhere in tablesync.c. It
seems to work without it, though.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Noah Misch (#4)
Re: logical replication worker and statistics

On 4/9/17 22:20, Noah Misch wrote:

The above-described topic is currently a PostgreSQL 10 open item. Peter,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.

Discussion is ongoing, patch is proposed, should be resolved this week.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Stas Kelvich
s.kelvich@postgrespro.ru
In reply to: Peter Eisentraut (#6)
Re: logical replication worker and statistics

On 10 Apr 2017, at 19:50, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

On 4/10/17 05:49, Stas Kelvich wrote:

Here is small patch to call statistics in logical worker. Originally i thought that stat
collection during logical replication should manually account amounts of changed tuples,
but seems that it is already smoothly handled on relation level. So call to
pgstat_report_stat() is enough.

I wonder whether we need a similar call somewhere in tablesync.c. It
seems to work without it, though.

I thought it spins up the same worker from worker.c.

Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Stas Kelvich (#8)
Re: logical replication worker and statistics

On 4/10/17 13:06, Stas Kelvich wrote:

On 10 Apr 2017, at 19:50, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

On 4/10/17 05:49, Stas Kelvich wrote:

Here is small patch to call statistics in logical worker. Originally i thought that stat
collection during logical replication should manually account amounts of changed tuples,
but seems that it is already smoothly handled on relation level. So call to
pgstat_report_stat() is enough.

I wonder whether we need a similar call somewhere in tablesync.c. It
seems to work without it, though.

I thought it spins up the same worker from worker.c.

It depends on which of the various tablesync scenarios happens. If the
apply lags behind the tablesync, then the apply doesn't process commit
messages until it has caught up. So the part of the code you patched
wouldn't be called.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Petr Jelinek
petr.jelinek@2ndquadrant.com
In reply to: Peter Eisentraut (#9)
1 attachment(s)
Re: logical replication worker and statistics

On 12/04/17 05:41, Peter Eisentraut wrote:

On 4/10/17 13:06, Stas Kelvich wrote:

On 10 Apr 2017, at 19:50, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

On 4/10/17 05:49, Stas Kelvich wrote:

Here is small patch to call statistics in logical worker. Originally i thought that stat
collection during logical replication should manually account amounts of changed tuples,
but seems that it is already smoothly handled on relation level. So call to
pgstat_report_stat() is enough.

I wonder whether we need a similar call somewhere in tablesync.c. It
seems to work without it, though.

I thought it spins up the same worker from worker.c.

It depends on which of the various tablesync scenarios happens. If the
apply lags behind the tablesync, then the apply doesn't process commit
messages until it has caught up. So the part of the code you patched
wouldn't be called.

Indeed, if catchup phase didn't happen (because tablesync was faster
than apply) then the commit handler is never called so all the changes
made by copy would be forgotten. Also the tablesync worker might exit
from process_syncing_tables() call which is called before we report
stats in the commit handler so again some changes might be forgotten.

I attached modified version of the patch that also reports stats in
finish_sync_worker() when there is outstanding transaction. The test can
stay the same.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

call_pgstat_report_statv2.difftext/x-diff; name=call_pgstat_report_statv2.diffDownload
From 231973cafe12bb948924a6380ac785d93b6af086 Mon Sep 17 00:00:00 2001
From: Petr Jelinek <pjmodos@pjmodos.net>
Date: Fri, 14 Apr 2017 17:54:03 +0200
Subject: [PATCH] Report statistics in logical replication workers

---
 src/backend/postmaster/pgstat.c             | 9 +++++----
 src/backend/replication/logical/tablesync.c | 8 +++++++-
 src/backend/replication/logical/worker.c    | 1 +
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 3fb57f0..36fedd8 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -769,10 +769,11 @@ allow_immediate_pgstat_restart(void)
 /* ----------
  * pgstat_report_stat() -
  *
- *	Called from tcop/postgres.c to send the so far collected per-table
- *	and function usage statistics to the collector.  Note that this is
- *	called only when not within a transaction, so it is fair to use
- *	transaction stop time as an approximation of current time.
+ *	Must be called by processes that performs DML: tcop/postgres.c, logical
+ *	receiver processes, SPI worker, etc to send the so far collected per-table
+ *	and function usage statistics to the collector. Note that this is called
+ *	only when not within a transaction, so it is fair to use transaction stop
+ *	time as an approximation of current time.
  * ----------
  */
 void
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index d1f2734..aa1a85b 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -114,9 +114,15 @@ StringInfo	copybuf = NULL;
 static void pg_attribute_noreturn()
 finish_sync_worker(void)
 {
-	/* Commit any outstanding transaction. */
+	/*
+	 * Commit any outstanding transaction. This is the usual case, unless
+	 * there was nothing to do for the table.
+	 */
 	if (IsTransactionState())
+	{
 		CommitTransactionCommand();
+		pgstat_report_stat(false);
+	}
 
 	/* And flush all writes. */
 	XLogFlush(GetXLogWriteRecPtr());
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 3313448..169471c 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -462,6 +462,7 @@ apply_handle_commit(StringInfo s)
 	/* Process any tables that are being synchronized in parallel. */
 	process_syncing_tables(commit_data.end_lsn);
 
+	pgstat_report_stat(false);
 	pgstat_report_activity(STATE_IDLE, NULL);
 }
 
-- 
2.7.4

#11Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Petr Jelinek (#10)
Re: logical replication worker and statistics

On 4/14/17 12:09, Petr Jelinek wrote:

Indeed, if catchup phase didn't happen (because tablesync was faster
than apply) then the commit handler is never called so all the changes
made by copy would be forgotten. Also the tablesync worker might exit
from process_syncing_tables() call which is called before we report
stats in the commit handler so again some changes might be forgotten.

I attached modified version of the patch that also reports stats in
finish_sync_worker() when there is outstanding transaction. The test can
stay the same.

committed (without the tests)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers