[PATCH] Make sure all statistics are sent after a few DMLs are performed
Hi,
After a DML is performed, the statistics are sent to pgstat by pgstat_report_stat().
However, the minimum time between stats file updates is PGSTAT_STAT_INTERVAL, so if
a few DMLs are performed at short intervals, some statistics may not be sent
until the backend shuts down.
This is not a problem in usual cases, but when a session remains idle
for a long time, for example when using connection pooling, statistics for
a large change to a table may not be sent for a long time, and as a result
starting autovacuum might be delayed.
One idea to resolve this is to call pgstat_report_stat() again after a delay
following the last DML, to make sure the statistics are sent. The attached
patch implements this.
Any comments would be appreciated.
Regards,
--
Yugo Nagata <nagata@sraoss.co.jp>
Attachments:
Attachment: pgstat_report_stat.patch (text/x-diff)
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index b8d860e..928d479 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3012,6 +3012,12 @@ ProcessInterrupts(void)
}
+ if (PgStatReportStatTimeoutPending)
+ {
+ pgstat_report_stat(false);
+ PgStatReportStatTimeoutPending = false;
+ }
+
if (ParallelMessagePending)
HandleParallelMessages();
}
@@ -4010,6 +4016,14 @@ PostgresMain(int argc, char *argv[],
ProcessCompletedNotifies();
pgstat_report_stat(false);
+ /* Arm a timeout to call pgstat_report_stat() again after
+ * PGSTAT_REPORT_STAT_DELAY, because if DMLs are performed at
+ * intervals shorter than PGSTAT_STAT_INTERVAL, some statistics
+ * might not be sent until the backend shuts down.
+ */
+ enable_timeout_after(PGSTAT_REPORT_STAT_TIMEOUT,
+ PGSTAT_REPORT_STAT_DELAY);
+
set_ps_display("idle", false);
pgstat_report_activity(STATE_IDLE, NULL);
}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 7c09498..8c29ebd 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -31,6 +31,7 @@ volatile bool QueryCancelPending = false;
volatile bool ProcDiePending = false;
volatile bool ClientConnectionLost = false;
volatile bool IdleInTransactionSessionTimeoutPending = false;
+volatile bool PgStatReportStatTimeoutPending = false;
volatile sig_atomic_t ConfigReloadPending = false;
volatile uint32 InterruptHoldoffCount = 0;
volatile uint32 QueryCancelHoldoffCount = 0;
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index eb6960d..1df30bc 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -71,6 +71,7 @@ static void ShutdownPostgres(int code, Datum arg);
static void StatementTimeoutHandler(void);
static void LockTimeoutHandler(void);
static void IdleInTransactionSessionTimeoutHandler(void);
+static void PgStatReportStatTimeoutHandler(void);
static bool ThereIsAtLeastOneRole(void);
static void process_startup_options(Port *port, bool am_superuser);
static void process_settings(Oid databaseid, Oid roleid);
@@ -599,6 +600,8 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
RegisterTimeout(LOCK_TIMEOUT, LockTimeoutHandler);
RegisterTimeout(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
IdleInTransactionSessionTimeoutHandler);
+ RegisterTimeout(PGSTAT_REPORT_STAT_TIMEOUT,
+ PgStatReportStatTimeoutHandler);
}
/*
@@ -1196,6 +1199,14 @@ IdleInTransactionSessionTimeoutHandler(void)
SetLatch(MyLatch);
}
+static void
+PgStatReportStatTimeoutHandler(void)
+{
+ PgStatReportStatTimeoutPending = true;
+ InterruptPending = true;
+ SetLatch(MyLatch);
+}
+
/*
* Returns true if at least one role is defined in this database cluster.
*/
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index dad98de..4bc6f0f 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -81,6 +81,7 @@ extern PGDLLIMPORT volatile bool InterruptPending;
extern PGDLLIMPORT volatile bool QueryCancelPending;
extern PGDLLIMPORT volatile bool ProcDiePending;
extern PGDLLIMPORT volatile bool IdleInTransactionSessionTimeoutPending;
+extern PGDLLIMPORT volatile bool PgStatReportStatTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t ConfigReloadPending;
extern volatile bool ClientConnectionLost;
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 6bffe63..a06703f 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -33,6 +33,11 @@
/* Default directory to store temporary statistics data in */
#define PG_STAT_TMP_DIR "pg_stat_tmp"
+/* How long to wait before calling pgstat_report_stat after
+ * the last DML is performed; in milliseconds.
+ */
+#define PGSTAT_REPORT_STAT_DELAY 500
+
/* Values for track_functions GUC variable --- order is significant! */
typedef enum TrackFunctionsLevel
{
diff --git a/src/include/utils/timeout.h b/src/include/utils/timeout.h
index 5a2efc0..7eccd2d 100644
--- a/src/include/utils/timeout.h
+++ b/src/include/utils/timeout.h
@@ -31,6 +31,7 @@ typedef enum TimeoutId
STANDBY_TIMEOUT,
STANDBY_LOCK_TIMEOUT,
IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
+ PGSTAT_REPORT_STAT_TIMEOUT,
/* First user-definable timeout reason */
USER_TIMEOUT,
/* Maximum number of timeout reasons */
Hi,
On 2017-07-18 21:49:27 +0900, Yugo Nagata wrote:
> After a DML is performed, the statistics are sent to pgstat by pgstat_report_stat().
> However, the minimum time between stats file updates is PGSTAT_STAT_INTERVAL, so if
> a few DMLs are performed at short intervals, some statistics may not be sent
> until the backend shuts down.
>
> This is not a problem in usual cases, but when a session remains idle
> for a long time, for example when using connection pooling, statistics for
> a large change to a table may not be sent for a long time, and as a result
> starting autovacuum might be delayed.
Hm.
> One idea to resolve this is to call pgstat_report_stat() again after a delay
> following the last DML, to make sure the statistics are sent. The attached
> patch implements this.
That seems like it'd add a good number of new wakeups, or at least
scheduling of wakeups. Now a timer would be armed whenever a query is
run outside of a transaction - something happening quite regularly. And
it'd do so even if a) there are no outstanding stat reports b) there's
already a timer running. Additionally it'd, unless I mis-skimmed,
trigger pgstat_report_stat() from within
CHECK_FOR_INTERRUPTS/ProcessInterrupts even when inside a transaction,
if that transaction is started after the enable_timeout_after().
I'm not entirely sure what a good solution would be. One option would
be to have a regular wakeup setting a flag (considerably longer than
500ms, that's way too many additional wakeups), rearm the timer in the
handler, but only process the flag when outside of a transaction.
Or we could do nothing - I actually think that's a viable option.
- Andres
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> That seems like it'd add a good number of new wakeups, or at least
> scheduling of wakeups.
Yes, as it stands this will result in a huge increase in alarm-scheduling
kernel call traffic. I understand the issue but I do not think this is
an acceptable path to a fix.
> Or we could do nothing - I actually think that's a viable option.
I tend to agree. I'm not really sure that the presented problem is a
big deal: for it to be an issue, you have to assume that a DML operation
that takes less than PGSTAT_STAT_INTERVAL is capable of causing enough
table churn that it's a problem if autovacuum doesn't hear about that
churn promptly.
I wonder if a better answer wouldn't be to reduce PGSTAT_STAT_INTERVAL.
I don't think that value has been reconsidered since the code was written,
circa turn of the century. Maybe even make it configurable, though that
could be overkill.
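If it were made configurable, the change might look roughly like this sketch
of a guc.c integer-option entry; the name stats_report_interval, the variable
pgstat_stat_interval, and the limits are all invented for illustration:

```c
/* Hypothetical int_options[] entry in src/backend/utils/misc/guc.c */
{
    {"stats_report_interval", PGC_USERSET, STATS_COLLECTOR,
        gettext_noop("Sets the minimum delay between statistics reports "
                     "to the collector."),
        NULL,
        GUC_UNIT_MS
    },
    &pgstat_stat_interval,      /* would replace the PGSTAT_STAT_INTERVAL macro */
    500, 10, INT_MAX,
    NULL, NULL, NULL
},
```

pgstat.c would then read the variable instead of the compile-time constant.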
regards, tom lane
Hi,
On 2017-07-18 09:42:31 -0400, Tom Lane wrote:
> I wonder if a better answer wouldn't be to reduce PGSTAT_STAT_INTERVAL.
> I don't think that value has been reconsidered since the code was written,
> circa turn of the century. Maybe even make it configurable, though that
> could be overkill.
Not sure if that really does that much to solve the concern. Another,
pretty half-baked, approach would be to add a procsignal triggering idle
backends to send stats, and send that to all idle backends when querying
stats. We could even publish the number of outstanding stats updates in
PGXACT or such, without any locking, and send it only to those that have
outstanding ones.
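In pseudocode, that might look something like the following; PROCSIG_REPORT_STATS,
ReportStatsPending, pgstatPendingCount and the helper are all invented names,
not existing symbols, and this would not compile as-is:

```c
/* Backend side: handler for a hypothetical PROCSIG_REPORT_STATS reason. */
static void
HandleReportStatsInterrupt(void)
{
    ReportStatsPending = true;      /* processed later, outside a transaction */
    InterruptPending = true;
    SetLatch(MyLatch);
}

/* Reader side: signal only backends that advertise pending updates. */
static void
WakeBackendsWithPendingStats(void)
{
    for (int i = 0; i < ProcGlobal->allProcCount; i++)
    {
        PGPROC *proc = &ProcGlobal->allProcs[i];

        /* pgstatPendingCount: a new, lock-free counter in PGXACT or such */
        if (proc->pid != 0 && proc->pgstatPendingCount > 0)
            SendProcSignal(proc->pid, PROCSIG_REPORT_STATS,
                           InvalidBackendId);
    }
}
```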
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2017-07-18 09:42:31 -0400, Tom Lane wrote:
>> I wonder if a better answer wouldn't be to reduce PGSTAT_STAT_INTERVAL.
> Not sure if that really does that much to solve the concern.
Well, it reduces the amount of data churn that a statement shorter than
PGSTAT_STAT_INTERVAL could cause.
> Another,
> pretty half-baked, approach would be to add a procsignal triggering idle
> backends to send stats, and send that to all idle backends when querying
> stats. We could even publish the number of outstanding stats updates in
> PGXACT or such, without any locking, and send it only to those that have
> outstanding ones.
If somebody wanted to do the work, that'd be a viable answer IMO. You'd
really want to not wake backends that have nothing more to send, but
I agree that it'd be possible to advertise that in shared memory.
regards, tom lane
On Tue, 18 Jul 2017 10:10:49 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:
Thank you for your comments. I understand the problem of my proposed
patch.

> Andres Freund <andres@anarazel.de> writes:
>> On 2017-07-18 09:42:31 -0400, Tom Lane wrote:
>>> I wonder if a better answer wouldn't be to reduce PGSTAT_STAT_INTERVAL.
>>
>> Not sure if that really does that much to solve the concern.
>
> Well, it reduces the amount of data churn that a statement shorter than
> PGSTAT_STAT_INTERVAL could cause.
>
>> Another,
>> pretty half-baked, approach would be to add a procsignal triggering idle
>> backends to send stats, and send that to all idle backends when querying
>> stats. We could even publish the number of outstanding stats updates in
>> PGXACT or such, without any locking, and send it only to those that have
>> outstanding ones.
>
> If somebody wanted to do the work, that'd be a viable answer IMO. You'd
> really want to not wake backends that have nothing more to send, but
> I agree that it'd be possible to advertise that in shared memory.
>
> regards, tom lane
--
Yugo Nagata <nagata@sraoss.co.jp>
Hi,
(please don't top-reply on this list)
On 2017-07-19 14:04:39 +0900, Yugo Nagata wrote:
> On Tue, 18 Jul 2017 10:10:49 -0400
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Thank you for your comments. I understand the problem of my proposed
> patch.
Does that mean you're trying to rewrite it in the way that was
suggested:
> Another,
> pretty half-baked, approach would be to add a procsignal triggering idle
> backends to send stats, and send that to all idle backends when querying
> stats. We could even publish the number of outstanding stats updates in
> PGXACT or such, without any locking, and send it only to those that have
> outstanding ones.
>
> If somebody wanted to do the work, that'd be a viable answer IMO. You'd
> really want to not wake backends that have nothing more to send, but
> I agree that it'd be possible to advertise that in shared memory.
or are you planning to just leave the issue be?
- Andres
On Fri, 21 Jul 2017 04:58:47 -0700
Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> (please don't top-reply on this list)
>
> On 2017-07-19 14:04:39 +0900, Yugo Nagata wrote:
>> On Tue, 18 Jul 2017 10:10:49 -0400
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Thank you for your comments. I understand the problem of my proposed
>> patch.
>
> Does that mean you're trying to rewrite it in the way that was
> suggested:

Not yet, but I'll try to do it.

>> Another,
>> pretty half-baked, approach would be to add a procsignal triggering idle
>> backends to send stats, and send that to all idle backends when querying
>> stats. We could even publish the number of outstanding stats updates in
>> PGXACT or such, without any locking, and send it only to those that have
>> outstanding ones.
>>
>> If somebody wanted to do the work, that'd be a viable answer IMO. You'd
>> really want to not wake backends that have nothing more to send, but
>> I agree that it'd be possible to advertise that in shared memory.
>
> or are you planning to just leave the issue be?
>
> - Andres
--
Yugo Nagata <nagata@sraoss.co.jp>